-
Notifications
You must be signed in to change notification settings - Fork 521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bonding and vlan tagging support to Bottlerocket #2596
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall. A few minor comments. Should also clean up some of the commented out code before we merge. Nice work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few bits and bobs on the markdown.
0cd146d
to
9bf5b68
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the PR from the comments so far plus tidied up some dead code.
72e4d43
to
2a8e6ed
Compare
sources/api/netdog/test_data/net_config/bonding/arpmon_no_targets.toml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
3b9b143
to
ae2e422
Compare
7e1ce18
to
dc2ac0f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to save a few thoughts and move on to other things. I'll continue with this PR when I can. Nice work!
impl HasStatic for &NetVlanV1 { | ||
fn has_static(&self) -> bool { | ||
self.static4.is_some() || self.static6.is_some() | ||
} | ||
|
||
fn validate_static4(&self) -> Result<()> { | ||
if let Some(config) = &self.static4 { | ||
config.validate()? | ||
} | ||
Ok(()) | ||
} | ||
|
||
fn validate_static6(&self) -> Result<()> { | ||
if let Some(config) = &self.static6 { | ||
config.validate()? | ||
} | ||
Ok(()) | ||
} | ||
} | ||
|
||
impl HasDhcp for &NetVlanV1 { | ||
fn has_dhcp(&self) -> bool { | ||
self.dhcp4.is_some() || self.dhcp6.is_some() | ||
} | ||
} | ||
|
||
impl HasRoutes for &NetVlanV1 { | ||
fn has_routes(&self) -> bool { | ||
self.routes.is_some() | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, all these helper traits are hurting the clarity of the code.
I'd recommend looking at a derive macro that can be reused across the types, so you can just write:
#[derive(Debug, Deserialize, Validate)]
pub(crate) struct NetVlanV1 {
}
Then reviewers can focus on two things:
- do all the types that need to derive
Validate
? - does the single implementation of
Validate
cover all the combinations?
Right now, it's more like this:
- is
HasRoutes
implemented the same way forNetInterfaceV2
,NetVlanV1
,NetBondV1
? - is
HasStatic
... ? - is
HasDhcp
... ? - do all the structs that implement all three also call
validate_addressing
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, I'll look at implementing a derive macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't go as far as a derive, but I did clean up the code and move most of this boiler plate into a single trait that is generated for those devices that need it. So now someone should only need to reason that this has been added to the right places and the macro correctly generates the logic required to validate IP Addressing.
f41c607
to
740cd11
Compare
[bond1.monitoring] | ||
arpmon-interval = 200 | ||
arpmon-validate = "all" | ||
arpmon-targets = ["192.168.1.1", "10.0.0.2"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these always ipv4? Should you have one ipv6 value in the test data here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I believe they could be IPv6 as well, but I just never saw any examples. I can probably confirm that is valid but the kernel docs only use IPv4 (which might just date when the doc was written).
PROVISIONING-METAL.md
Outdated
* `miimon-downdelay-ms` (int in ms): MII Monitoring delay before the link is disabled after link is no longer detected in milliseconds | ||
* `arpmon-interval-ms` (int in ms): Number of milliseconds between intervals to determine link status, must be greater than 0 | ||
* `arpmon-validate-ms` (enum of `all`, `none`, `active`, `backup`): What packets should be used to validate link | ||
* `arpmon-targets-ms` (list of quoted IPv4 address including prefix): List of targets to use for validating ARP. Min = 1, Max = 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, am I not understanding what the ms
suffix means? I thought it was milliseconds.
* `arpmon-targets-ms` (list of quoted IPv4 address including prefix): List of targets to use for validating ARP. Min = 1, Max = 16 | |
* `arpmon-targets` (list of quoted IPv4 address including prefix): List of targets to use for validating ARP. Min = 1, Max = 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, was too aggressive.
} | ||
} | ||
|
||
#[derive(Clone, Debug, Deserialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are probably a few enums like this in the PR. It can be handy when writing code if "simple" enums implement Copy
since they are nothing but an integer under the hood.
#[derive(Clone, Debug, Deserialize)] | |
#[derive(Clone, Copy, Debug, Deserialize)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll take a look at adding them to the Enums that are "simple" like you call out.
self.addresses.iter().all(|a| matches!(a, IpNet::V4(_))) | ||
|| self.addresses.iter().all(|a| matches!(a, IpNet::V6(_))), | ||
InvalidNetConfigSnafu { | ||
reason: "static configuration must only contain all IPv4 or all IPv6 addresses" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this answers a question I asked elsewhere.
In general I'm sad to see all of this validation code. And I'm sure I was asked for help with this and probably didn't come up with anything, but there's always a way to avoid "validate after construction", especially with Rust. Just something to keep in mind in future designs.
(Also, I realize that making it compatible with the serialization shape is challenging)
Instead of Vec<V4orV6Enum>
...
enum IpV4OrIpV6List {
IpV4(Vec<IpV4Struct>),
IpV6(Vec<IpV6Struct>),
}
Which is one example of eliminating a validate()
call.
#[test] | ||
fn ok_config() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generating test code like this confounds IDE tooling. Just something to keep in mind in the future. One cannot step through code or click on the green "run this test" button with generated code.
That being said, I love the fact that so much testing is taking place and the test configuration files serve as excellent documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, CLion does do some interesting things where you can view the generated file on some failures, I'm not sure how I got it doing it, but it did help. It's not the most ergonomic experience when dealing with these tests but I also don't think I would rather have the alternative. And its great to have tests that can map across versions pretty easily as we move forward.
} | ||
} | ||
|
||
#[allow(clippy::to_string_in_format_args)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you delete the .to_string()
calls do you get some other clippy message? If so that must be a false positive (assuming the types implement Display
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it complains differently about this:
error[E0277]: `interface_name::InterfaceName` doesn't implement `std::fmt::Display`
--> api/netdog/src/net_config/v3.rs:81:33
|
81 | ... interface,
| ^^^^^^^^^ `interface_name::InterfaceName` cannot be formatted with the default formatter
|
= help: the trait `std::fmt::Display` is not implemented for `interface_name::InterfaceName`
= note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
= note: this error originates in the macro `$crate::__export::format_args` which comes from the expansion of the macro `format` (in Nightly builds, run with -Z macro-backtrace for more info)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the "right" thing to do would be to implement Display
for interface_name::InterfaceName
instead of implementing a to_string
function (if that's what you did)
068372d
to
68ad650
Compare
The net.toml version is bumped to 3 to accommodate the new functionality of bonding and vlans. Version 3 does require different naming than version 2 so version net.toml files are not drop in ready for version 3. This also specifies which options are support for bonding and vlans. Bonding is a complex topic and right now, we are making a deliberate choice to support only a subset of bonding before full support is enabled. This mostly comes in the form of only mode 1 (active-primary). VLAN tagging is supported by creating a virtual network device on top of another device. This allows mixing and matching of vlan tags with whatever configuration is required. A user can choose to only configure VLAN Tagged devices for addressing if that is required, or add in vlan tagging on a device for special handling.
This adds the structures and tests that are used to support bonding in net.toml. This will be enabled via a version 3 of the net_config module in a future commit. The current implementation only supports mode 1 (active-backup) for bonding and a handful of settings beyond the minimum set required for bonding. The code is ready for more modes and settings to be added when the need arises.
This adds the stuctures to specify a vlan in net.toml. This allows deserialization and serialization of vlan structures along with some tests.
This adds version 3 of network configuration along with bonding and vlan support end to end from net.toml to wicked xml. The support now works as intended for both vlan virtual devices and bonding devices on metal. This also moves around some of the device files into a new module so that future expansion of devices is seperated from expansion of network configuration versions. This also changes the validation logic to add in traits that allow a generic IP Addressing validation function to run on any network device that accepts addressing.
68ad650
to
9e43a1a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
Let's open an issue re: better error messaging when deserializing untagged enums. Pulling that in will allow us to move all validation into Deserialize
implementations and remove the Validate
trait since our error messages won't be masked by the top-level enum deserialize.
I added #2657 to track the work to fix the error messaging for enums so we can move all the rest of the validation into the Deserialization. |
Issue number:
Closes #2369
Description of changes:
This will add bonding and vlan tagging support to bottlerocket. There is a lot of code here mostly because we needed to bump the configuration version to 3 and refactor a bunch of logic since there are now different types (kinds) of devices which have different validation logic, testing concerns, etc. This should allow us to add more virtual network devices in the future without so much refactoring. It also moves some of the device-specific code into a new
devices
module for easier understanding of what is anet_config
concern and what is a "network device" concern.Testing done:
I've validated this on hardware and qemu. (More testing will come as the PR progresses) The tests all pass as well right now. I also ran clippy to cover those bases as well.
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.