Merge pull request #1658 from thomasferrandiz/useClusterCIDRConfig
Implement MultiClusterCIDR feature in flannel
thomasferrandiz authored Dec 13, 2022
2 parents a2976e8 + 243cf7e commit 025e7a0
Showing 22 changed files with 866 additions and 191 deletions.
137 changes: 137 additions & 0 deletions Documentation/MultiClusterCIDR/README.md
@@ -0,0 +1,137 @@
Flannel provides experimental support for the new [MultiClusterCIDR API](/~https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2593-multiple-cluster-cidrs) introduced as an alpha feature in Kubernetes 1.26.

## Prerequisites
* A cluster running Kubernetes 1.26 (this was tested on version `1.26.0-alpha.1`)
* Flannel version `0.21.0` or later
* The MultiClusterCIDR API can be used with the vxlan, wireguard and host-gw backends

*Note*: once a PodCIDR has been allocated to a node, it cannot be modified or removed, so you need to configure the MultiClusterCIDR before adding new nodes to your cluster.

## How to use the MultiClusterCIDR API
### Enable the new API in the control plane
* Edit `/etc/kubernetes/manifests/kube-controller-manager.yaml` and add the following lines in the `spec.containers.command` section:
```
- --cidr-allocator-type=MultiCIDRRangeAllocator
- --feature-gates=MultiCIDRRangeAllocator=true
```

* Edit `/etc/kubernetes/manifests/kube-apiserver.yaml` and add the following line in the `spec.containers.command` section:
```
- --runtime-config=networking.k8s.io/v1alpha1
```

Both components should restart automatically, and a default ClusterCIDR resource will be created based on the usual `--pod-network-cidr` parameter.

For example:
```bash
$ kubectl get clustercidr
NAME                   PERNODEHOSTBITS   IPV4            IPV6                 AGE
default-cluster-cidr   8                 10.244.0.0/16   2001:cafe:42::/112   24h

$ kubectl describe clustercidr default-cluster-cidr
Name:             default-cluster-cidr
Labels:           <none>
Annotations:      <none>
NodeSelector:
PerNodeHostBits:  8
IPv4:             10.244.0.0/16
IPv6:             2001:cafe:42::/112
Events:           <none>
```
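
If the default `clustercidr` resource does not show up, you can check that the alpha API is actually being served (a quick sanity check, assuming `kubectl` access to the cluster):
```bash
# The alpha networking group should be listed among the served API versions
kubectl api-versions | grep networking.k8s.io/v1alpha1

# The ClusterCIDR resource should appear in the resource list
kubectl api-resources | grep -i clustercidr
```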

### Enable the new feature in flannel
This feature is disabled by default. To enable it, add the following flag to the args of the `kube-flannel` container:
```
- --use-multi-cluster-cidr
```
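
As a sanity check, you can confirm that the running DaemonSet picked up the flag. The namespace and DaemonSet name below are the ones used in the reference `kube-flannel.yml` manifest and may differ in your deployment:
```bash
# Look for the new flag in the kube-flannel container args
kubectl -n kube-flannel get daemonset kube-flannel-ds -o yaml | grep use-multi-cluster-cidr
```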

Since you will specify the subnets to use for pod IP addresses through the new API, you do not need the `Network` and `IPv6Network` sections in the flannel configuration. Your flannel configuration could thus look like this:
```json
{
  "EnableIPv6": true,
  "Backend": {
    "Type": "host-gw"
  }
}
```


If you leave them in, flannel will simply ignore them.
*Note*: this only applies when using the MultiClusterCIDR API.

### Configure the required `clustercidr` resources
Before adding nodes to the cluster, you need to add new `clustercidr` resources.

For example:
```yaml
apiVersion: networking.k8s.io/v1alpha1
kind: ClusterCIDR
metadata:
  name: my-cidr-1
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - "worker1"
  perNodeHostBits: 8
  ipv4: 10.248.0.0/16
  ipv6: 2001:cafe:43::/112
---
apiVersion: networking.k8s.io/v1alpha1
kind: ClusterCIDR
metadata:
  name: my-cidr-2
spec:
  nodeSelector:
    nodeSelectorTerms:
    - matchExpressions:
      - key: kubernetes.io/hostname
        operator: In
        values:
        - "worker2"
  perNodeHostBits: 8
  ipv4: 10.247.0.0/16
  ipv6: ""
```
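
Once saved to a file (the filename below is only an example), the resources can be created with `kubectl` and checked the same way as the default one:
```bash
# Create the ClusterCIDR resources before joining the new nodes
kubectl apply -f my-clustercidrs.yaml
kubectl get clustercidr
```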
For more details on the `spec` section, see the [feature specification page](/~https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/2593-multiple-cluster-cidrs#expected-behavior).

*WARNING*: all the fields in the `spec` section are immutable.

For more information on Node Selectors, see [the Kubernetes documentation](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/).

### Add nodes to the cluster
The new nodes will be allocated a `PodCIDR` based on the configured `clustercidr` resources.
flannel will ensure connectivity between all the pods regardless of the subnet from which each pod's IP address was allocated.
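
To verify the allocation, you can inspect the PodCIDRs assigned to a node once it has joined (the node name below is illustrative):
```bash
# Show the PodCIDRs allocated to a node by the MultiCIDRRangeAllocator
kubectl get node worker1 -o jsonpath='{.spec.podCIDRs}'
```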

## Notes on the subnet.env file
flanneld writes a file (located by default at `/run/flannel/subnet.env`) that is used by the flannel CNI plugin, which is called by the kubelet every time a pod is added to or removed from the node. This file changes slightly with the new API: the `FLANNEL_NETWORK` and `FLANNEL_IPV6_NETWORK` entries become lists of CIDRs instead of a single CIDR entry. They hold the list of CIDRs declared in the `clustercidr` resources of the API. The file is updated by flanneld every time a new `clustercidr` is created.

As an example, it could look like this:
```bash
FLANNEL_NETWORK=10.42.0.0/16,192.168.0.0/16
FLANNEL_SUBNET=10.42.0.1/24
FLANNEL_IPV6_NETWORK=2001:cafe:42::/56
FLANNEL_IPV6_SUBNET=2001:cafe:42::1/64,2001:cafd:42::1/64
FLANNEL_MTU=1450
FLANNEL_IPMASQ=true
```
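
Anything that consumes this file should therefore be prepared to handle comma-separated lists in these two fields. A minimal shell sketch, assuming the default file location:
```bash
# Source the flannel environment file and iterate over the IPv4 network list
. /run/flannel/subnet.env
IFS=',' read -ra networks <<< "$FLANNEL_NETWORK"
for cidr in "${networks[@]}"; do
  echo "flannel IPv4 network: $cidr"
done
```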

## Notes on using IPv6 with the MultiClusterCIDR API
The feature is fully compatible with IPv6 and dual-stack networking.
Each `clustercidr` resource can include an IPv4 and/or an IPv6 subnet.
If both are provided, the PodCIDR allocated based on this `clustercidr` will be dual-stack.
The controller allows you to use IPv4, IPv6 and dual-stack `clustercidr` resources at the same time to facilitate cluster migrations.
As a result, it is up to you to ensure the consistency of your IP allocation.

If you want to use dual-stack networking with the new API, we recommend that you do not specify the `--pod-network-cidr` flag to `kubeadm` when installing the cluster so that you can manually configure the controller later.
In that case, when you edit `/etc/kubernetes/manifests/kube-controller-manager.yaml`, add:
```
- --cidr-allocator-type=MultiCIDRRangeAllocator
- --feature-gates=MultiCIDRRangeAllocator=true
- --cluster-cidr=10.244.0.0/16,2001:cafe:42::/112 #replace with your own default clusterCIDR
- --node-cidr-mask-size-ipv6=120
- --allocate-node-cidrs
```
4 changes: 2 additions & 2 deletions Documentation/kube-flannel-psp.yml
@@ -166,8 +166,8 @@ spec:
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        #image: flannelcni/flannel-cni-plugin:v1.1.0 for ppc64le and mips64le (dockerhub limitations may apply)
        image: docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
        #image: flannelcni/flannel-cni-plugin:v1.1.2 for ppc64le and mips64le (dockerhub limitations may apply)
        image: docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.2
        command:
        - cp
        args:
15 changes: 11 additions & 4 deletions Documentation/kube-flannel.yml
@@ -31,6 +31,13 @@ rules:
  - nodes/status
  verbs:
  - patch
- apiGroups:
  - "networking.k8s.io"
  resources:
  - clustercidrs
  verbs:
  - list
  - watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
@@ -123,8 +130,8 @@ spec:
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        #image: flannelcni/flannel-cni-plugin:v1.1.0 for ppc64le and mips64le (dockerhub limitations may apply)
        image: docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.0
        #image: flannelcni/flannel-cni-plugin:v1.1.2 #for ppc64le and mips64le (dockerhub limitations may apply)
        image: docker.io/rancher/mirrored-flannelcni-flannel-cni-plugin:v1.1.2
        command:
        - cp
        args:
@@ -135,7 +142,7 @@ spec:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        #image: flannelcni/flannel:v0.20.2 for ppc64le and mips64le (dockerhub limitations may apply)
        #image: flannelcni/flannel:v0.20.2 #for ppc64le and mips64le (dockerhub limitations may apply)
        image: docker.io/rancher/mirrored-flannelcni-flannel:v0.20.2
        command:
        - cp
@@ -150,7 +157,7 @@ spec:
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        #image: flannelcni/flannel:v0.20.2 for ppc64le and mips64le (dockerhub limitations may apply)
        #image: flannelcni/flannel:v0.20.2 #for ppc64le and mips64le (dockerhub limitations may apply)
        image: docker.io/rancher/mirrored-flannelcni-flannel:v0.20.2
        command:
        - /opt/bin/flanneld
7 changes: 6 additions & 1 deletion backend/ipip/ipip.go
@@ -89,7 +89,12 @@ func (be *IPIPBackend) RegisterNetwork(ctx context.Context, wg *sync.WaitGroup,
        return nil, fmt.Errorf("failed to acquire lease: %v", err)
    }

    link, err := be.configureIPIPDevice(n.SubnetLease, subnet.GetFlannelNetwork(config))
    net, err := config.GetFlannelNetwork(&n.SubnetLease.Subnet)
    if err != nil {
        return nil, err
    }

    link, err := be.configureIPIPDevice(n.SubnetLease, net)

    if err != nil {
        return nil, err
6 changes: 5 additions & 1 deletion backend/udp/udp_amd64.go
@@ -78,11 +78,15 @@ func (be *UdpBackend) RegisterNetwork(ctx context.Context, wg *sync.WaitGroup, c
        return nil, fmt.Errorf("failed to acquire lease: %v", err)
    }

    net, err := config.GetFlannelNetwork(&l.Subnet)
    if err != nil {
        return nil, err
    }
    // Tunnel's subnet is that of the whole overlay network (e.g. /16)
    // and not that of the individual host (e.g. /24)
    tunNet := ip.IP4Net{
        IP:        l.Subnet.IP,
        PrefixLen: subnet.GetFlannelNetwork(config).PrefixLen,
        PrefixLen: net.PrefixLen,
    }

    return newNetwork(be.sm, be.extIface, cfg.Port, tunNet, l)
12 changes: 10 additions & 2 deletions backend/vxlan/vxlan.go
@@ -191,12 +191,20 @@ func (be *VXLANBackend) RegisterNetwork(ctx context.Context, wg *sync.WaitGroup,
    // This IP is just used as a source address for host to workload traffic (so
    // the return path for the traffic has an address on the flannel network to use as the destination)
    if config.EnableIPv4 {
        if err := dev.Configure(ip.IP4Net{IP: lease.Subnet.IP, PrefixLen: 32}, subnet.GetFlannelNetwork(config)); err != nil {
        net, err := config.GetFlannelNetwork(&lease.Subnet)
        if err != nil {
            return nil, err
        }
        if err := dev.Configure(ip.IP4Net{IP: lease.Subnet.IP, PrefixLen: 32}, net); err != nil {
            return nil, fmt.Errorf("failed to configure interface %s: %w", dev.link.Attrs().Name, err)
        }
    }
    if config.EnableIPv6 {
        if err := v6Dev.ConfigureIPv6(ip.IP6Net{IP: lease.IPv6Subnet.IP, PrefixLen: 128}, subnet.GetFlannelIPv6Network(config)); err != nil {
        net, err := config.GetFlannelIPv6Network(&lease.IPv6Subnet)
        if err != nil {
            return nil, err
        }
        if err := v6Dev.ConfigureIPv6(ip.IP6Net{IP: lease.IPv6Subnet.IP, PrefixLen: 128}, net); err != nil {
            return nil, fmt.Errorf("failed to configure interface %s: %w", v6Dev.link.Attrs().Name, err)
        }
    }
11 changes: 10 additions & 1 deletion backend/wireguard/device.go
@@ -219,12 +219,21 @@ func (dev *wgDevice) upAndAddRoute(dst *net.IPNet) error {
        return fmt.Errorf("failed to set interface %s to UP state: %w", dev.attrs.name, err)
    }

    err = dev.addRoute(dst)
    if err != nil {
        return fmt.Errorf("failed to add route to destination (%s) to interface (%s): %w", dst, dev.attrs.name, err)
    }
    return nil
}

func (dev *wgDevice) addRoute(dst *net.IPNet) error {
    route := netlink.Route{
        LinkIndex: dev.link.Attrs().Index,
        Scope:     netlink.SCOPE_LINK,
        Dst:       dst,
    }
    err = netlink.RouteAdd(&route)

    err := netlink.RouteAdd(&route)
    if err != nil {
        return fmt.Errorf("failed to add route %s: %w", dev.attrs.name, err)
    }
16 changes: 12 additions & 4 deletions backend/wireguard/wireguard.go
@@ -149,7 +149,7 @@ func (be *WireguardBackend) RegisterNetwork(ctx context.Context, wg *sync.WaitGr
        }
        publicKey = dev.attrs.publicKey.String()
    } else {
        return nil, fmt.Errorf("No valid Mode configured")
        return nil, fmt.Errorf("no valid Mode configured")
    }

    subnetAttrs, err := newSubnetAttrs(be.extIface.ExtAddr, be.extIface.ExtV6Addr, config.EnableIPv4, config.EnableIPv6, publicKey)
@@ -168,17 +168,25 @@ func (be *WireguardBackend) RegisterNetwork(ctx context.Context, wg *sync.WaitGr
    }

    if config.EnableIPv4 {
        err = dev.Configure(lease.Subnet.IP, subnet.GetFlannelNetwork(config))
        net, err := config.GetFlannelNetwork(&lease.Subnet)
        if err != nil {
            return nil, err
        }
        err = dev.Configure(lease.Subnet.IP, net)
        if err != nil {
            return nil, err
        }
    }

    if config.EnableIPv6 {
        ipv6net, err := config.GetFlannelIPv6Network(&lease.IPv6Subnet)
        if err != nil {
            return nil, err
        }
        if cfg.Mode == Separate {
            err = v6Dev.ConfigureV6(lease.IPv6Subnet.IP, subnet.GetFlannelIPv6Network(config))
            err = v6Dev.ConfigureV6(lease.IPv6Subnet.IP, ipv6net)
        } else {
            err = dev.ConfigureV6(lease.IPv6Subnet.IP, subnet.GetFlannelIPv6Network(config))
            err = dev.ConfigureV6(lease.IPv6Subnet.IP, ipv6net)
        }
        if err != nil {
            return nil, err