kernel: Only build metal specific drivers for metal variants #2279

Merged: 1 commit merged into bottlerocket-os:develop on Jul 19, 2022

foersleo

Adjust kernel spec files to only merge `config-bottlerocket-metal`
config fragment when the variant contains the substring `metal`. To
make this work we need to force a rebuild of the kernel for each variant
by setting the `variant-sensitive` option.

Signed-off-by: Leonard Foerster <foersleo@amazon.com>

Issue number: #2211

Description of changes:

Make the kernels we use in metal variants (kernel-5.10 and kernel-5.15) variant-sensitive, which forces a rebuild on every variant switch even if the code has not changed. Merge the metal-specific config fragment only when building a kernel for a metal variant.

This is not an ideal solution, as it rebuilds the kernel even when no rebuild is necessary, for example when switching from aws-k8s-1.21 to aws-k8s-1.22, which use the same kernel. In practice, the overhead of rebuilding the kernel for each variant is smaller than I anticipated (see testing below).
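
As a rough illustration of the substring check described above, here is a minimal shell sketch. It is not the spec file logic itself: the helper name `config_fragments_for_variant` is hypothetical, and the base fragment name `config-bottlerocket` is an assumption. Only `config-bottlerocket-metal` is named in this PR.

```shell
#!/bin/bash
# Hypothetical helper: print the config fragments to merge for a variant.
# The base fragment name "config-bottlerocket" is an assumption.
config_fragments_for_variant() {
    echo "config-bottlerocket"
    case "$1" in
        # Substring match: any variant name containing "metal".
        *metal*) echo "config-bottlerocket-metal" ;;
    esac
}

config_fragments_for_variant "aws-k8s-1.22"    # base fragment only
config_fragments_for_variant "metal-k8s-1.22"  # base plus metal fragment
```

Because the decision depends on the variant name alone, a kernel package built once and reused across variants would carry whichever fragment set the first variant selected, which is why the package has to be variant-sensitive.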

Testing done:

  • Build time
    To examine the impact of rebuilding the kernel on each variant switch, I built a set of variants once with the patch and once without, using the following script:
#!/bin/bash

declare -a variants=(aws-k8s-1.20
aws-k8s-1.21
aws-k8s-1.22
metal-k8s-1.21
metal-k8s-1.22
metal-k8s-1.23)

cargo make clean

for v in "${variants[@]}"
do
    echo ">>>>>> Building variant: ${v}"

    cargo make -e "BUILDSYS_VARIANT=${v}"
done

Grepping for the interesting lines shows the build time per variant and whether a kernel was built.
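
The exact grep invocation is not recorded in this PR. A filter along the following lines would produce output of the same shape. The function name `filter_build_log` is hypothetical, and the patterns are assumptions based on the log lines shown in the results.

```shell
#!/bin/bash
# Hypothetical filter: keep the variant banner, any kernel compile line,
# and the total build time reported by cargo-make. Reads the log on stdin.
filter_build_log() {
    grep -E '^>>>>>> Building variant:|Compiling kernel-|Build Done in'
}
```

The build script's combined stdout and stderr can be piped through this function, or it can be run over a saved log file.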

without patch:

>>>>>> Building variant: aws-k8s-1.20
   Compiling kernel-5_10 v0.1.0 (/home/fedora/src/bottlerocket/packages/kernel-5.10)
[cargo-make] INFO - Build Done in 776.62 seconds.
>>>>>> Building variant: aws-k8s-1.21
[cargo-make] INFO - Build Done in 331.52 seconds.
>>>>>> Building variant: aws-k8s-1.22
[cargo-make] INFO - Build Done in 333.62 seconds.
>>>>>> Building variant: metal-k8s-1.21
[cargo-make] INFO - Build Done in 316.58 seconds.
>>>>>> Building variant: metal-k8s-1.22
[cargo-make] INFO - Build Done in 316.18 seconds.
>>>>>> Building variant: metal-k8s-1.23
[cargo-make] INFO - Build Done in 325.77 seconds.

with patch:

>>>>>> Building variant: aws-k8s-1.20
   Compiling kernel-5_10 v0.1.0 (/home/fedora/src/bottlerocket/packages/kernel-5.10)
[cargo-make] INFO - Build Done in 774.20 seconds.
>>>>>> Building variant: aws-k8s-1.21
   Compiling kernel-5_10 v0.1.0 (/home/fedora/src/bottlerocket/packages/kernel-5.10)
[cargo-make] INFO - Build Done in 366.73 seconds.
>>>>>> Building variant: aws-k8s-1.22
   Compiling kernel-5_10 v0.1.0 (/home/fedora/src/bottlerocket/packages/kernel-5.10)
[cargo-make] INFO - Build Done in 372.68 seconds.
>>>>>> Building variant: metal-k8s-1.21
   Compiling kernel-5_10 v0.1.0 (/home/fedora/src/bottlerocket/packages/kernel-5.10)
[cargo-make] INFO - Build Done in 354.89 seconds.
>>>>>> Building variant: metal-k8s-1.22
   Compiling kernel-5_10 v0.1.0 (/home/fedora/src/bottlerocket/packages/kernel-5.10)
[cargo-make] INFO - Build Done in 358.23 seconds.
>>>>>> Building variant: metal-k8s-1.23
   Compiling kernel-5_10 v0.1.0 (/home/fedora/src/bottlerocket/packages/kernel-5.10)
[cargo-make] INFO - Build Done in 375.42 seconds.

We see roughly a 30 to 40 second increase in build times with the per-variant rebuild. I would have expected more impact here, but there are probably things about buildsys and some caching effects that I do not fully understand yet.
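
The per-variant difference can also be computed from the two filtered logs rather than read off by eye. The following is a sketch only: the function name `build_time_delta` is hypothetical, and it assumes each log has the banner and "Build Done" lines in the format shown above.

```shell
#!/bin/bash
# Hypothetical helper: print "<variant> <delta seconds>" for each variant
# present in both logs. $1 = log without the patch, $2 = log with the patch.
build_time_delta() {
    awk '
        /^>>>>>> Building variant:/ { variant = $NF }
        /Build Done in/ {
            # The seconds value is the field right after the word "in".
            for (i = 1; i < NF; i++) if ($i == "in") secs = $(i + 1)
            if (FNR == NR) { base[variant] = secs; order[++n] = variant }
            else           { patched[variant] = secs }
        }
        END {
            for (k = 1; k <= n; k++) {
                v = order[k]
                if (v in patched) printf "%s %+.2f\n", v, patched[v] - base[v]
            }
        }
    ' "$1" "$2"
}
```

With the numbers reported above, aws-k8s-1.21 for instance comes out at +35.21 seconds (366.73 versus 331.52).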

With regard to the configurations, we see the changes we would expect. For the metal variant, there is no difference between building with or without this patch:

$ diff -U3 config-bottlerocket-x86_64-kernel-5.15-5.15.43-1.x86_64_metal_without_patch config-bottlerocket-x86_64-kernel-5.15-5.15.43-1.x86_64_metal_with_patch
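
The empty output above means the two files are identical. In a script, the same check can rely on the exit status of `diff` instead of its output. A small sketch, with `configs_identical` as a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the two config files are identical.
# diff exits 0 when the files match, 1 when they differ, and >1 on error.
configs_identical() {
    diff -q "$1" "$2" > /dev/null
}
```

It can be used as `configs_identical old.config new.config && echo "no change"`, where the two file names are placeholders.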

For non-metal variants we can see the expected difference in the configs:

  • building aws-k8s-1.22 for kernel-5.10:
$ diff -U3 config-bottlerocket-x86_64-kernel-5.10-5.10.118-1.x86_64_aws_without_patch config-bottlerocket-x86_64-kernel-5.10-5.10.118-1.x86_64_aws_with_patch 
--- config-bottlerocket-x86_64-kernel-5.10-5.10.118-1.x86_64_aws_without_patch	2022-07-15 11:27:53.000000000 +0000
+++ config-bottlerocket-x86_64-kernel-5.10-5.10.118-1.x86_64_aws_with_patch	2022-07-15 09:19:07.790100495 +0000
@@ -1653,7 +1653,7 @@
 CONFIG_MPLS_IPTUNNEL=m
 CONFIG_NET_NSH=m
 # CONFIG_HSR is not set
-CONFIG_NET_SWITCHDEV=y
+# CONFIG_NET_SWITCHDEV is not set
 CONFIG_NET_L3_MASTER_DEV=y
 # CONFIG_QRTR is not set
 # CONFIG_NET_NCSI is not set
@@ -1958,9 +1958,9 @@
 #
 # SCSI device support
 #
-CONFIG_SCSI_MOD=y
-CONFIG_RAID_ATTRS=y
-CONFIG_SCSI=y
+CONFIG_SCSI_MOD=m
+CONFIG_RAID_ATTRS=m
+CONFIG_SCSI=m
 CONFIG_SCSI_DMA=y
 CONFIG_SCSI_NETLINK=y
 CONFIG_SCSI_PROC_FS=y
@@ -1968,7 +1968,7 @@
 #
 # SCSI support type (disk, tape, CD-ROM)
 #
-CONFIG_BLK_DEV_SD=y
+CONFIG_BLK_DEV_SD=m
 CONFIG_CHR_DEV_ST=m
 CONFIG_BLK_DEV_SR=m
 CONFIG_CHR_DEV_SG=m
@@ -1983,7 +1983,7 @@
 CONFIG_SCSI_SPI_ATTRS=m
 CONFIG_SCSI_FC_ATTRS=m
 CONFIG_SCSI_ISCSI_ATTRS=m
-CONFIG_SCSI_SAS_ATTRS=y
+CONFIG_SCSI_SAS_ATTRS=m
 # CONFIG_SCSI_SAS_LIBSAS is not set
 # CONFIG_SCSI_SRP_ATTRS is not set
 # end of SCSI Transports
@@ -2013,12 +2013,12 @@
 # CONFIG_SCSI_ESAS2R is not set
 # CONFIG_MEGARAID_NEWGEN is not set
 # CONFIG_MEGARAID_LEGACY is not set
-CONFIG_MEGARAID_SAS=y
+# CONFIG_MEGARAID_SAS is not set
 CONFIG_SCSI_MPT3SAS=m
 CONFIG_SCSI_MPT2SAS_MAX_SGE=128
 CONFIG_SCSI_MPT3SAS_MAX_SGE=128
 CONFIG_SCSI_MPT2SAS=m
-CONFIG_SCSI_SMARTPQI=y
+CONFIG_SCSI_SMARTPQI=m
 # CONFIG_SCSI_UFSHCD is not set
 # CONFIG_SCSI_HPTIOP is not set
 CONFIG_SCSI_BUSLOGIC=m
@@ -2061,7 +2061,7 @@
 # CONFIG_SCSI_DH is not set
 # end of SCSI device support
 
-CONFIG_ATA=y
+CONFIG_ATA=m
 CONFIG_SATA_HOST=y
 CONFIG_PATA_TIMINGS=y
 CONFIG_ATA_VERBOSE_ERROR=y
@@ -2073,7 +2073,7 @@
 #
 # Controllers with non-SFF native interface
 #
-CONFIG_SATA_AHCI=y
+CONFIG_SATA_AHCI=m
 CONFIG_SATA_MOBILE_LPM_POLICY=0
 # CONFIG_SATA_AHCI_PLATFORM is not set
 # CONFIG_SATA_INIC162X is not set
@@ -2092,7 +2092,7 @@
 #
 # SATA SFF controllers with BMDMA
 #
-CONFIG_ATA_PIIX=y
+CONFIG_ATA_PIIX=m
 # CONFIG_SATA_DWC is not set
 # CONFIG_SATA_MV is not set
 # CONFIG_SATA_NV is not set
@@ -2280,7 +2280,6 @@
 # end of Distributed Switch Architecture drivers
 
 CONFIG_ETHERNET=y
-CONFIG_MDIO=m
 # CONFIG_NET_VENDOR_3COM is not set
 # CONFIG_NET_VENDOR_ADAPTEC is not set
 # CONFIG_NET_VENDOR_AGERE is not set
@@ -2293,20 +2292,7 @@
 # CONFIG_NET_VENDOR_ARC is not set
 # CONFIG_NET_VENDOR_ATHEROS is not set
 # CONFIG_NET_VENDOR_AURORA is not set
-CONFIG_NET_VENDOR_BROADCOM=y
-# CONFIG_B44 is not set
-# CONFIG_BCMGENET is not set
-# CONFIG_BNX2 is not set
-# CONFIG_CNIC is not set
-CONFIG_TIGON3=m
-CONFIG_TIGON3_HWMON=y
-# CONFIG_BNX2X is not set
-# CONFIG_SYSTEMPORT is not set
-CONFIG_BNXT=m
-CONFIG_BNXT_SRIOV=y
-CONFIG_BNXT_FLOWER_OFFLOAD=y
-# CONFIG_BNXT_DCB is not set
-CONFIG_BNXT_HWMON=y
+# CONFIG_NET_VENDOR_BROADCOM is not set
 # CONFIG_NET_VENDOR_BROCADE is not set
 CONFIG_NET_VENDOR_CADENCE=y
 # CONFIG_MACB is not set
@@ -2328,17 +2314,13 @@
 # CONFIG_E100 is not set
 CONFIG_E1000=m
 CONFIG_E1000E=m
-CONFIG_E1000E_HWTS=y
+# CONFIG_E1000E_HWTS is not set
 CONFIG_IGB=m
 CONFIG_IGB_HWMON=y
 CONFIG_IGB_DCA=y
-CONFIG_IGBVF=m
-CONFIG_IXGB=m
-CONFIG_IXGBE=m
-CONFIG_IXGBE_HWMON=y
-CONFIG_IXGBE_DCA=y
-CONFIG_IXGBE_DCB=y
-CONFIG_IXGBE_IPSEC=y
+# CONFIG_IGBVF is not set
+# CONFIG_IXGB is not set
+# CONFIG_IXGBE is not set
 CONFIG_IXGBEVF=m
 CONFIG_IXGBEVF_IPSEC=y
 # CONFIG_I40E is not set
@@ -2348,22 +2330,7 @@
 # CONFIG_IGC is not set
 # CONFIG_JME is not set
 # CONFIG_NET_VENDOR_MARVELL is not set
-CONFIG_NET_VENDOR_MELLANOX=y
-# CONFIG_MLX4_EN is not set
-CONFIG_MLX5_CORE=m
-# CONFIG_MLX5_FPGA is not set
-CONFIG_MLX5_CORE_EN=y
-CONFIG_MLX5_EN_ARFS=y
-CONFIG_MLX5_EN_RXNFC=y
-CONFIG_MLX5_MPFS=y
-CONFIG_MLX5_ESWITCH=y
-CONFIG_MLX5_CLS_ACT=y
-CONFIG_MLX5_CORE_EN_DCB=y
-# CONFIG_MLX5_CORE_IPOIB is not set
-# CONFIG_MLX5_IPSEC is not set
-CONFIG_MLX5_SW_STEERING=y
-# CONFIG_MLXSW_CORE is not set
-CONFIG_MLXFW=m
+# CONFIG_NET_VENDOR_MELLANOX is not set
 # CONFIG_NET_VENDOR_MICREL is not set
 # CONFIG_NET_VENDOR_MICROCHIP is not set
 # CONFIG_NET_VENDOR_MICROSEMI is not set
@@ -3978,7 +3945,6 @@
 # CONFIG_INFINIBAND_MTHCA is not set
 # CONFIG_INFINIBAND_EFA is not set
 # CONFIG_MLX4_INFINIBAND is not set
-CONFIG_MLX5_INFINIBAND=m
 # CONFIG_INFINIBAND_OCRDMA is not set
 # CONFIG_INFINIBAND_VMWARE_PVRDMA is not set
 # CONFIG_INFINIBAND_USNIC is not set
  • building aws-k8s-1.22 for kernel-5.15:
--- config-bottlerocket-x86_64-kernel-5.15-5.15.43-1.x86_64_aws_without_patch	2022-07-15 11:44:43.000000000 +0000
+++ config-bottlerocket-x86_64-kernel-5.15-5.15.43-1.x86_64_aws_with_patch	2022-07-15 09:19:07.796100464 +0000
@@ -1647,7 +1647,7 @@
 CONFIG_MPLS_IPTUNNEL=m
 CONFIG_NET_NSH=m
 # CONFIG_HSR is not set
-CONFIG_NET_SWITCHDEV=y
+# CONFIG_NET_SWITCHDEV is not set
 CONFIG_NET_L3_MASTER_DEV=y
 # CONFIG_QRTR is not set
 # CONFIG_NET_NCSI is not set
@@ -1788,7 +1788,6 @@
 #
 # Generic Driver Options
 #
-CONFIG_AUXILIARY_BUS=y
 CONFIG_UEVENT_HELPER=y
 CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_DEVTMPFS=y
@@ -2010,10 +2009,10 @@
 #
 # SCSI device support
 #
-CONFIG_SCSI_MOD=y
-CONFIG_RAID_ATTRS=y
-CONFIG_SCSI_COMMON=y
-CONFIG_SCSI=y
+CONFIG_SCSI_MOD=m
+CONFIG_RAID_ATTRS=m
+CONFIG_SCSI_COMMON=m
+CONFIG_SCSI=m
 CONFIG_SCSI_DMA=y
 CONFIG_SCSI_NETLINK=y
 CONFIG_SCSI_PROC_FS=y
@@ -2021,7 +2020,7 @@
 #
 # SCSI support type (disk, tape, CD-ROM)
 #
-CONFIG_BLK_DEV_SD=y
+CONFIG_BLK_DEV_SD=m
 CONFIG_CHR_DEV_ST=m
 CONFIG_BLK_DEV_SR=m
 CONFIG_CHR_DEV_SG=m
@@ -2037,7 +2036,7 @@
 CONFIG_SCSI_SPI_ATTRS=m
 CONFIG_SCSI_FC_ATTRS=m
 CONFIG_SCSI_ISCSI_ATTRS=m
-CONFIG_SCSI_SAS_ATTRS=y
+CONFIG_SCSI_SAS_ATTRS=m
 # CONFIG_SCSI_SAS_LIBSAS is not set
 # CONFIG_SCSI_SRP_ATTRS is not set
 # end of SCSI Transports
@@ -2067,13 +2066,13 @@
 # CONFIG_SCSI_ESAS2R is not set
 # CONFIG_MEGARAID_NEWGEN is not set
 # CONFIG_MEGARAID_LEGACY is not set
-CONFIG_MEGARAID_SAS=y
+# CONFIG_MEGARAID_SAS is not set
 CONFIG_SCSI_MPT3SAS=m
 CONFIG_SCSI_MPT2SAS_MAX_SGE=128
 CONFIG_SCSI_MPT3SAS_MAX_SGE=128
 CONFIG_SCSI_MPT2SAS=m
 # CONFIG_SCSI_MPI3MR is not set
-CONFIG_SCSI_SMARTPQI=y
+CONFIG_SCSI_SMARTPQI=m
 # CONFIG_SCSI_UFSHCD is not set
 # CONFIG_SCSI_HPTIOP is not set
 CONFIG_SCSI_BUSLOGIC=m
@@ -2116,7 +2115,7 @@
 # CONFIG_SCSI_DH is not set
 # end of SCSI device support
 
-CONFIG_ATA=y
+CONFIG_ATA=m
 CONFIG_SATA_HOST=y
 CONFIG_PATA_TIMINGS=y
 CONFIG_ATA_VERBOSE_ERROR=y
@@ -2128,7 +2127,7 @@
 #
 # Controllers with non-SFF native interface
 #
-CONFIG_SATA_AHCI=y
+CONFIG_SATA_AHCI=m
 CONFIG_SATA_MOBILE_LPM_POLICY=0
 # CONFIG_SATA_AHCI_PLATFORM is not set
 # CONFIG_SATA_INIC162X is not set
@@ -2147,7 +2146,7 @@
 #
 # SATA SFF controllers with BMDMA
 #
-CONFIG_ATA_PIIX=y
+CONFIG_ATA_PIIX=m
 # CONFIG_SATA_DWC is not set
 # CONFIG_SATA_MV is not set
 # CONFIG_SATA_NV is not set
@@ -2330,7 +2329,6 @@
 # CONFIG_VSOCKMON is not set
 # CONFIG_ARCNET is not set
 CONFIG_ETHERNET=y
-CONFIG_MDIO=m
 # CONFIG_NET_VENDOR_3COM is not set
 # CONFIG_NET_VENDOR_ADAPTEC is not set
 # CONFIG_NET_VENDOR_AGERE is not set
@@ -2343,20 +2341,7 @@
 # CONFIG_NET_VENDOR_ARC is not set
 # CONFIG_NET_VENDOR_ATHEROS is not set
 # CONFIG_CX_ECAT is not set
-CONFIG_NET_VENDOR_BROADCOM=y
-# CONFIG_B44 is not set
-# CONFIG_BCMGENET is not set
-# CONFIG_BNX2 is not set
-# CONFIG_CNIC is not set
-CONFIG_TIGON3=m
-CONFIG_TIGON3_HWMON=y
-# CONFIG_BNX2X is not set
-# CONFIG_SYSTEMPORT is not set
-CONFIG_BNXT=m
-CONFIG_BNXT_SRIOV=y
-CONFIG_BNXT_FLOWER_OFFLOAD=y
-# CONFIG_BNXT_DCB is not set
-CONFIG_BNXT_HWMON=y
+# CONFIG_NET_VENDOR_BROADCOM is not set
 CONFIG_NET_VENDOR_CADENCE=y
 # CONFIG_MACB is not set
 # CONFIG_NET_VENDOR_CAVIUM is not set
@@ -2376,17 +2361,13 @@
 # CONFIG_E100 is not set
 CONFIG_E1000=m
 CONFIG_E1000E=m
-CONFIG_E1000E_HWTS=y
+# CONFIG_E1000E_HWTS is not set
 CONFIG_IGB=m
 CONFIG_IGB_HWMON=y
 CONFIG_IGB_DCA=y
-CONFIG_IGBVF=m
-CONFIG_IXGB=m
-CONFIG_IXGBE=m
-CONFIG_IXGBE_HWMON=y
-CONFIG_IXGBE_DCA=y
-CONFIG_IXGBE_DCB=y
-CONFIG_IXGBE_IPSEC=y
+# CONFIG_IGBVF is not set
+# CONFIG_IXGB is not set
+# CONFIG_IXGBE is not set
 CONFIG_IXGBEVF=m
 CONFIG_IXGBEVF_IPSEC=y
 # CONFIG_I40E is not set
@@ -2397,25 +2378,7 @@
 # CONFIG_JME is not set
 # CONFIG_NET_VENDOR_LITEX is not set
 # CONFIG_NET_VENDOR_MARVELL is not set
-CONFIG_NET_VENDOR_MELLANOX=y
-# CONFIG_MLX4_EN is not set
-CONFIG_MLX5_CORE=m
-# CONFIG_MLX5_FPGA is not set
-CONFIG_MLX5_CORE_EN=y
-CONFIG_MLX5_EN_ARFS=y
-CONFIG_MLX5_EN_RXNFC=y
-CONFIG_MLX5_MPFS=y
-CONFIG_MLX5_ESWITCH=y
-CONFIG_MLX5_BRIDGE=y
-CONFIG_MLX5_CLS_ACT=y
-CONFIG_MLX5_TC_SAMPLE=y
-CONFIG_MLX5_CORE_EN_DCB=y
-# CONFIG_MLX5_CORE_IPOIB is not set
-# CONFIG_MLX5_IPSEC is not set
-CONFIG_MLX5_SW_STEERING=y
-# CONFIG_MLX5_SF is not set
-# CONFIG_MLXSW_CORE is not set
-CONFIG_MLXFW=m
+# CONFIG_NET_VENDOR_MELLANOX is not set
 # CONFIG_NET_VENDOR_MICREL is not set
 # CONFIG_NET_VENDOR_MICROCHIP is not set
 # CONFIG_NET_VENDOR_MICROSEMI is not set
@@ -4076,11 +4039,9 @@
 # CONFIG_INFINIBAND_MTHCA is not set
 # CONFIG_INFINIBAND_EFA is not set
 # CONFIG_MLX4_INFINIBAND is not set
-CONFIG_MLX5_INFINIBAND=m
 # CONFIG_INFINIBAND_OCRDMA is not set
 # CONFIG_INFINIBAND_VMWARE_PVRDMA is not set
 # CONFIG_INFINIBAND_USNIC is not set
-# CONFIG_INFINIBAND_BNXT_RE is not set
 CONFIG_INFINIBAND_QEDR=m
 # CONFIG_INFINIBAND_RDMAVT is not set
 # CONFIG_RDMA_RXE is not set

Note: The config diffs above do not include the config settings merged in #2271, as the experiments were done before rebasing on top of the latest develop.
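
Long config diffs like the ones above are easier to review when condensed to one line per option. The following sketch does that for a unified diff read from stdin. The function name `summarize_config_diff` and the output format are hypothetical; `n` stands for "is not set" and `absent` means the option does not appear on that side of the diff.

```shell
#!/bin/bash
# Hypothetical helper: condense a unified diff of kernel configs into
# "<option> <old> -> <new>" lines, in order of first appearance.
summarize_config_diff() {
    awk '
        # Skip the ---/+++ file header lines of the unified diff.
        substr($0, 1, 4) == "--- " || substr($0, 1, 4) == "+++ " { next }
        /^[-+]/ {
            sign = substr($0, 1, 1)
            line = substr($0, 2)
            if (line ~ /^CONFIG_[A-Za-z0-9_]+=/) {
                split(line, kv, "=")
                name = kv[1]
                value = substr(line, length(name) + 2)
            } else if (line ~ /^# CONFIG_[A-Za-z0-9_]+ is not set$/) {
                split(line, words, " ")
                name = words[2]
                value = "n"
            } else {
                next
            }
            if (!(name in seen)) { seen[name] = 1; order[++n] = name }
            if (sign == "-") before[name] = value; else after[name] = value
        }
        END {
            for (i = 1; i <= n; i++) {
                name = order[i]
                b = (name in before) ? before[name] : "absent"
                a = (name in after) ? after[name] : "absent"
                print name, b, "->", a
            }
        }
    '
}
```

Applied to the diffs above, a line such as `CONFIG_SCSI y -> m` would show a driver moving from built-in to module in the non-metal kernel, and `CONFIG_MEGARAID_SAS y -> n` would show a driver being dropped.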

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.


@zmrow zmrow left a comment

Awesome!

🥇

bcressey commented Jul 16, 2022

> We see roughly a 30 to 40 second increase in the build times with the rebuild per variant. I would have expected more impact here, but probably there is things about buildsys and some caching effects I do not fully understand yet.

I'd guess that this is because a full kernel build only takes 30 to 40 seconds longer than the full os build that is triggered by changing the variant. Rust builds are ... not fast.

That's consistent with what I see on my ccache branch, which is that the kernel build speeds up by roughly 60 seconds, and finishes slightly before the os build, instead of slightly after as with your results.

@foersleo foersleo merged commit e7a1b22 into bottlerocket-os:develop Jul 19, 2022
@foersleo foersleo deleted the metal_cfg_split_build branch July 19, 2022 06:48