Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[design doc] Secrets encryption key management subcommand #3407

Closed
brandond opened this issue Jun 4, 2021 · 10 comments
Closed

[design doc] Secrets encryption key management subcommand #3407

brandond opened this issue Jun 4, 2021 · 10 comments
Assignees
Labels
kind/enhancement An improvement to existing functionality
Milestone

Comments

@brandond
Copy link
Member

brandond commented Jun 4, 2021

Problem Statement

K3s currently has only a single command for managing secrets encryption - the --secrets-encryption flag. If this is turned on, a secrets encryption configuration document is generated using a random aescbckey key; this document is stored in the bootstrap data and shared among server nodes via the kube-apiserver --encryption-provider-config CLI flag. There are no provisions for rotating the key, disabling encryption after enabling it, etc.

Proposal

We want to expand this to allow for rotating the encryption key via a multi-step process:

  1. On one server run k3s secrets-encryption prepare which will do the following:
    1. Validate that all servers have the same encryption-configuration hash (via node annotation)
    2. Generate a new key
    3. Add the new key to the encryption configuration provider list in a non-primary position so that it can be used to decrypt but is not used for encryption
    4. Store the new configuration document to the bootstrap data.
  2. Restart k3s on all servers, this will write the new encryption configuration to disk and make it active for kube-apiserver
  3. On one server run k3s secrets-encryption rotate which will do the following:
    1. Validate that all servers have the same encryption-configuration hash (via node annotation)
    2. Move the new key into the first position in the provider list
    3. Store the new configuration document to the boostrap data
  4. Restart k3s on all servers, this will write the new encryption configuration to disk and make it active for kube-apiserver
  5. On one server run k3s secrets-encryption finalize which will do the following:
    1. Validate that all servers have the same encryption-configuration hash (via node annotation)
    2. Rewrite all secrets in the Kubernetes datastore in order to ensure that they are encrypted with the new key
    3. Remove the old key from the provider list
    4. Store the new configuration document to the bootstrap data
  6. Optionally, restart k3s on all servers in order to remove the old key from the active configuration

Additional Considerations

  • Should we support a k3s secrets-encryption status to display info about the current configuration - the the hash of the encryption configuration document in the boostrap data, the hash of the active configuration document on each node, the type and order of providers currently configured, etc?
  • Should we support k3s secrets-encryption disable (as an alternative to steps 1-4) that would move the identity encryption provider into the first position in the list?
  • How can we detect and properly handle or prevent running the same steps multiple times, or out of order?

Alternative Approaches

  • It was proposed that steps 3-6 should all be handled in a single operation, but I don't believe that this is currently possible to do in a way that works on both K3s and RKE2. RKE2 allows restarting the local kube-apiserver to reload configuration without restarting the entire supervisor, but this is not possible on K3s.
  • We discussed writing a standalone KMS binary that used the same aescbckey encryption code, but allowed for dynamic configuration reloading. This was dismissed as being excessively complicated.

Related Issues

@brandond brandond added the kind/enhancement An improvement to existing functionality label Jun 4, 2021
@brandond brandond added this to the v1.22 - Backlog milestone Jun 4, 2021
@brandond brandond changed the title [design doc] secrets encryption key management subcommand [design doc] Secrets encryption key management subcommand Jun 7, 2021
@brandond
Copy link
Member Author

brandond commented Jun 9, 2021

Talked through this in the design call and there was general agreement on the approach. There was some concern about this relying on a fix for #3015. It was suggested that we might try to find a better way to reload just the apiserver and/or other apiserver components on-demand in a way that doesn't require restarting the supervisor.

Would like to have @ibuildthecloud take a look at it as well before we start.

@vakrat1
Copy link

vakrat1 commented Aug 2, 2021

Hi
What are the chances that this issue will be part of version 1.22 (currently we see it is part of the Backlog list) ?
This issue is critical for us, since it is related to a security regulation our company would like to adhere.

Thank you in advance

@vakrat1
Copy link

vakrat1 commented Sep 12, 2021

Hi @brandond
I'd like to ask what are the chances this issue will impl for the next release of K3S

thank you

@cjellick
Copy link
Contributor

cjellick commented Oct 6, 2021

@brandond @dereknola - one thing to consider that we can discuss in design call tomorrow:

@galal-hussein said he can support users having their own custom encryption config file if the user puts that file in a certain place.

Would we ONLY support rotation if WE generated the file or could we support rotation against custom encryption config files?

@cjellick
Copy link
Contributor

cjellick commented Oct 6, 2021

from Bill: step 3 can cause load spikes and availability problems. it needs throttled.

@brandond
Copy link
Member Author

brandond commented Oct 6, 2021

I personally would not want to support rotation against random user-provided encryption configuration files; we essentially need to manage the whole file and the order of encryption providers it specifies across all servers in the cluster so that we can get the secrets reencrypted with the new token without preventing any node from being able to read them.

@cjellick
Copy link
Contributor

cjellick commented Oct 6, 2021

cool...that jives with how RKE1 behaves, fwiw. See: https://rancher.com/docs/rke/latest/en/config-options/secrets-encryption/#rotating-keys-with-the-rke-cli

rotation only supported with "managed" config.

@cjellick
Copy link
Contributor

cjellick commented Oct 7, 2021

From @Oats87 and @thedadams

  • Commands need to be idempotent
  • Approach generally works, keying off point is the node annotation hash
  • Rotating idempotency: what happens when you rotate a second time?

From @brandond on the finalize step: the UX doesnt imply the implementation. For example, the actual secret rekeying could be implemented via a job. Up to the implementor.

Documentation and even the CLI needs to very explicitly WARN the dangers of this command.

Let's stay away from a job for the actual rewriting of secrets. Requires stable cluster. Derek needs to go off and design this step and come back with it.

@cjellick
Copy link
Contributor

@dereknola this does need backported to 1.21, please open a backport issue

@dereknola
Copy link
Member

Closing this as the main feature is now being tracked by #4254

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement An improvement to existing functionality
Projects
None yet
Development

No branches or pull requests

6 participants