Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k8s static pods support #1317

Merged
merged 4 commits into from
Feb 18, 2021
Merged

Conversation

etungsten
Copy link
Contributor

@etungsten etungsten commented Feb 11, 2021

Issue number:
N/A

Description of changes:

Author: Erikson Tung <etung@amazon.com>
Date:   Tue Feb 9 13:20:22 2021 -0800

    settings, static-pods: new settings and helper for k8s static pods
    
    Adds a set of new settings for managing k8s static pods.
    Adds a new binary helper to manage k8s static pod manifests.

Author: Erikson Tung <etung@amazon.com>
Date:   Tue Feb 9 13:27:48 2021 -0800

    aws-k8s-*: specify 'staticPodPath' in kubelet config
    
    Specify pod manifest path in kubelet config for all kubernetes variants.

Author: Erikson Tung <etung@amazon.com>
Date:   Fri Feb 12 12:17:29 2021 -0800

    pluto: only build for kubernetes variants
    
    We only need to build pluto for kubernetes variants

Author: Erikson Tung <etung@amazon.com>
Date:   Fri Feb 12 13:23:43 2021 -0800

    Makefile: conditionally exclude packages from cargo test
    
    In tasks.unit-test, we need to conditionally exclude variant specific
    packages from being compiled if the variant does not match up

Author: Erikson Tung <etung@amazon.com>
Date:   Tue Feb 16 10:41:34 2021 -0800

    migrations: add `add-static-pods` migration
    
    Adds a new migration for new k8s static-pods settings

Testing done:

Verified that buildsys does not build the package for non-k8s variants.

Build aws-k8s-1.15, aws-k8s-1.16, aws-k8s-1.17, aws-k8s-1.18, aws-k8s-1.19 images and launched instances for all of them

Kubelet started up fine.

Then I specified a static pod that ran nginx on port 80 with host networking:

apiclient set kubernetes.static-pods.control-plane-stuff.manifest=YXBpVmVyc2lvbjogdjEKa2luZDogUG9kCm1ldGFkYXRhOgogICBuYW1lOiBzdGF0aWMtcG9kLW5naW54LXRlc3QKc3BlYzoKICAgaG9zdE5ldHdvcms6IHRydWUKICAgY29udGFpbmVyczoKICAgICAgLSBuYW1lOiBuZ2lueAogICAgICAgIGltYWdlOiBuZ2lueDoxLjE0LjIKICAgICAgICBwb3J0czoKICAgICAgICAgICAtIGNvbnRhaW5lclBvcnQ6IDgwCg==
apiclient set kubernetes.static-pods.control-plane-stuff.enabled=true

The base64 blob decodes to the following:

apiVersion: v1
kind: Pod
metadata:
   name: static-pod-nginx-test
spec:
   hostNetwork: true
   containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
           - containerPort: 80

Observed that the manifest gets created successfully in the pod manifest path:

bash-5.0# cat /etc/kubernetes/static-pods/control-plane-stuff 
apiVersion: v1
kind: Pod
metadata:
   name: static-pod-nginx-test
spec:
   hostNetwork: true
   containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
           - containerPort: 80

Then curling localhost:80 from within the admin container I can see nginx is running as expected:

[ec2-user@ip-192-168-3-187 ~]$ curl localhost:80
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>
<style>
    body {
        width: 35em;
        margin: 0 auto;
        font-family: Tahoma, Verdana, Arial, sans-serif;
    }
</style>
</head>
<body>
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href="http://nginx.org/">nginx.org</a>.<br/>
Commercial support is available at
<a href="http://nginx.com/">nginx.com</a>.</p>

<p><em>Thank you for using nginx.</em></p>
</body>
</html>
[ec2-user@ip-192-168-3-187 ~]$ 

Checking pods via kubectl, I can see mirror pod entries in the k8s API:

$ kubectl get pods -A
NAMESPACE      NAME                                                                READY   STATUS    RESTARTS   AGE
default        static-pod-nginx-test-ip-192-168-3-187.us-west-2.compute.internal   1/1     Running   0          6m16s
default        static-pod-nginx-test-ip-192-168-8-47.us-west-2.compute.internal    1/1     Running   0          20m
....

Disabling the static pod by setting enabled to false also works as expected. The manifest file gets correctly deleted.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@etungsten etungsten changed the title Static pods k8s static pods support Feb 11, 2021
@etungsten etungsten marked this pull request as draft February 11, 2021 00:48
@etungsten

This comment has been minimized.

@etungsten etungsten force-pushed the static-pods branch 3 times, most recently from 571d157 to c5ee2a4 Compare February 11, 2021 04:05
@etungsten etungsten marked this pull request as ready for review February 11, 2021 05:28
@etungsten
Copy link
Contributor Author

Push above adds some documentation to the settings section of the top-level README.

README.md Outdated Show resolved Hide resolved
packages/os/os.spec Outdated Show resolved Hide resolved
sources/api/static-pods/src/main.rs Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
packages/release/release.spec Outdated Show resolved Hide resolved
packages/os/os.spec Show resolved Hide resolved
sources/api/static-pods/src/main.rs Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/static_pods.rs Show resolved Hide resolved
Copy link
Contributor

@zmrow zmrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎼

Copy link
Contributor

@srgothi92 srgothi92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice 🎉

Just out of curiosity and may not be relevant at all to the changes.
I was wondering what would happen if someone sends incorrect manifest file ? I think Kubernetes would not launch a pod for it. However, how can someone debug such cases.

@etungsten
Copy link
Contributor Author

Nice tada

Just out of curiosity and may not be relevant at all to the changes.
I was wondering what would happen if someone sends incorrect manifest file ? I think Kubernetes would not launch a pod for it. However, how can someone debug such cases.

If a user sends a poorly formatted manifest file, kubelet would just ignore that manifest. To debug, by virtue of not seeing the pod come up, the user would need to recheck their manifest. If they would like to see kubelet's log, then they'd have to go through logdog.

@etungsten etungsten requested a review from bcressey February 12, 2021 21:26
@etungsten
Copy link
Contributor Author

etungsten commented Feb 12, 2021

The force push from fda4739 to e3e9e1a addresses @zmrow and @bcressey 's comments.
a4b0608 above fixes the build check by modifying tasks.unit-test to conditionally exclude variant specific packages if the variant isn't aligned.

Tested and things work as expected.

@etungsten
Copy link
Contributor Author

I'll remove the conditional compilation from ecs-settings-applier and do the same for it in tasks.unit-tests in a separate PR.

README.md Outdated Show resolved Hide resolved
@etungsten
Copy link
Contributor Author

Push above addresses #1317 (comment)

@etungsten
Copy link
Contributor Author

Push above rebases onto develop to fix merging conflicts.

README.md Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
sources/api/static-pods/src/main.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/main.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/main.rs Outdated Show resolved Hide resolved
sources/api/static-pods/src/main.rs Outdated Show resolved Hide resolved
@etungsten
Copy link
Contributor Author

d409af8adac486f904bc4216bdf8328fa2a31bb9 adds the migration for migrating the new static-pods settings.
The force push above addresses @samuelkarp 's comments.

Tested the migration and it works as expected on downgrades and upgrades.

Comment on lines +19 to +20
"static-pods",
"pluto"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Not now, but I wonder if there's a k8s.spec and k8s package for these in our future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or a k8s-release - but pluto is AWS-specific where static-pods is not.

Makefile.toml Outdated
Comment on lines 204 to 208
# Exclude Kubernetes specific packages from being built if the variant isn't Kubernetes specific
if ! echo "${VARIANT}" | grep -q "k8s"; then
EXCLUDE="--exclude static-pods pluto"
fi

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels so far away from other conditionals. :(
I think this also means that I can no longer run:

cd sources
export VARIANT=aws-ecs-1 && cargo build

Which makes me sad (and might have IDE ramifications, not sure). Are we sure this is how we want to do this?

Copy link
Contributor Author

@etungsten etungsten Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There has to be a conditional somewhere for this unit test task to work. Either we continue conditionally compiling variant specific binaries, or we do this.

Copy link
Contributor

@webern webern Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I realize that's the problem we're solving for. With ecs-settings-applier, the build.rs exports a cfg and the 'guts' are not compiled unless its ecs.

The problem with your approach is that the rust workspace is no longer usable on its own and we're moving toward a reliance on the makefile for things to work at all.

Both solutions are undesirable, so I'm not demanding a change here, but inviting discussion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for jumping in between the conversation.

cd sources
export VARIANT=aws-ecs-1 && cargo build

@webern Why won't this work ? This conditional is not inside sources dir packages right ? Not able to figure out how it affects.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming that the reason @etungsten added this to the makefile is because the pluto and static-pods packages will fail to compile unless the model package is set to use k8s settings. If we try to compile the workspace for ECS, the model will have the wrongs structs and our build will fail.

Copy link
Contributor Author

@etungsten etungsten Feb 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically tasks.unit-tests will fail because it just compiles everything in the workspace with the same VARIANT env var.

Regarding the cd sources; export VARIANT=aws-ecs-1 && cargo build use case. I feel like that's somewhat unusual. What reasons are there for people wanting to build the entire workspace without going through the Makefile? I would understand if people wanted to build specific workspace components for testing or w.e. But if they want to build everything in the workspace for use, shouldn't they just go through building the OS package via the Makefile?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm OK with your original conditional compilation for static-pods if it makes the overall workspace easier to build. I like the property that these crates can be built without bringing the entire build system to bear.

I'd prefer to get to the point where the entire sources tree can be built for each model simultaneously, and although the Makefile.toml workaround doesn't preclude that, it may invite us to continue segmenting the tree in ad-hoc ways rather than driving toward that goal.

Copy link
Contributor

@webern webern Feb 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use the workspace without bringing the build system to bear all the time. I'm not sure how to describe the use case, but having a workspace that can't be compiled with cargo build or cargo test seems to defeat the purpose of a workspace. Right now the only "gotchya" is that the VARIANT env is needed, and the error messages document this very well (I often forget, read the error then export a VARIANT).

I think the options are:

  1. Merge this as is with the Makefile.toml changes, and no longer can we cargo build the workspace (without adding the --exclude statements depending on VARIANT), or
  2. Change to the cfg solution that we used with ecs-settings-applier so that we can always build all of the workspace, but the incompatible code is 'commented out'.

@bcressey are you saying that you choose #1 and that, because the workspace no longer builds, we may be more motivated to fix it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll revert back to conditionally compiling static-pods since that's what people seem to prefer so far.

Copy link
Contributor

@bcressey bcressey Feb 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bcressey are you saying that you choose #1 and that, because the workspace no longer builds, we may be more motivated to fix it?

No, I was agreeing with you and I prefer #2.

sources/api/static-pods/src/main.rs Outdated Show resolved Hide resolved
@etungsten
Copy link
Contributor Author

etungsten commented Feb 16, 2021

Push above addresses @webern 's comment #1317 (comment)

Revamps arg parsing error handling. Still works as expected:

etung in api/static-pods/debug on 🌱 static-pods [$!?] 
$ ./static-pods --log-level asdasd
Invalid log level 'asdasd'
Usage: ./static-pods
            [ --socket-path PATH ]
            [ --log-level trace|debug|info|warn|error ]

    Socket path defaults to /run/api.sock
etung in api/static-pods/debug on 🌱 static-pods [$!?] 
$ ./static-pods --socket-path
Did not give argument to --socket-path
Usage: ./static-pods
            [ --socket-path PATH ]
            [ --log-level trace|debug|info|warn|error ]

    Socket path defaults to /run/api.sock
etung in api/static-pods/debug on 🌱 static-pods [$!?] 
$ ./static-pods --log-level
Did not give argument to --log-level
Usage: ./static-pods
            [ --socket-path PATH ]
            [ --log-level trace|debug|info|warn|error ]

    Socket path defaults to /run/api.sock

@etungsten etungsten requested a review from webern February 16, 2021 21:37
Comment on lines 50 to 58
ensure!(
code.is_success(),
error::APIResponse {
method,
uri,
code,
response_body,
}
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you actually need this? Both apiclient::raw_request and schnauzer::get_settings (through schnauzer::get_json) have this check already

@etungsten
Copy link
Contributor Author

etungsten commented Feb 17, 2021

Push above addresses @samuelkarp @arnaldo2792's comment.
Use schnauzer instead of raw api requests.
Cleans up a bunch of unused errors, oops

Things still work as expected.

@etungsten
Copy link
Contributor Author

Push above sheds a few unnecessary imports.

Adds a set of new settings for managing k8s static pods.
Adds a new binary helper to manage k8s static pod manifests.
Specify pod manifest path in kubelet config for all kubernetes variants.
We only need to build pluto for kubernetes variants
Adds a new migration for new k8s static-pods settings
@etungsten
Copy link
Contributor Author

Push above reverts back to conditionally compiling static-pods based on the VARIANT env var. Drops changes to tasks.unit-tests.

Copy link
Contributor

@webern webern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happier with the cfg approach than the Makefile approach for conditional compilation, but I'll survive either way :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants