Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for ECS Agent parameters ImagePullBehavior and WarmPoolsSupport #2063

Merged
merged 1 commit into from
May 21, 2022

Conversation

mello7tre
Copy link
Contributor

@mello7tre mello7tre commented Apr 11, 2022

Issue number:
None.
I have looked in the backlog and found no task regarding new ecs-agent parameters, so i directly opened the PR.
Maybe WarmPoolsSupport can be used for #1788 (for the ECS part).

Description of changes:
Added support for ECS Agent parameters ImagePullBehavior and WarmPoolsSupport.
(relative to variables ECS_IMAGE_PULL_BEHAVIOR and ECS_WARM_POOLS_CHECK)

Can be configured using the keys:
settings.ecs.image-pull-behavior
settings.ecs.warm-pools-check

image-pull-behavior can have the values: default, always, once, prefer-cached.

Testing done:
executed cargo make unit-tests.
Launched Bottlerocket instance using builded ami with user-data:

[settings.ecs]
image-pull-behavior = "once"
warm-pools-check = true

verified that the content of file /etc/ecs/ecs.config.json included the new settings:

{
  "InstanceAttributes": {
    "bottlerocket.variant": "aws-ecs-1"
  },
  "PrivilegedDisabled": true,
  "AvailableLoggingDrivers": [
    "json-file",
    "awslogs",
    "none"
  ],
  "WarmPoolsSupport": true,
  "ImagePullBehavior": 2,
  "TaskIAMRoleEnabled": true,
  "TaskIAMRoleEnabledForNetworkHost": true,
  "SELinuxCapable": true,
  "OverrideAWSLogsExecutionRole": true,
  "TaskENIEnabled": true
}

Note: ImagePullBehavior is a iota.

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@zmrow
Copy link
Contributor

zmrow commented Apr 18, 2022

Hi @mello7tre ! Appreciate the contribution. We're looking at this - we'd love to figure out a good way to accomplish this for both ECS/EKS in a more generic way.

@kdaula kdaula added this to the 1.9.0 milestone May 2, 2022
@bcressey
Copy link
Contributor

bcressey commented May 3, 2022

Can you help me better understand your use case? Since the ECS agent just waits until the instance reaches the Ready state, what's actually happening behind the scenes to advance the state?

In #1788 @samjo-nyang proposed a method for pulling container images and then marking the instance as Ready. If you're using a similar host container approach, I'm not entirely clear on the value of that vs. just pulling the images in a bootstrap container before we even try to start ecs-agent.

@mello7tre
Copy link
Contributor Author

mello7tre commented May 3, 2022

Sure, the problem arise if you use a CapacityProvider [CP], start an instance out of the AutoscalingGroup [ASG] that the CP is using and later attach it to the ASG (this is what happen for warm pools).

I have opend a ticket on AWS Support too, but then i found out that they have already taken in account that case and solved it with the code triggered by the param ECS_WARM_POOLS_CHECK.

In detail:
Seem that when a new container instance is registered to an ECS cluster using a CP there is a process that just one time check if that instance belongs to the CP ASG, if not, no further check are executed, so when later we attach the container instance to the CP ASG, it's not taken in account by the CP.

The effect is that even if the container instance is registered in the cluster, services configured to use the capacity providers are unable to start on that instance.

So, the ECS_WARM_POOLS_CHECK, do not check if the instance is in a Ready state, but wait until the instance is part of an AutoscalingGroup before registering it with the ECS Cluster.

README.md Outdated Show resolved Hide resolved
sources/models/src/modeled_types/ecs.rs Outdated Show resolved Hide resolved
sources/api/ecs-settings-applier/src/ecs.rs Outdated Show resolved Hide resolved
sources/models/src/modeled_types/ecs.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@bcressey bcressey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This looks really close, just a few clean-up suggestions. Let me know if you want us to handle them.

sources/models/shared-defaults/defaults.toml Outdated Show resolved Hide resolved
sources/models/src/lib.rs Outdated Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
Copy link
Contributor

@zmrow zmrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks!

👋

@bcressey
Copy link
Contributor

Can you squash all the commits together into a single commit? I'd also still prefer to back out the changes to aws-k8s-* and aws-dev since the functionality isn't supported on those variants yet, but we can do that in a follow-up PR if needed.

Copy link
Contributor

@bcressey bcressey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

sources/models/shared-defaults/kubernetes-aws.toml Outdated Show resolved Hide resolved
@bcressey
Copy link
Contributor

CI failed for this reason:

error: the lock file /home/ec2-user/_work/bottlerocket/bottlerocket/sources/Cargo.lock needs to be updated but --locked was passed to prevent this

Looks like you'll need to add the modified Cargo.lock to the commit. It shouldn't be explicitly updated, or it will pull in a lot of other newer dependencies outside the scope of this PR. We just want the Cargo.lock with minor changes that you'd get from a clean checkout of your branch and running a sequence of commands like this:

cd sources
export VARIANT=aws-ecs-1
cargo test

If you can add that and squash together the commits again, I'll kick off the CI run.

@mello7tre
Copy link
Contributor Author

running cargo tree inside sources dir does the trick

@bcressey
Copy link
Contributor

Now the cargo fmt check is failing. You can run this to fix:

cargo fmt --manifest-path sources/Cargo.toml --message-format short --all

Copy link
Contributor

@bcressey bcressey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mello7tre
Copy link
Contributor Author

sorry for the troubles, next time (if any), i will remember to include Cargo.lock and to run format command.

Copy link
Contributor

@zmrow zmrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mello7tre
Copy link
Contributor Author

let me know if i need to rebase it on latest commits to resolve conflicts....

@bcressey
Copy link
Contributor

@mello7tre yes, please rebase this if you don't mind, and then we can try to get it merged before more conflicts appear.

…upport

They be configured using the keys:
settings.ecs.image-pull-behavior: (default*|always|once|prefer-cached)
settings.autoscaling.should-wait: (true|false*)

WarmPoolsSupport use a more generic key settings, as for advice of
@bcressey, this way it can be used, in future, for other variants.
At the moment `settings.autoscaling.should-wait` is used only for the
aws-ecs* variants.
@mello7tre
Copy link
Contributor Author

rebased

@bcressey bcressey merged commit 0184f4c into bottlerocket-os:develop May 21, 2022
@bcressey
Copy link
Contributor

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants