-
Notifications
You must be signed in to change notification settings - Fork 95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add aws_launch_template & config vars to enable options to specify pre-bootstrap commands and/or custom AMI for EKS nodes #2621
Conversation
…ustom AMI Id for EKS nodes
Hi @joneszc thanks for opening the PR, I will review it later today. |
user_data = base64encode(<<-EOF | ||
MIME-Version: 1.0 | ||
Content-Type: multipart/mixed; boundary="==MYBOUNDARY==" | ||
|
||
--==MYBOUNDARY== | ||
Content-Type: text/x-shellscript; charset="us-ascii" | ||
%{ if var.node_prebootstrap_command != null }${var.node_prebootstrap_command}%{ endif } | ||
|
||
%{ if var.node_prebootstrap_command != null && var.node_groups[count.index].custom_ami != null }--==MYBOUNDARY==%{ endif } | ||
%{ if var.node_prebootstrap_command != null && var.node_groups[count.index].custom_ami != null }Content-Type: text/x-shellscript; charset="us-ascii"%{ endif } | ||
%{ if var.node_prebootstrap_command != null && var.node_groups[count.index].custom_ami == null }%{ else }#!/bin/bash%{ endif } | ||
%{ if var.node_prebootstrap_command != null && var.node_groups[count.index].custom_ami == null }%{ else }set -ex%{ endif } | ||
%{ if var.node_prebootstrap_command != null && var.node_groups[count.index].custom_ami == null }%{ else }B64_CLUSTER_CA=${aws_eks_cluster.main.certificate_authority[0].data}%{ endif } | ||
%{ if var.node_prebootstrap_command != null && var.node_groups[count.index].custom_ami == null }%{ else }API_SERVER_URL=${aws_eks_cluster.main.endpoint}%{ endif } | ||
%{ if var.node_prebootstrap_command != null && var.node_groups[count.index].custom_ami == null }%{ else }/etc/eks/bootstrap.sh ${aws_eks_cluster.main.name} --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL%{ endif } | ||
|
||
--==MYBOUNDARY==-- | ||
EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest moving this into /files
as a .tpl file and reading it using templatefile(path, vars)
, e.g :
base64encode(templatefile("${path.module}/files/user_data.tpl", {
node_prebootstrap_command = var.node_prebootstrap_command
node_groups = var.node_groups
aws_eks_cluster = aws_eks_cluster.main
count_index = count.index
cust_ami_node_index = local.cust_ami_node_index
}))
as this will become easier to maintain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've implemented the suggested modification with updated main.tf and added files/user_data.tftpl
@@ -118,6 +118,7 @@ class AzureInputVars(schema.Base): | |||
class AWSNodeGroupInputVars(schema.Base): | |||
name: str | |||
instance_type: str | |||
custom_ami: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to follow the aws standard naming, let's use ami_type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hello @viniciusdc,
in this case, the custom_ami
prompts the user to input an AMI Id, as opposed to the ami_type (e.g. "CUSTOM"). The corresponding aws var here is image_id
. My impression of image_id
is that it seems a bit ambiguous for nebari config purposes since it does not make it obvious that an ami is expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@viniciusdc how about using ami_image_id
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joneszc I see; I was under the impression ami_type
would support the ID of a custom AMI as well, but after your comment and re-reding the docs, that does not seem to be true (unfortunately). I agree with you that image_id would also be ambiguous, considering the overall shape of the nebari-config.
ami_type = var.node_groups[count.index].custom_ami != null ? "CUSTOM" : (var.node_groups[count.index].gpu == true ? "AL2_x86_64_GPU" : "AL2_x86_64") | ||
disk_size = var.node_prebootstrap_command == null && var.node_groups[count.index].custom_ami == null ? 50 : null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is nothing wrong with the way it is right now, but it would be better if "AL2_x86_64_GPU"
and "AL2_x86_64"
were set as defaults in the python schema itself, so that we could clean this conditionals a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@viniciusdc
ami_type
(i.e. AL2_x86_64_GPU
/AL2_x86_64
) is not fetched from nebari-config.yaml or otherwise processed in __init__.py
in either the current state of Nebari or in the scope of the proposed PR. Adding ami_type to the python schema would require a separate update.
…be read in from files/user_data.tftpl
I am leaving this as future ref. not to be implemented in this PR, we could also have something like this: class AMIConfig(BaseModel):
type: str = "CUSTOM"
id: Optional[str] = Field(default=None, description="AMI ID. Required if type is set to CUSTOM")
@RootValidator
def check_custom_ami_id(cls, values):
ami_type = values.get('type')
ami_id = values.get('id')
if ami_type == "CUSTOM" and not ami_id:
raise ValueError('AMI ID must be provided when type is CUSTOM.')
return values
class AWSNodeGroupInputVars(BaseModel):
name: str
instance_type: str
ami: Optional[AWSConfig] = None The benefit would be that we can later extend this to include the |
src/_nebari/stages/infrastructure/template/aws/modules/kubernetes/main.tf
Fixed
Show fixed
Hide fixed
@joneszc, So far, this looks good to me; I will do another final pass later this week so that we can include it as part of this month's release. We have this security alert above suggesting us to make the IMDS token a requirements for the launch_template metadata. I don't see any issues with enabling it as these templates are non-default, though I wonder if you could try that inclusion in your current test worspace |
oh btw, the endpoint_private_access = var.eks_endpoint_private_access
node_prebootstrap_command = var.node_prebootstrap_command
public_access_cidrs = var.eks_public_access_cidrs
permissions_boundary = var.permissions_boundary |
@viniciusdc is your review still in progress or is this work good to be merged? |
Hi @tylergraff I will be looking into that today. It all looks good to me already, I just want to do another pass in the main logic for conditioning the template resource itself. |
Is the failing local-integration test case something my team needs to look into? |
Hi @tylergraff
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @joneszc, I did another pass today and deployed this last Friday to make sure terraform didn't complain about any missing vars. All looks good. I just have a final suggestion regarding the user_data
contents
%{ if split_user_data == true }Content-Type: text/x-shellscript; charset="us-ascii"%{ endif } | ||
%{ if include_bootstrap_cmd == true }#!/bin/bash%{ endif } | ||
%{ if include_bootstrap_cmd == true }set -ex%{ endif } | ||
%{ if include_bootstrap_cmd == true }B64_CLUSTER_CA=${cluster_cert_authority}%{ endif } | ||
%{ if include_bootstrap_cmd == true }API_SERVER_URL=${cluster_endpoint}%{ endif } | ||
%{ if include_bootstrap_cmd == true }/etc/eks/bootstrap.sh ${cluster_name} --b64-cluster-ca $B64_CLUSTER_CA --apiserver-endpoint $API_SERVER_URL%{ endif } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these lines could be passed as part of the user's nore_prebootstrap_command
instead of being present in the template itself. I am okay with providing the variables by default, but I would prefer that any user's logic be passed down via "overwriting" or variables. Maybe we could have in the schema a user_data_template
wich could be a filepath
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@viniciusdc please implement the changes you recommend. Let me know if you do not have access to commit to the MetroStar fork.
@viniciusdc should this PR be closed given that the changes will be made in #2668? |
Closing because the functionality in this PR is duplicated in #2668, and this PR now has conflicts. |
Reference Issues or PRs
Fixes #2603
Fixes #2604
What does this implement/fix?
Put a
x
in the boxes that applyTesting
Any other comments?