Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable AWS launch_template from nebari-config schema #2856

Merged
merged 3 commits into from
Nov 18, 2024

Conversation

viniciusdc
Copy link
Contributor

@viniciusdc viniciusdc commented Nov 15, 2024

Reference Issues or PRs

What does this implement/fix?

This PR disables the user-facing object from the nebari-config schema while still keeping the logic introduced initially in #2621.

Motivation

The #2842 attempted to disable the ami_id setting from the launch_template configuration under the node groups. However, this change caused Terraform deployment issues because the aws node_group relied on the ami_id attribute internaly, leading to a failed Nebari deployments when it was removed in the linked PR.

Additionally, we've encountered recent internal connection issues with the nodes using just the pre_bootstrap_command. These issues indicated that the current configuration for launch_template is not functioning correctly.

It also includes:

  • a custom error message in the schema validator for letting users know this is a temporary removal;
  • A better docstring to the auxiliary construct_aws_ami_type function since the choice for the CUSTOM instance type depends on the presence of both launch_template and ami_id.

Put a x in the boxes that apply

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds a feature)
  • Breaking change (fix or feature that would cause existing features not to work as expected)
  • Documentation Update
  • Code style update (formatting, renaming)
  • Refactoring (no functional changes, no API changes)
  • Build related changes
  • Other (please describe):

Testing

  • Did you test the pull request locally?
  • Did you add new tests?

How to test this PR?

  • init an aws config file
  • (optional) run a deployment;
  • Include a new node group with the launch_template field. This would simulate a scenario where the user has such a config already (for example):
      launch_template:
        # Replace with your custom AMI ID
        ami_id: ami-0abcdef1234567890
        # Command to run before the node joins the cluster
        pre_bootstrap_command: |
          #!/bin/bash
          # This script is executed before the node is bootstrapped
          # You can use this script to install additional packages or configure the node
          # For example, to install the `htop` package, you can run:
          # sudo apt-get update
          # sudo apt-get install -y htop"

Though, just adding launch_template will trigger the error.

  • run nebari render or nebari validate to trigger the load of the schemas. This should lead you to a ValueError

Any other comments?

@viniciusdc
Copy link
Contributor Author

Tested as follows:

This PR was tested in a previously deployed nebari instance on AWS that had node groups using the launch_template. When attempting the re-deployment from this version, the following (expected) message occurs:

image

As the launch_template configuration is heavily coupled with the instance type used for both normal nodes and GPU utilization, I also tested to make sure both could be fully utilized:

  • Usual node group with GPU showing correctly assigned AWS instance type
    gpu:
      instance: g4dn.xlarge
      min_nodes: 0
      max_nodes: 5
      gpu: true
      single_subnet: false
      permissions_boundary:

image

@viniciusdc viniciusdc marked this pull request as ready for review November 15, 2024 17:06
@viniciusdc
Copy link
Contributor Author

viniciusdc commented Nov 18, 2024

I am testing a fresh init and deploy just to be safe.

Update:
As shown above, deployments are working as expected, and the appropriate instance_types are been assigned. I also tested this afternoon a nebair init to double check new deployments will not be affected by the removal of the template schema options.

@viniciusdc viniciusdc merged commit f448892 into main Nov 18, 2024
26 checks passed
@viniciusdc viniciusdc deleted the disable-launch-templates branch November 18, 2024 23:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done 💪🏾
Development

Successfully merging this pull request may close these issues.

2 participants