Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify or clarify how to refactor an CDK codebase #4733

Closed
abelmokadem opened this issue Oct 29, 2019 · 19 comments
Closed

Simplify or clarify how to refactor an CDK codebase #4733

abelmokadem opened this issue Oct 29, 2019 · 19 comments
Assignees
Labels
@aws-cdk/core Related to core CDK functionality feature-request A feature should be added or improved.

Comments

@abelmokadem
Copy link
Contributor

This is just to start a discussion, but currently I'm finding it really hard to refactor an AWS CDK codebase because of the CloudFormation resource names. Is it possible to create a separation between the names and the structure of the code. I understand that this is at the core of aws-cdk and it allows us to create reusable constructs, but this also makes refactoring nearly impossible.

@abelmokadem abelmokadem added the needs-triage This issue or PR still needs to be triaged. label Oct 29, 2019
@abelmokadem abelmokadem changed the title Separate constructs from cloudformation names Simplify or clarify how to refactor an CDK codebase Oct 29, 2019
@eladb
Copy link
Contributor

eladb commented Oct 29, 2019

Can you elaborate why does it make refactoring impossible?

@abelmokadem
Copy link
Contributor Author

abelmokadem commented Oct 29, 2019

I think I need to clarify what I mean by refactoring. In this case I'm just talking about moving your resources into and out of constructs or breaking up a construct into smaller constructs. The end result of this refactoring process should be that cdk diff doesn't show any difference. That would be my ideal situation.

In the current situation resources are renamed, thus removed and new ones are created. Think of databases.

@orangewise
Copy link
Contributor

I agree, @abelmokadem has a valid point. Would be great if this was easier.

@eladb
Copy link
Contributor

eladb commented Oct 29, 2019

Thanks for the clarification. Unfortuntelly, we don't have good solution to this problem that will not require you to explicitly specify absolute logical IDs to all resources that you define, and as you observed, this will dramatically hinder composability and reusability.

But we can talk about a few things:

  1. You can always simply set logicalId in the defaultChild if you want to preserve a logical ID after a refactor. Obvsiouly that's not ideal, but it's an "escape hatch" that is available to you in case, for example, you don't want you DynamoDB table to be removed.
  2. We are looking at adding some more formal definition of "stateful resources" (#2282). This will give us a bit more understanding as to which resources in your stack are sensitive to replacement, and we could, at least, protect you from disastrous renames.

This has been something we have discussed for over a year as we designed the CDK, and possibly the biggest challange we had in designing the programming model. I can't say we have the perfect solution, but we could not come up with a better solution.

If anyone has ideas, we would love to hear from you, but we can't compromise compability. It's a key design tenet that makes CDK possible.

EDIT: added reference to #2282

@abelmokadem
Copy link
Contributor Author

abelmokadem commented Oct 29, 2019

I have somewhat of an idea. Please let me know if this absolutely can't work.

I also managed to get myself confused somewhere down the road, but I still think it could work. It could be a big change for aws-cdk, but for end users it could just be a matter of creating a stack with enableRefactor option set to true.

In short:

  • Use UUID as logicalId for all resources in the stack
  • Introduce cdk refactor --mapping <mapping_file> command to "refactor" your stack
  • Create a mapping file where you map the generated logicalId to the new cdk path

When running cdk deploy:

for resource in stack:
  // Update existing resource
  logicalId = find uuid for resource by current cdkPath

  if (!logicalId):
    // Create new resource
    logicalId = generate uuid
  
  addMetadata(cdkPath)

When running cdk refactor --mapping mapping.yaml:

for resource in stack:
  // "Refactor" existing resource to new cdk path
  logicalId = mappings[cdkPath]
  
  if (!logicalId)
    // Update existing resource
    logicalId = find uuid for resource by current cdkPath

  if (!logicalId):
    // Create new resource
    logicalId = generate uuid
  
  addMetadata(cdkPath)

@eladb
Copy link
Contributor

eladb commented Oct 29, 2019

Use UUID as logicalId for all resources in the stack

When will the UUID be generated? It has to be stable across synths, so we can't generate it at runtime.

@abelmokadem
Copy link
Contributor Author

abelmokadem commented Oct 29, 2019

@eladb it is generated only when cdk can't find the CDK path in the cloudformation template of the currently deployed stack.

cdk deploy

if any resource in deployed stack contains cdk path {
  use uuid of that resource from the deployed stack
} else {
  generate uuid
}

@eladb
Copy link
Contributor

eladb commented Oct 29, 2019

use uuid of that resource from the deployed stack

How do we know the uuid of a resource from the deployed stack? We need a stable address/identity in order to correlate between the resource in your code and the deployed resource.

@abelmokadem
Copy link
Contributor Author

You find the uuid by looking for the cdk metadata field. You can use this field to correlate between the resource that is in your code.

      "Metadata": {
        "aws:cdk:path": ".../Dns/StackHostedZone/Resource"
      }

If this path does not exists in both the mapping file and in the deployed stack, it means you are creating a new resource.

@eladb
Copy link
Contributor

eladb commented Oct 29, 2019

aws:cdk:path": ".../Dns/StackHostedZone/Resource

But the problem is that if you refactor your code, this path will change and you "lose" your connection to the actual resource.

@abelmokadem
Copy link
Contributor Author

abelmokadem commented Oct 29, 2019

That is what the mapping file is for. If you treat refactor as a separate action, then you can create a cdk refactor --mapping mapping.json command.

mapping.json

{
  "new cdk path": "existing uuid"
}

This action will simply update the cdk path metadata in cloudformation. That is all it should do if there are no changes to the stack but you moved a resource.

@abelmokadem
Copy link
Contributor Author

You could even consider making a part of cdk deploy to make things stateless. I haven't thought of how to do that. You would have to add additional metadata to keep track of which "refactor" was executed and which was not.

@moofish32
Copy link
Contributor

As another CDK user/contributor I think this mapping file really pushes against the grain of CDK and CloudFormation (CFN). In my mind CFN owns the stateful management, I feel like this mapping file starts to be an artifact I might need to preserve. For example, if you make a refactor and you need to promote this from dev to perf to prod, who validates this mapping file in each step and how long do I need to keep it? In VCS or out?

Counter proposal

If I take a little inspiration from state file management in other systems (see below for TLDR). I like the direction of a stateful boolean. In my head instead of stateful, I think about this as a "root" namespace where the rule of logical IDs are changed. In this namespace the logical ID is the result of the name/ID provided in the constructor and the hash of the properties that cause destroy/recreate (no longer is CDK path in the logical ID here and we have to manage name length). The user/consumer now owns the responsibility of ensuring unique names for the "root" namespace and if you change a property that causes a destroy/recreate the logical ID will clearly indicate that and diff will detect it.

state file management in other systems TLDR

If we look across the landscape this is why Terraform maintains state externally and it's also why if I corrupt my statefile it can be very hard and potentially impossible to restore. However, because the user accepts the responsibility we now have the ability manually manipulate the state file and move resources between modules.

@eladb
Copy link
Contributor

eladb commented Oct 29, 2019

@abelmokadem are you aware that we already have support for renaming logical IDs globally, exactly for the refactor use case. It's almost exactly like your mapping file, but in code (and of course, if you want you can load a rename map from a file).

@SomayaB SomayaB added @aws-cdk/core Related to core CDK functionality feature-request A feature should be added or improved. and removed needs-triage This issue or PR still needs to be triaged. labels Oct 29, 2019
@eladb
Copy link
Contributor

eladb commented Oct 31, 2019

For posterity, people can also completely override how resource logical IDs are allocated by subclassing Stack and overriding: allocateLogicalId.

I am closing this for now. Please reopen if you feel you don't have sufficient tools to be able to control your logical names in a refactor.

@eladb eladb closed this as completed Oct 31, 2019
@foriequal0
Copy link

After some refactoring, I have so many stack.renameLogicalId()s in my codebases, and I can't remove them to keep the resources. I wish there is some renaming deploy.

Here's a draft of the idea:

  1. Mark some resource elements with an API something like this
const subnet = new PrivateSubnet(this, "Private", { ... });
rename({ element: subnet.node.findChild("NATGateway"), oldId: "FooBarVpcSubnetNATGatewayABCD1234") })
  1. Check all the resources can be imported into existing stack.
    https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/resource-import-supported-resources.html
  2. Prepare following ChangeSets.
    1. PREPARE: All the resources in a template have old ID, with "DeletionPolicy": "Retain".
    2. REMOVE: Remove all marked resources from PREPARE template. All their references are replaced with their physical ID.
    3. REIMPORT: All the old ID is changed to the new ID from the PREPARE template. The ChangeSet's ResourcesToImport has new ID and physical ID pairs.
    4. REIMPORT_CLEANUP: Detach "DeletionPolicy": "Retain" from REIMPORT template.
    5. REMOVE_REVERT: The template is same with PREPARE. The ChangeSet's ResourcesToImport has old ID and physical ID pairs.
    6. REMOVE_REVERT_CLEANUP: Detach "DeletionPolicy": "Retain" from REMOVE_REVERT template.
  3. Execute ChangeSets in this order.
PREPARE -> REMOVE -> REIMPORT --Okay--> REIMPORT_CLEANUP
                            \--Error--> REMOVE_REVERT -> REMOVE_REVERT_CLEANUP
  1. Remove rename(...) calls in the code.

@abelmokadem
Copy link
Contributor Author

aws/aws-cdk-rfcs#162

@brendonparker
Copy link

brendonparker commented Jul 9, 2021

@eladb I know this is an old issue, but do you have an example we can reference for how to combine allocateLogicalId with renameLogicalId?

Much appreciated.

EDIT
Something like this? (in c#):

protected override string AllocateLogicalId(CfnElement cfnElement)
{
    if (cfnElement.ToString() == "App/DDB/TableEverything/Resource [AWS::DynamoDB::Table]")
    {
        RenameLogicalId(cfnElement.LogicalId, "TableEverything9827B373");
    }
    return base.AllocateLogicalId(cfnElement);
}

In my case, I did some refactoring which moved the DDB table into its own construct.
IDK if there is a "better" way to detect which CfnElement I'm interested in.

@brendonparker
Copy link

Hmm, I couldn't get that to work.

I was attempting to move a table from the stack contstuct into its own construct.
In the end, I did a cdk synth after the construct change/refactor and compared that logicalid in output with my past output.

Then added:

RenameLogicalId("DDBTableEverythingB1F30BC5", "TableEverything9827B373");

Where the first param is the logicalid generated by the most recent change, and second param is the historical logicalid.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/core Related to core CDK functionality feature-request A feature should be added or improved.
Projects
None yet
Development

No branches or pull requests

7 participants