
Add feature azurerm_kubernetes_cluster_trusted_access_role_binding #871

Merged

Conversation

drew0ps (Contributor) commented Nov 7, 2024

Description of your changes

Adds azurerm_kubernetes_cluster_trusted_access_role_binding to the authorization provider.

Fixes #

I have:

  • Read and followed Crossplane's [contribution process].
  • Run make reviewable to ensure this PR is ready for review.
  • [-] Added backport release-x.y labels to auto-backport this PR if necessary.

Notes

Two notable things about this PR:

  • The external name is configured in the standard way; the referenced variable exists in the generated init parameters type file.
  • Added an additional resource configurator and its dependency rconfig so that the resource ends up in the authorization provider as opposed to the containerservice one (see the sketch below).
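
A minimal sketch of those two pieces, using upjet's config API; the template string, package layout, and variable names are illustrative, not the exact code in this PR:

    package authorization

    import "github.com/crossplane/upjet/pkg/config"

    // Illustrative external-name entry: the binding's Terraform `name` argument
    // is the external name, nested under the cluster ID in the Azure resource ID.
    var trustedAccessRoleBindingExternalName = config.TemplatedStringAsIdentifier(
        "name",
        "{{ .parameters.kubernetes_cluster_id }}/trustedAccessRoleBindings/{{ .external_name }}",
    )

    // Configure overrides the short group so the generated CRD lands in
    // authorization.azure.upbound.io rather than containerservice.
    func Configure(p *config.Provider) {
        p.AddResourceConfigurator("azurerm_kubernetes_cluster_trusted_access_role_binding",
            func(r *config.Resource) {
                r.ShortGroup = "authorization"
            })
    }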

How has this code been tested

  • I built the provider as a Docker image using make e2e.
  • I took the image to my company's test environment and deployed the following template:
          ---
          apiVersion: authorization.azure.upbound.io/v1beta1
          kind: TrustedAccessRoleBinding
          metadata:
            annotations:
              gotemplating.fn.crossplane.io/composition-resource-name: trustedaccessrolebinding
            name: "{{ $.observed.composite.resource.metadata.name }}-ta"
          spec:
            forProvider:
              kubernetesClusterIdRef:
                name: {{ .observed.composite.resource.metadata.name }}
              roles:
                - "Microsoft.DataProtection/backupVaults/backup-operator"
              sourceResourceId: "{{ .observed.resources.backupvault.resource.status.atProvider.id }}"
            providerConfigRef:
              name: "{{ $.observed.composite.resource.metadata.name }}" 
  • Then I took a look at the resources:
k get trustedaccessrolebindings.authorization.azure.upbound.io
NAME               SYNCED   READY   EXTERNAL-NAME      AGE
aks-cluster-3-ta   True     True    aks-cluster-3-ta   13m
aks-cluster-4-ta   True     True    aks-cluster-3-ta   13m
  • Then I described the resources:
Status:
  At Provider:
    Id:                     /subscriptions/redactedname/resourceGroups/redactedname/providers/Microsoft.ContainerService/managedClusters/redactedname/trustedAccessRoleBindings/redactedname-ta
    Kubernetes Cluster Id:  /subscriptions/redactedname/resourceGroups/redactedname/providers/Microsoft.ContainerService/managedClusters/redactedname
    Roles:
      Microsoft.DataProtection/backupVaults/backup-operator
    Source Resource Id:  /subscriptions/redactedid/resourceGroups/redactedname/providers/Microsoft.DataProtection/backupVaults/redactedname
  Conditions:
    Last Transition Time:  2024-11-07T17:43:35Z
    Reason:                Available
    Status:                True
    Type:                  Ready
    Last Transition Time:  2024-11-07T17:43:34Z
    Reason:                ReconcileSuccess
    Status:                True
    Type:                  Synced
    Last Transition Time:  2024-11-07T17:43:35Z
    Reason:                Success
    Status:                True
    Type:                  LastAsyncOperation
Events:
  Type    Reason                   Age   From                                                                           Message
  ----    ------                   ----  ----                                                                           -------
  Normal  CreatedExternalResource  48s   managed/authorization.azure.upbound.io/v1beta1, kind=trustedaccessrolebinding  Successfully requested creation of external resource 

@drew0ps drew0ps force-pushed the feature-az-trusted-identity branch from d112624 to 5a3af44 on November 7, 2024 18:08
@drew0ps drew0ps changed the title from “Feature az trusted identity” to “Add feature azurerm_kubernetes_cluster_trusted_access_role_binding” on Nov 7, 2024
@drew0ps drew0ps force-pushed the feature-az-trusted-identity branch 15 times, most recently from 15c97b5 to 676473f on November 10, 2024 22:13
mergenci (Collaborator) commented:

As far as I understand, the TrustedAccessRoleBinding and Workspace resources failed. Uptest output easily gets crowded, especially when there are multiple dependent resources and the tests are failing. I'm planning to work on a solution. Here are the relevant parts that I could find:

- apiVersion: authorization.azure.upbound.io/v1beta1
  kind: TrustedAccessRoleBinding
  status:
    conditions:
    - lastTransitionTime: "2024-11-10T22:36:23Z"
      message: 'cannot resolve references: mg.Spec.ForProvider.SourceResourceID: referenced
        field was empty (referenced resource may not yet be ready)'
      reason: ReconcileError
      status: "False"
      type: Synced
- apiVersion: machinelearningservices.azure.upbound.io/v1beta2
  kind: Workspace
  status:
    conditions:
    - lastTransitionTime: "2024-11-10T22:35:42Z"
      message: |-
        create failed: async create failed: failed to create the resource: [{0 creating Workspace (Subscription: "2895a7df-ae9f-41b8-9e78-3ce4926df838"
        Resource Group Name: "example-tarb-rg"
        Workspace Name: "example-tarb-wspace"): polling after CreateOrUpdate: polling failed: the Azure API returned the following error:

        Status: "BadRequest"
        Code: ""
        Message: "Soft-deleted workspace exists. Please purge or recover it. https://aka.ms/wsoftdelete"
        Activity Id: ""

        ---

        API Response:

        ----[start]----
        "***"
          "status": "Failed",
          "error": "***"
            "code": "BadRequest",
            "message": "Soft-deleted workspace exists. Please purge or recover it. https://aka.ms/wsoftdelete"
          "***"
        "***"
        -----[end]-----
          []"***"]
      reason: ReconcileError
      status: "False"
      type: Synced
    - lastTransitionTime: "2024-11-10T22:35:27Z"
      reason: Creating
      status: "False"
      type: Ready
    - lastTransitionTime: "2024-11-10T22:35:42Z"
      message: |-
        async create failed: failed to create the resource: [{0 creating Workspace (Subscription: "2895a7df-ae9f-41b8-9e78-3ce4926df838"
        Resource Group Name: "example-tarb-rg"
        Workspace Name: "example-tarb-wspace"): polling after CreateOrUpdate: polling failed: the Azure API returned the following error:

        Status: "BadRequest"
        Code: ""
        Message: "Soft-deleted workspace exists. Please purge or recover it. https://aka.ms/wsoftdelete"
        Activity Id: ""

        ---

        API Response:

        ----[start]----
        "***"
          "status": "Failed",
          "error": "***"
            "code": "BadRequest",
            "message": "Soft-deleted workspace exists. Please purge or recover it. https://aka.ms/wsoftdelete"
          "***"
        "***"
        -----[end]-----
          []"***"]
      reason: AsyncCreateFailure
      status: "False"
      type: LastAsyncOperation

drew0ps (Contributor, Author) commented Nov 11, 2024

@mergenci Thanks for the effort to help. That issue is clear: it's just because there is a soft-delete retention setting on the vault, which apparently can't be disabled and must be set to between 7 and 90 days, as per this doc.
The only way around it is changing the name until the deletion completes after 7 days.
The run I am curious about is this one, where everything seems to have succeeded but the pipeline still doesn't proceed: it complains about the vault without an error message.
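
(For reference, assuming the setting in question is Key Vault's soft_delete_retention_days, a minimal sketch of the corresponding upjet-generated field; the other values are placeholders:)

    apiVersion: keyvault.azure.upbound.io/v1beta1
    kind: Vault
    metadata:
      name: example-vault
    spec:
      forProvider:
        location: West Europe
        resourceGroupNameRef:
          name: example-rg
        skuName: standard
        tenantId: 00000000-0000-0000-0000-000000000000  # placeholder
        softDeleteRetentionDays: 7  # minimum of the allowed 7-90 day range; soft delete itself cannot be disabled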

mergenci (Collaborator) commented:

That one seems to be missing the Test status condition on kind: Vault, which usually means there was an update loop. If so, you can examine the pod logs to see if there are any “Diff detected” messages. As far as I know, the only indication of an update loop in Uptest logs is the timeout message:

case.go:363: command "${KUBECTL} wait vault.keyvault.azure.upbound.io/example-ai-v --for=condition=Test --timeout 10s" exceeded 7 sec timeout, context deadline exceeded
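
(A hypothetical way to run that log check, with an illustrative provider deployment name:)

    kubectl -n crossplane-system logs deployment/provider-azure-authorization | grep "Diff detected"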

turkenf (Collaborator) left a comment:

Thank you for your effort @drew0ps, I left two small comments.

Now, let's come to the reason why uptest fails: while creating the resources in the example YAML file, two extra resources are created, and during the deletion phase the ResourceGroup cannot be deleted because of these resources, so uptest fails:

    message: |-
      async delete failed: failed to delete the resource: [{0 deleting Resource Group "example-tarb-rg": the Resource Group still contains Resources.

      Terraform is configured to check for Resources within the Resource Group when deleting the Resource Group - and
      raise an error if nested Resources still exist to avoid unintentionally deleting these Resources.

      Terraform has detected that the following Resources still exist within the Resource Group:

      * `/subscriptions/2895a7df-ae9f-41b8-9e78-3ce4926df838/resourceGroups/example-tarb-rg/providers/microsoft.alertsmanagement/smartDetectorAlertRules/Failure Anomalies - example-ai-tarb`
      * `/subscriptions/2895a7df-ae9f-41b8-9e78-3ce4926df838/resourceGroups/example-tarb-rg/providers/microsoft.insights/actiongroups/Application Insights Smart Detection`

[Screenshot 2024-11-12 at 14:36:08 — the two leftover resources in the Azure portal]

Have you seen this in your manual tests? Frankly, at the moment I don't have much information about why these resources were created. Maybe we can test this manually (apply the YAML file, create and delete all the resources, without intervening from the console) and investigate the reason these two resources are created in your account.

@drew0ps drew0ps force-pushed the feature-az-trusted-identity branch from 9d2bc31 to 888aae2 on November 12, 2024 16:32
drew0ps (Contributor, Author) commented Nov 12, 2024

Hi @turkenf,

Thanks a lot for the response and your suggestions - I've added both of them to the example file.

The "resource exists with the same name" issue is present for all resources in the example file until the deletion topic is solved. I'm not sure if it would be possible to force delete the resource group in azure? In my opinion that would be a good approach since we could end up in this situation in different cases as well. I am not sure if the resources created by my pipeline are running indefinitely in Azure.

Unfortunately I can't test these specific examples manually as I could only test the AKS Trusted Access relation creation for the BackupInstanceKubernetesCluster at my workplace - where my user is only able to create some specific resources which I have to manage.

turkenf (Collaborator) commented Nov 12, 2024

The "resource exists with the same name" issue is present for all resources in the example file until the deletion topic is solved.

Hi @drew0ps, in fact, all resources except the resource group in the example YAML file are deleted. Just randomizing the name of the workspace resource, as you mentioned here, is enough.
As I mentioned above, the main problem here is the creation of the two extra resources, Application Insights Smart Detection and Failure Anomalies. These are not in our YAML file, and I think they are created by the API because something went wrong.
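
(If it helps: uptest can inject a random suffix through the ${Rand.RFC1123Subdomain} placeholder in example manifests. A minimal sketch, with the rest of the spec omitted:)

    apiVersion: machinelearningservices.azure.upbound.io/v1beta2
    kind: Workspace
    metadata:
      name: example-tarb-wspace-${Rand.RFC1123Subdomain}  # uptest substitutes a random string here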

I'm not sure if it would be possible to force delete the resource group in Azure?

I don't know if this is possible, but even if it were, I wouldn't choose it.

I am not sure if the resources created by my pipeline are running indefinitely in Azure.

I spent some time today cleaning these out of our test account 🙂

Unfortunately I can't test these specific examples manually as I could only test the AKS Trusted Access relation creation for the BackupInstanceKubernetesCluster at my workplace - where my user is only able to create some specific resources which I have to manage.

I understand and I really appreciate your interest, but I prefer not to proceed without understanding why the extra resources are being created and seeing if we can resolve this situation.

drew0ps (Contributor, Author) commented Nov 12, 2024

Hi @turkenf,

Thanks a lot for the additional insights - randomizing the workspace name makes sense given your explanation.

I spent some time today cleaning these out of our test account 🙂

Sorry about this one - I won't run the pipeline until I figure this out manually. I had thought that after 6 hours, when the pipeline ends (times out?), the resources were somehow force deleted.

... I prefer not to proceed without understanding why the extra resources are being created and seeing if we can resolve this situation.

Understood - I'll spend a bit more time testing manually to figure out why this happens and how to prevent it. The screenshot you added already helps a lot with this.

mjnovice commented:

👍

@drew0ps drew0ps force-pushed the feature-az-trusted-identity branch from 888aae2 to 3a79099 on November 15, 2024 13:28
drew0ps (Contributor, Author) commented Nov 15, 2024

/test-examples="examples/alertsmanagement/v1beta1/monitorsmartdetectoralertrule.yaml"

@drew0ps drew0ps force-pushed the feature-az-trusted-identity branch from 3a79099 to a3d59e1 on November 15, 2024 14:23
Signed-off-by: drew0ps <ad.marton@proton.me>
@drew0ps drew0ps force-pushed the feature-az-trusted-identity branch from a3d59e1 to c21c95d on November 15, 2024 15:15
drew0ps (Contributor, Author) commented Nov 15, 2024

/test-examples="examples/authorization/v1beta1/trustedaccessrolebinding.yaml"

drew0ps (Contributor, Author) commented Nov 15, 2024

Hi @mergenci and @turkenf - thanks a lot for the hints. The problem was some missing resources for Application Insights, which Azure creates by default if they are not defined in Crossplane; and of course, when they are not defined, Crossplane can't request their deletion. It's fixed now and the uptest pipeline is green.
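
(In sketch form, the fix declares the implicitly-created resources in the example so Crossplane owns them and can delete them. The kind matches the test command above; the fields are illustrative of upjet's usual camelCasing, not the exact manifest added:)

    apiVersion: alertsmanagement.azure.upbound.io/v1beta1
    kind: MonitorSmartDetectorAlertRule
    metadata:
      name: failure-anomalies-example-ai-tarb
    spec:
      forProvider:
        resourceGroupNameRef:
          name: example-tarb-rg
        detectorType: FailureAnomaliesDetector  # the rule Azure otherwise auto-creates
        severity: Sev3
        frequency: PT1M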

@jeanduplessis jeanduplessis merged commit 9682266 into crossplane-contrib:main Nov 15, 2024
9 checks passed
jeanduplessis (Collaborator) commented Nov 15, 2024

Thank you for your contribution and for persevering on this one, @drew0ps.
