Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix import issue for Dataplex Datascan #8578

Merged
merged 5 commits into from
Aug 10, 2023

Conversation

saurabh-net
Copy link
Contributor

@saurabh-net saurabh-net commented Aug 8, 2023

Fix import issue for Datascan Dataprofile scans with existing jobs.
The TopN value message was incorrectly encoded as a map instead of as a array.

If this PR is for Terraform, I acknowledge that I have:

  • Searched through the issue tracker for an open issue that this either resolves or contributes to, commented on it to claim it, and written "fixes {url}" or "part of {url}" in this PR description. If there were no relevant open issues, I opened one and commented that I would like to work on it (not necessary for very small changes).
  • Ensured that all new fields I added that can be set by a user appear in at least one example (for generated resources) or third_party test (for handwritten resources or update tests).
  • Generated Terraform providers, and ran make test and make lint in the generated providers to ensure it passes unit and linter tests.
  • Ran relevant acceptance tests using my own Google Cloud project and credentials (If the acceptance tests do not yet pass or you are unable to run them, please let your reviewer know).
  • Read Write release notes before writing my release note below.

Release Note Template for Downstream PRs (will be copied)

dataplex: fixed a bug when importing `google_dataplex_datascan` after running a job

@modular-magician
Copy link
Collaborator

Hello! I am a robot. It looks like you are a community contributor. @hao-nan-li, a repository maintainer, has been assigned to review your changes. If you have not received review feedback within 2 business days, please leave a comment on this PR asking them to take a look.

You can help make sure that review is quick by doing a self-review and by running impacted tests locally.

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Breaking Change(s) Detected

The following breaking change(s) were detected within your pull request.

  • Field data_profile_result.profile.fields.profile.top_n_values MinItems went from 0 to 0 on google_dataplex_datascan - reference

If you believe this detection to be incorrect please raise the concern with your reviewer. If you intend to make this change you will need to wait for a major release window. An override-breaking-change label can be added to allow merging.

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 1 file changed, 14 insertions(+), 11 deletions(-))
Terraform Beta: Diff ( 1 file changed, 14 insertions(+), 11 deletions(-))

@saurabh-net
Copy link
Contributor Author

saurabh-net commented Aug 8, 2023

@hao-nan-li This is just a change in one of our output only fields.
I don't believe it will lead to any breaking changes.

(edited: didn't mean for this to be in bold :))


I don't actually think the Result message as a whole is useful for Terraform at all. If we want to just ignore the entire message, can we just choose not to specify the message in the YAML file?

For context, dataProfileResult shows the result of the latest Job run under a DataScan definition. I don't think this is ever useful in a Terraform context at all.

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2936
Passed tests 2634
Skipped tests: 302
Affected tests: 0

$\textcolor{green}{\textsf{All tests passed in REPLAYING mode.}}$
View the build log

Copy link
Contributor

@hao-nan-li hao-nan-li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Looks like this field is not marked as an output only field, so I may still consider it as a breaking-change.

We are currently preparing for a major release, breaking changes are accepted following these instructions https://googlecloudplatform.github.io/magic-modules/develop/make-a-breaking-change/#contributing-to-the-next-major-release-500.

@saurabh-net
Copy link
Contributor Author

In line 644, we mark the parent message as output: true.

Wouldn't that apply to all inner nested fields and messages?

@saurabh-net saurabh-net changed the base branch from main to FEATURE-BRANCH-major-release-5.0.0 August 8, 2023 18:25
@saurabh-net saurabh-net changed the base branch from FEATURE-BRANCH-major-release-5.0.0 to main August 8, 2023 18:29
@hao-nan-li
Copy link
Contributor

In line 644, we mark the parent message as output: true.

Wouldn't that apply to all inner nested fields and messages?

Sorry I missed the part from the parent.

On the other hand, could you take a quick look at https://googlecloudplatform.github.io/magic-modules/develop/make-a-breaking-change/#what-counts-as-a-breaking-change? Changing the output format of a field is part of it.

@saurabh-net
Copy link
Contributor Author

Ahh, okay, i'm comfortable modifying my PR and waiting for it to be released with the next major version. Let me update that by end-of-day


One question I have is, if I don't believe the message is ever useful, can I simply remove the entire parent message (name: 'dataProfileResult') entirely from the YAML spec?

I'm aware that this would also be a breaking change, I'd just rather keep the YAML spec simple where I can.

@hao-nan-li
Copy link
Contributor

Ahh, okay, i'm comfortable modifying my PR and waiting for it to be released with the next major version. Let me update that by end-of-day

One question I have is, if I don't believe the message is ever useful, can I simply remove the entire parent message (name: 'dataProfileResult') entirely from the YAML spec?

I'm aware that this would also be a breaking change, I'd just rather keep the YAML spec simple where I can.

For us, not having output:true field in the yaml file is always an option. Note that you can create PRs with breaking-change. Just need to merge it against the 5.0.0 major release branch. https://googlecloudplatform.github.io/magic-modules/develop/make-a-breaking-change/#contributing-to-the-next-major-release-500

@saurabh-net
Copy link
Contributor Author

Thanks for the confirmation. I've created a fresh pull request targeting the next major release.
Will close this one.

#8583

@saurabh-net
Copy link
Contributor Author

@hao-nan-li @c2thorn
I'm still working on removing the dataprofileresult and dataqualityresult fields outright in parallel to this.


Given that we have a customer blocked on us fixing this, I'd like to check again whether the fix in this PR should really be considered a breaking change.

Anyone currently trying to import a Dataprofile datascan would just run into TF crashing, i.e. they would not have any state in a different format like typical field format changes. There should not be any risk of incompatible states / formats stored anywhere.

I understand the general guidance around field format changes, but in this case:

  • There should be no breaking change for any existing customers
  • We would like to unblock users earlier than later
  • We are moving to GA as a product soon (order of weeks), and this would affect other users as well

To be clear, in this PR, I'd like to do the minimal change to unblock existing users. The larger change of removing dataprofileresult and dataqualityresult entirely is definitely safer to do in a new major version.

For additional context: #8583 (comment)

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Breaking Change(s) Detected

The following breaking change(s) were detected within your pull request.

  • Field data_profile_result.profile.fields.profile.top_n_values MinItems went from 0 to 0 on google_dataplex_datascan - reference

If you believe this detection to be incorrect please raise the concern with your reviewer. If you intend to make this change you will need to wait for a major release window. An override-breaking-change label can be added to allow merging.

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 1 file changed, 14 insertions(+), 11 deletions(-))
Terraform Beta: Diff ( 1 file changed, 14 insertions(+), 11 deletions(-))

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2939
Passed tests 2637
Skipped tests: 302
Affected tests: 0

$\textcolor{green}{\textsf{All tests passed in REPLAYING mode.}}$
View the build log

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR hasn't generated any diffs, but I'll let you know if a future commit does.

@c2thorn
Copy link
Member

c2thorn commented Aug 9, 2023

@saurabh-net Looks like ignore_read did not interact as I'd hoped. We can try a custom flattener to just return nil or something similar like in /~https://github.com/GoogleCloudPlatform/magic-modules/blob/main/mmv1/templates/terraform/custom_flatten/vertex_ai_index_ignore_contents_delta_uri.go.erb

The YAML looks like /~https://github.com/GoogleCloudPlatform/magic-modules/blob/main/mmv1/products/vertexai/Index.yaml#L110

This should be helpful as the crash was in the generated flattener.

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 1 file changed, 2 insertions(+), 20 deletions(-))
Terraform Beta: Diff ( 1 file changed, 2 insertions(+), 20 deletions(-))

@saurabh-net
Copy link
Contributor Author

saurabh-net commented Aug 9, 2023

Looks like the custom flatten did the trick :) (based on the diff)

@c2thorn
Copy link
Member

c2thorn commented Aug 9, 2023

Looks like the custom flatten did the trick :) (based on the diff)

great, let us know when you have been able to give a manual test a try. @hao-nan-li I can take over this PR at this point

@hao-nan-li
Copy link
Contributor

Looks like the custom flatten did the trick :) (based on the diff)

great, let us know when you have been able to give a manual test a try. @hao-nan-li I can take over this PR at this point

Thanks!

@hao-nan-li hao-nan-li requested a review from c2thorn August 9, 2023 22:27
@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2939
Passed tests 2637
Skipped tests: 302
Affected tests: 0

$\textcolor{green}{\textsf{All tests passed in REPLAYING mode.}}$
View the build log

@modular-magician
Copy link
Collaborator

Hi there, I'm the Modular magician. I've detected the following information about your changes:

Diff report

Your PR generated some diffs in downstreams - here they are.

Terraform GA: Diff ( 1 file changed, 2 insertions(+), 362 deletions(-))
Terraform Beta: Diff ( 1 file changed, 2 insertions(+), 362 deletions(-))

@saurabh-net
Copy link
Contributor Author

Ouch, looks like this affects all terraform operations (plan / apply) for Profile Datascans with any jobs created for them.

I locally tested it, and there were a few other fields which are incorrectly specified.
I've added a custom_flattener at the data_profile_result level now instead.

With that, it is working fine locally.

@modular-magician
Copy link
Collaborator

Tests analytics

Total tests: 2939
Passed tests 2637
Skipped tests: 302
Affected tests: 0

$\textcolor{green}{\textsf{All tests passed in REPLAYING mode.}}$
View the build log

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants