Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unit Tests for Compiler #1624

Closed

Conversation

alanmcruickshank
Copy link
Contributor

This resolves issue #1603

Add the relative-ref functionality.

  • Either inline: select * from {{ ref('root_table_a', relative=True) }}
  • Through config: 'relative_ref': True

@drewbanin @beckjake I've run this through the unit tests (and added a load more unit tests for the compiler so that we can test other functionality), but I've not run it through the full test suite because I don't have access to all the infrastructure. Any chance one of you guys could kick this off through the full test suite to check it's all good?

@alanmcruickshank
Copy link
Contributor Author

@beckjake @drewbanin . The tests are failing on one single postgres test, which I don't quite understand. I've just pushed some extra debug logging to see if that lends some more clarity. It looks like something unexpected is being passed through in the new calling_model kwarg (which I introduced so that the relation can work out how close it is to it's called), during the TestOperations.test__postgres_vacuum_ref test.

@alanmcruickshank
Copy link
Contributor Author

@beckjake - ok, postgres tests are passing now, I've ignored macros when considering whether to use relative referencing. Can you guys think of any cases where that might cause issues?

Can we test the changes against the other integration targets?

@drewbanin
Copy link
Contributor

Hey @alanmcruickshank - I was writing a followup on the issue thread, but you beat me to the punch with this PR!

I just want to say: the code you wrote here looks really slick! I applaud you biasing towards action, and I'm super impressed that you were able to get something working at all in one of the more complex parts of the dbt codebase!

I'll also say: my instinct is that getting configs involved here is probably more trouble than it's worth. ref can be used in all sorts of places like macros, operations, snapshots, custom data tests, and schema tests. For these resources, dbt's config scoping context isn't super well defined. We should fix that (#1255), but there's really a whole lot think about in order to serve this particular use case! I think this might be the issue you're seeing with that failing TestOperations.test__postgres_vacuum_ref test, but I'd need to dig in further to say for sure.

I also think that the workflow you're making possible with this code is super Snowflake-specific, so it might turn out that create_from_node isn't the best place for this particular logic. While the check you've implemented for overlapping database/schema names is definitely the way to go on Snowflake, I think that's probably not the right way to accomplish a similar task on postgres/redshift/bigquery.

So, in all, my instinct is to make this functionality addressable in user-space (or at least, plugin-space). I think a good path forwards involves letting users implement their own ref macros (or similar) which call dbt's implementation of ref under the hood. I think this maximizes the flexibility for end-users and minimizes the amount of database-specific knowledge that we need to encode in dbt-core. That's just the way I've been thinking about this though - super happy to explore other ideas here too!

@beckjake definitely feel free to chime in if you have any strong thoughts - I don't need to have the last word here - I'm very happy to be wrong, now and always :)

Thanks again for the really slick PR @alanmcruickshank - looking forward to hearing what you think!

@alanmcruickshank
Copy link
Contributor Author

Interesting idea. I hear the point about it being snowflake specific, so perhaps to make it pluginnable.

I'd rather patch the ref function, than go down the macros route, but I'll take a look at the plugin architecture to see what could be possible...

Have you got any examples in the existing codebase where we patch things for specific adapters?

@alanmcruickshank
Copy link
Contributor Author

Heya guys, I've moved the logic out of the core app, and into the snowflake plugin. That means I've also separated the tests. I'm in the process of setting up some integration tests with our snowflake credentials to check that all that works in practice too.

In the mean time:

  • Although it's in a plugin, the logic still lives in the relation in a subclassed version of create_from_node. I can't think of a better place to put the logic, because I need the details of both nodes (the calling and caller) so that I can find out how close they are. Does this work for you guys or do we want to move it even further up the pile?
  • I still rely on dbt.context.runtime.create_relation to pass the calling_model down to create_from_node, which is still additional functionality in the core of dbt. Does having this accessible fit with what you're architecturally thinking?

The only other option I could think of would be patching the context for the snowflake adapter, but that sounds like it just makes the problems from #1255 even worse.

@drewbanin
Copy link
Contributor

Thanks @alanmcruickshank - I'll take a look at this later today!

@alanmcruickshank
Copy link
Contributor Author

Thanks @drewbanin - maybe don't merge it yet. The integration tests have brought up some more wrinkles. Would still appreciate your thoughts on how it's architecturally put together.

@drewbanin
Copy link
Contributor

Hey @alanmcruickshank - you mentioned above:

I'd rather patch the ref function, than go down the macros route, but I'll take a look at the plugin architecture to see what could be possible...

Can you say more about why you'd prefer not to go down the macros route? I really think that letting users override ref (or, creating an alternative macro like relative_ref) is going to be the most flexible answer here. I don't think that will be easy per se -- there's probably a lot of little things for us to consider, but ultimately, I think this is one place where letting users write code (a la generate_schema_name) is a good idea.

In the interest of being very direct: I'd have a really hard time merging code that looks approximately like what you've pushed up here. I think these commits are really nice-looking, thoughtful, and well-architected, but I also think that your general approach here does not represent the "dbt way" of doing things. I am very unopinionated on the code quality aspect of this discussion -- I care a lot more about the experience for end-users of dbt.

Please feel very free to keep pushing commits (I'm happy to keep taking a look at them) but I wanted to give you a super clear idea of my current thinking around this PR. Let me know your thoughts around leveraging macros here. I'll give this some more though too.

Looking forward to discussing!

@alanmcruickshank
Copy link
Contributor Author

alanmcruickshank commented Jul 24, 2019

@drewbanin I'll take a look at the macros. The reason I didn't want to go down that route initially was that introducing a new command (e.g. relative_ref) introduces a lot of overhead for people using dbt day to day, and hurts the experience that I think both of us are keen to preserve. For almost all of our users, whether or not relative referencing happening won't be something they will appreciate or need to think about.

Being able to patch things by default like generate_schema_name wasn't something I was aware of - but I'll look at using it to directly patch ref. Given I wasn't aware that this kind of functionality existed - where should we document it? I'm happy to brain dump what I learn about it while trying to extend it if you point me in the right direction.

I've amended the pull request to just include the unit tests for the compiler - which are hopefully still of use. Assuming they pass, can you merge them so they don't go stale? I'll submit the macro extensions as a seperate PR.

@alanmcruickshank alanmcruickshank changed the title Relative Refs Unit Tests for Compiler Jul 24, 2019
@alanmcruickshank
Copy link
Contributor Author

I've renamed this PR, and I'm going to move the discussion back to the original issue #1603

@drewbanin
Copy link
Contributor

Hey @alanmcruickshank - this PR is pretty outdated anymore. I'm going to close it out, but happy to reopen if there's still merit to getting these unit tests merged! I think though that these cases are adequately tested by other parts of the existing test suite.

@drewbanin drewbanin closed this Jan 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants