Use Terraform for System Testing #9009

Closed
feluelle opened this issue May 25, 2020 · 11 comments

@feluelle
Member

feluelle commented May 25, 2020

Description

As @ashb, @potiuk and I already briefly discussed in a Slack thread:

We could use terraform for testing Airflow integrations with real services from providers.
Key benefits:

  • Write declarative configuration files
  • Plan and predict changes
  • Create reproducible infrastructure

I haven't thought about the implementation details (pytest + terraform) yet, but I think using terraform would be a lot easier than having to write a lot of system helper classes in Python.

EDIT:

Some interesting resources:

@boring-cyborg

boring-cyborg bot commented May 25, 2020

Thanks for opening your first issue here! Be sure to follow the issue template!

@potiuk
Member

potiuk commented May 26, 2020

Absolutely!

@szha
Member

szha commented May 31, 2020

HashiCorp recently added a clause to their terms of evaluation that forbids the usage of their enterprise software in China. While this may or may not affect this proposal, it does pose a potential legal risk to contributors in China.

Update:
HashiCorp revised its terms so that the clause only covers the enterprise version of Vault.

@turbaszek
Member

> I think it would make it a lot easier using terraform instead of having to write a lot of system helper classes in python.

In my opinion it depends... Is creating a single GCS bucket worth adding terraform? However, I fully support the idea of using terraform to set up a complex environment.

@potiuk
Member

potiuk commented Jun 1, 2020

Fully agree with Tomek. For setup/teardown that is quick and can be repeated for every test, using terraform makes no sense. But when the setup is longer, more complex and takes quite some time, it might definitely be worthwhile.

The usual scenario for that will be:

a) create the environment -> terraform script

b) iterate on the tests, alternating with

c) potentially changing the target state of the environment: updating and re-running the terraform script to apply the changes

d) tear down the environment (using the terraform script to delete the environment).

This already pretty much reflects our complex "helpers" for system tests. Basically, whenever we really need a helper, it seems that terraform scripts might do a better job.
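The create / iterate / update / tear-down cycle described above could be wrapped roughly as follows. This is only a sketch: `terraform_environment`, `run_terraform` and the injectable `runner` argument are hypothetical names, not existing Airflow helpers, and the sketch assumes the `terraform` binary is on the PATH and that `workdir` contains the `.tf` files.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, Sequence

# Hypothetical signature: (terraform arguments, working directory) -> None
Runner = Callable[[Sequence[str], str], None]


def run_terraform(args: Sequence[str], workdir: str) -> None:
    """Run a terraform subcommand in workdir and fail loudly on errors."""
    subprocess.run(["terraform", *args], cwd=workdir, check=True)


@contextmanager
def terraform_environment(
    workdir: str, runner: Runner = run_terraform
) -> Iterator[Callable[[], None]]:
    """Create the environment on entry and destroy it on exit."""

    def apply() -> None:
        runner(["apply", "-auto-approve", "-input=false"], workdir)

    runner(["init", "-input=false"], workdir)
    apply()  # (a) create the environment
    try:
        # (b)/(c) iterate on tests; call the yielded function again
        # after editing the .tf files to apply the new target state
        yield apply
    finally:
        runner(["destroy", "-auto-approve"], workdir)  # (d) tear down
```

Used as `with terraform_environment("path/to/infra") as reapply: ...`, the environment is destroyed even if a test inside the block raises.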

@feluelle
Member Author

feluelle commented Jun 3, 2020

So what I have in mind is that setUpClass automatically runs terraform init and applies the infra located in tests/providers/*/infrastructure, using the file that is named like the example dag file and ends with .tf, if it exists.

What we could also do is run terraform init for tests/providers/*/infrastructure only once for all example dags when the provider's system flag is used, for example --system amazon.
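A rough sketch of that idea, under several assumptions: the class name `TerraformSystemTest`, the `provider` and `example_dag` attributes and the `find_infrastructure` helper are all hypothetical, and the directory layout is taken literally from the comment above. Note that terraform loads every `.tf` file in its working directory, so a one-file-per-DAG layout would in practice probably need one subdirectory per example dag.

```python
import subprocess
import unittest
from pathlib import Path
from typing import Optional

INFRA_ROOT = Path("tests/providers")  # layout as proposed in this thread


def find_infrastructure(
    provider: str, dag_name: str, root: Path = INFRA_ROOT
) -> Optional[Path]:
    """Return the .tf file named after the example DAG, or None if absent."""
    candidate = root / provider / "infrastructure" / f"{dag_name}.tf"
    return candidate if candidate.is_file() else None


class TerraformSystemTest(unittest.TestCase):
    provider = ""     # hypothetical, e.g. "amazon"
    example_dag = ""  # hypothetical, e.g. the example DAG module name
    _tf_dir: Optional[Path] = None

    @classmethod
    def _terraform(cls, *args: str) -> None:
        subprocess.run(["terraform", *args], cwd=cls._tf_dir, check=True)

    @classmethod
    def setUpClass(cls) -> None:
        super().setUpClass()
        tf_file = find_infrastructure(cls.provider, cls.example_dag)
        cls._tf_dir = tf_file.parent if tf_file else None
        if cls._tf_dir:  # only touch terraform when a matching file exists
            cls._terraform("init", "-input=false")
            cls._terraform("apply", "-auto-approve", "-input=false")

    @classmethod
    def tearDownClass(cls) -> None:
        if cls._tf_dir:
            cls._terraform("destroy", "-auto-approve")
        super().tearDownClass()
```

One caveat of this design: unittest does not call tearDownClass when setUpClass raises, so a failed apply could leave partially created resources behind unless it is handled explicitly.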

> Is creating a single GCS bucket worth adding terraform?

@turbaszek in my opinion we should do it consistently for all example_dags. Do you think it would be slow or just superfluous? Or what is your concern?
Under the hood terraform is also using the CLIs we use to create resources for aws and google, right? So why not do it more safely and manageably using terraform? I think the biggest plus is that the infra automatically gets destroyed, so we do not accidentally let something exist longer than the test needs it.

@turbaszek
Member

> So what I have in mind is that setUpClass automatically runs terraform init and applies the infra located in tests/providers/*/infrastructure, using the file that is named like the example dag file and ends with .tf, if it exists.

I like this idea!

> Is creating a single GCS bucket worth adding terraform?

> @turbaszek in my opinion we should do it consistently for all example_dags. Do you think it would be slow or just superfluous? Or what is your concern?

My main concern is that normally I would never use terraform to create a single bucket. But it's not a strong opinion. I'm in favor of a unified and opinionated approach 👌

> I think the biggest plus is that the infra automatically gets destroyed, so we do not accidentally let something exist longer than the test needs it.

Theoretically tearDown should always be executed, so currently all infrastructure should also be destroyed. However, the advantage of using .tf is that everything we define will be both created and destroyed, so there is no room for discrepancies between setUp and tearDown.

feluelle self-assigned this Jun 4, 2020

@feluelle
Member Author

feluelle commented Jun 5, 2020

I have a question: when I add terraform to airflow and want to run it through docker, like it was done here, which volumes should I mount? I need to mount the .aws, .azure, etc. folders, but also the folders containing the .tf files. How should I do it? Mount the whole airflow dir?

WDYT? @potiuk @turbaszek :)

@potiuk
Member

potiuk commented Jun 5, 2020

Yeah. Airflow sources are best. But for that, you will need to create a HOST_AIRFLOW_SOURCES environment variable (similar to what I do with HOST_HOME) and mount it as /opt/airflow.

@potiuk
Member

potiuk commented Jun 5, 2020

And BTW, it would be great to also add this mapping -v ${HOST_AIRFLOW_SOURCES}:/opt/airflow to the other commands. We might want to upload stuff to gcloud/aws as well.
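To make the mounting suggestion concrete, here is a hedged sketch that assembles such a docker command. The function name `terraform_docker_command` is hypothetical, the `hashicorp/terraform` image and the `/root/.aws` path inside the container are assumptions, and only `HOST_AIRFLOW_SOURCES`, `HOST_HOME` and the `/opt/airflow` mount point come from the comments above.

```python
import os
from typing import List, Mapping, Sequence


def terraform_docker_command(
    args: Sequence[str],
    env: Mapping[str, str] = os.environ,
    image: str = "hashicorp/terraform",  # assumed image name
) -> List[str]:
    """Build a docker run command that mounts Airflow sources and AWS credentials."""
    sources = env["HOST_AIRFLOW_SOURCES"]  # host path of the Airflow checkout
    home = env["HOST_HOME"]                # host home directory with .aws etc.
    return [
        "docker", "run", "--rm",
        "-v", f"{sources}:/opt/airflow",   # .tf files live in the sources
        "-v", f"{home}/.aws:/root/.aws",   # assumed credentials location in the container
        "-w", "/opt/airflow",
        image,
        *args,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)`. Mounts for other credential folders such as `.azure` would follow the same pattern.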

@feluelle
Member Author

Terraform was introduced in #8877.

For further integration, i.e. automation of system tests, we can also use terraform, but that should be a separate issue.
