idealista/airflow-role

Apache Airflow Ansible role

This Ansible role installs an Apache Airflow server in a Debian/Ubuntu environment.

Getting Started

These instructions will get you a copy of the role for your Ansible playbook. Once launched, it will install Apache Airflow on a Debian or Ubuntu system.

Prerequisites ☑️

Ansible >= 2.9.9 installed (tested with 2.18). The inventory target should be a Debian (preferably Debian >= 10 Buster) or Ubuntu environment.

ℹ️ This role should work with older versions of Debian, but because of Airflow's minimum requirements you should first check that 🐍 Python 3.8 (or higher) is installed (👉 See: Airflow prerequisites).

ℹ️ By default this role uses the Python installation that ships with the distro.

For testing purposes: Molecule with Docker as the driver.

Installing 📥

Create or add to your roles dependency file (e.g. requirements.yml), from GitHub:

- src: http://github.com/idealista/airflow-role.git
  scm: git
  version: 3.0.0
  name: airflow

or using Ansible Galaxy as origin if you prefer:

- src: idealista.airflow_role
  version: 3.0.0
  name: airflow

Install the role with the ansible-galaxy command:

ansible-galaxy install -p roles -r requirements.yml -f

Use in a playbook:

---
- hosts: someserver
  roles:
    - { role: airflow }

Usage 🏃

See the role's defaults properties files for the available configuration properties:

❗Attention:❗

  • ⚠️ This version is no longer compatible with Apache Airflow 1.x versions.
  • ⚠️ Check out the new way to set airflow.cfg parameters in the airflow-cfg.yml file.

👉 Don't forget:

  • 🦸 To set your Admin user.
  • 🔑 To set the Fernet key.
  • 🔑 To set the webserver secret key.
  • 📝 To set your AIRFLOW_HOME and AIRFLOW_CONFIG at your own discretion.
  • 📝 To set your installation and config skeleton paths at your own discretion.
    • 👉 See airflow_skeleton_paths in main.yml
  • 🐍 To set the Python and pip version.
  • 📦 To set extra packages if you need additional operators, hooks, sensors...
  • 📦 To set required Python packages, pinned to a specific version where needed (SQLAlchemy, for example, to avoid known Airflow bugs❗️) or listed simply because they are necessary, as shown below.

📦 Required Python packages

airflow_required_python_packages should be a list following this format:

# This is an example of how to set the required python packages
airflow_required_python_packages:
  - { name: SQLAlchemy, version: major.minor.patch }
  - { name: psycopg2 }
  - { name: pyasn1 }

📦 Extra packages

airflow_extra_packages should be a list following this format:

# This is an example of how to set the extra packages
airflow_extra_packages:
  - apache.atlas
  - celery
  - ssh

👉 For more info about these extra packages see: Airflow extra packages
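
As an illustration, the two lists above can be passed to the role straight from a playbook. This is a minimal sketch, not the only supported layout: it assumes the role is installed under the name airflow (as in the requirements.yml examples) and that the variables are set as role-level vars; they could equally live in group_vars or host_vars. The host name and package choices are just the ones already used in this README, and `major.minor.patch` is a placeholder you must replace with a real version.

```yaml
---
- hosts: someserver
  roles:
    - role: airflow
      vars:
        # Airflow extras to install alongside the core package
        airflow_extra_packages:
          - celery
          - ssh
        # Additional Python packages; "version" is optional per entry
        airflow_required_python_packages:
          - { name: SQLAlchemy, version: major.minor.patch }  # placeholder, pin a real version
          - { name: psycopg2 }
```

Entries without a version key leave the version unpinned, as in the psycopg2 entry of the format example above.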

Testing 🧪

pipenv install -r test-requirements.txt --python 3.12

# Optional
pipenv shell  # if in shell just use `molecule COMMAND`

pipenv run molecule test  # To run role test
# or
pipenv run molecule converge  # To run play with the role

Built With 🏗️

Ansible

Versioning 🗃️

For the versions available, see the tags on this repository.

Additionally, you can see what changed in each version in the CHANGELOG.md file.

Authors 🦸

See also the list of contributors who participated in this project.

License 🗒️

Apache 2.0 License

This project is licensed under the Apache 2.0 license - see the LICENSE file for details.

Contributing 👷

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.