elt-data-pipeline

A containerized ELT pipeline: Airflow extracts data from a source PostgreSQL database and loads it into a destination database, where dbt transforms and models it.

Steps to run:

  1. Clone the repository: git clone /~https://github.com/liumOazed/elt-data-pipeline.git
  2. Make sure Docker is installed; run docker -v to verify.
  3. (Optional) Edit or add dbt models under custom_postgres/models/example to suit your needs.
  4. Run the ./start.sh script to start all the containers and the project.
  5. Open the Airflow UI at localhost:8080 to trigger the DAG and monitor its runs (or trigger it from the command line via Airflow's REST API, as sketched below).
  6. When you are done, run ./stop.sh to shut everything down.
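
If you prefer the command line to the web UI in step 5, Airflow's stable REST API can start a run. This is a minimal sketch, assuming the basic-auth API backend is enabled with the default airflow/airflow credentials (as in Airflow's standard Docker setup); the DAG id elt_pipeline is a placeholder for whatever id appears in your Airflow UI.

```python
# Trigger a DAG run via Airflow's stable REST API (Airflow 2.x).
# The credentials and DAG id below are assumptions -- substitute your own.
import requests

resp = requests.post(
    "http://localhost:8080/api/v1/dags/elt_pipeline/dagRuns",  # hypothetical DAG id
    auth=("airflow", "airflow"),  # default credentials; change if yours differ
    json={"conf": {}},
)
resp.raise_for_status()
print("Started run:", resp.json()["dag_run_id"])
```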

Services/Tools Used:

  • PostgreSQL containers: source and destination databases.
  • Airflow: orchestrates the pipeline.
  • dbt (Data Build Tool): data transformation and modeling.
  • Cron jobs (optional): additional scheduling outside Airflow.
  • Docker: containerizes and manages the services.
  • Jinja: templates SQL and other configuration (see the sketch after this list).
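
As a concrete illustration of the Jinja templating mentioned above, here is a minimal sketch that renders a SQL statement in Python with the jinja2 library; dbt applies the same idea inside its model files. The table and column names are made up for illustration.

```python
# Render a parameterized SQL statement with Jinja, in the spirit of how
# dbt templates its model files. Table/column names here are illustrative.
from jinja2 import Template

template = Template("SELECT {{ columns | join(', ') }} FROM {{ source_table }}")
sql = template.render(columns=["id", "name"], source_table="users")
print(sql)  # -> SELECT id, name FROM users
```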

Data Pipeline Workflow:

  • Extract data from the source_db using Airflow.
  • Load the data into the destination_db.
  • Transform the data in the destination database using dbt.
  • Schedule tasks with Airflow and, optionally, cron jobs.
  • Monitor and orchestrate the entire workflow via Airflow (a minimal DAG sketch follows below).
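
The following is a minimal sketch of how such a DAG can be wired up in Airflow 2.x. It is not the repository's actual DAG: the DAG id, task names, dbt project path, and the extract/load callable are illustrative assumptions.

```python
# A minimal ELT DAG sketch (Airflow 2.x): extract + load in Python, then
# transform with dbt. Names and paths are assumptions, not the repo's code.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.python import PythonOperator


def extract_and_load():
    """Placeholder: copy tables from source_db into destination_db,
    e.g. with psycopg2 or pg_dump/pg_restore."""


with DAG(
    dag_id="elt_pipeline",            # hypothetical id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",       # cron jobs could supplement this schedule
    catchup=False,
) as dag:
    extract_load = PythonOperator(
        task_id="extract_and_load",
        python_callable=extract_and_load,
    )
    transform = BashOperator(
        task_id="dbt_transform",
        bash_command="dbt run --project-dir /dbt",  # path is an assumption
    )
    extract_load >> transform  # load first, then transform in the destination
```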

THANK YOU
