Create user-facing documentation for DAG Factory #278
Implement support for [Airflow TaskFlow](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/taskflow.html), available since 2.0.

# How to test

The following example defines one task that generates a list of numbers and another that consumes this list and, using Airflow dynamic task mapping, dynamically creates an independent task that doubles each individual number.

```
example_taskflow:
  default_args:
    owner: "custom_owner"
    start_date: 2 days
  description: "Example of TaskFlow powered DAG that includes dynamic task mapping."
  schedule_interval: "0 3 * * *"
  default_view: "graph"
  tasks:
    numbers_list:
      decorator: airflow.decorators.task
      python_callable: sample.build_numbers_list
    double_number_with_dynamic_task_mapping_taskflow:
      decorator: airflow.decorators.task
      python_callable: sample.double
      expand:
        number: +numbers_list  # the prefix + tells DagFactory to resolve this value as the task `numbers_list`, previously defined
```

For the `sample.py` file below:

```
def build_numbers_list():
    return [2, 4, 6]


def double(number: int):
    result = 2 * number
    print(result)
    return result
```

In the UI, it is shown as:

![Screenshot 2024-12-06 at 11 53 04](https://github.com/user-attachments/assets/0643002a-2530-4bc1-af39-16fb3f48d4d4)

And:

![Screenshot 2024-12-06 at 11 52 28](https://github.com/user-attachments/assets/2c2ed46a-4ee8-438a-836d-3112b4737c6a)

# Scope

This PR includes several use cases of [dynamic task mapping](https://airflow.apache.org/docs/apache-airflow/2.10.3/authoring-and-scheduling/dynamic-task-mapping.html):

1. Simple mapping
2. Task-generated mapping
3. Repeated mapping
4. Adding parameters that do not expand (`partial`)
5. Mapping over multiple parameters
6. Named mapping (`map_index_template`)

The following dynamic task mapping cases were not tested but are expected to work:

* Mapping with non-TaskFlow operators
* Mapping over the result of classic operators
* Filtering items from a mapped task

The following dynamic task mapping cases were not tested and should not work (they were considered outside of the scope of the current ticket):

* Assigning multiple parameters to a non-TaskFlow operator
* Mapping over a task group
* Transforming expanding data
* Combining upstream data (aka “zipping”)

# Tests

The feature is being tested by running the example DAGs introduced in this PR, which validate various scenarios of TaskFlow and dynamic task mapping and serve as documentation. As with other parts of DAG Factory, we can and should improve the overall unit test coverage.

Two example DAG files were added, containing multiple examples of TaskFlow and dynamic task mapping. This is how they are displayed in the Airflow UI:

<img width="1501" alt="Screenshot 2024-12-06 at 16 11 10" src="https://github.com/user-attachments/assets/c4d12520-31f5-4b9d-b191-dd37523299e1">

<img width="1500" alt="Screenshot 2024-12-06 at 16 11 42" src="https://github.com/user-attachments/assets/ab08749f-aedb-4c8f-9df1-8f0d0451477d">

<img width="1510" alt="Screenshot 2024-12-06 at 16 11 32" src="https://github.com/user-attachments/assets/591e949a-49da-49f6-8d4d-1458fbb88d7f">

# Docs

This PR does not contain user-facing docs other than the README. However, we'll address this as part of #278.
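
To make the `+` prefix from the How to test example more concrete, here is a minimal sketch of how such a reference could be resolved before a task is expanded. It is not DAG Factory's actual implementation: the function name `resolve_expand_references` and the plain dictionary standing in for already-built tasks are assumptions made for illustration only.

```python
from typing import Any, Dict


def resolve_expand_references(expand: Dict[str, Any], tasks: Dict[str, Any]) -> Dict[str, Any]:
    """Replace "+task_name" strings with the matching entry from `tasks`.

    Hypothetical helper: literal values (for example a YAML list) pass through unchanged.
    """
    resolved = {}
    for key, value in expand.items():
        if isinstance(value, str) and value.startswith("+"):
            task_name = value[1:]
            if task_name not in tasks:
                raise KeyError(f"expand refers to unknown task {task_name!r}")
            resolved[key] = tasks[task_name]
        else:
            resolved[key] = value
    return resolved


# Stand-in for the output of the upstream `numbers_list` task (assumption:
# a real implementation would hold a task object here, not a list).
tasks = {"numbers_list": [2, 4, 6]}
expand = {"number": "+numbers_list"}

resolved = resolve_expand_references(expand, tasks)
# One "mapped" call per element, mirroring what `sample.double` does.
doubled = [2 * number for number in resolved["number"]]
print(doubled)  # [4, 8, 12]
```

The sketch also shows why the referenced task has to be defined earlier: the lookup fails if the name after `+` is not known yet.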
# Related issues

This PR closes two open tickets:

Closes: #302 (support named mapping, via the `map_index_template` argument)

Example of usage of `map_index_template`:

```
dynamic_task_with_named_mapping:
  decorator: airflow.decorators.task
  python_callable: sample.extract_last_name
  map_index_template: "{{ custom_mapping_key }}"
  expand:
    full_name:
      - Lucy Black
      - Vera Santos
      - Marks Spencer
```

Closes: #301 (Mapping over multiple parameters)

Example of multiple parameters:

```
multiply_with_multiple_parameters:
  decorator: airflow.decorators.task
  python_callable: sample.multiply
  expand:
    a: +numbers_list  # the prefix + tells DagFactory to resolve this value as the task `numbers_list`, previously defined
    b: +another_numbers_list  # the prefix + tells DagFactory to resolve this value as the task `another_numbers_list`, previously defined
```
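
When a task is expanded over more than one parameter, Airflow creates one mapped task instance per combination of values (a cross product). The sketch below imitates that in plain Python, without Airflow. The body of `multiply` and the values of `another_numbers_list` are assumptions, since neither is shown above.

```python
from itertools import product


def multiply(a: int, b: int) -> int:
    # Assumed body for `sample.multiply`; the real function is not shown in this thread.
    return a * b


numbers_list = [2, 4, 6]          # values returned by sample.build_numbers_list
another_numbers_list = [3, 5, 7]  # hypothetical values, for illustration only

# One set of keyword arguments per (a, b) combination, as a cross product.
mapped_kwargs = [
    {"a": a, "b": b} for a, b in product(numbers_list, another_numbers_list)
]
results = [multiply(**kwargs) for kwargs in mapped_kwargs]

print(len(mapped_kwargs))  # 9 combinations from 3 x 3 inputs
print(results)
```

The number of mapped task instances therefore grows multiplicatively with each additional expanded parameter, which is worth keeping in mind when the upstream lists are large.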
Initiated an internal discussion: https://astronomer.slack.com/archives/C015V2JFKT5/p1733904505509939
We discussed this further with the team to finalise the technology, template, hosting solution, and the initial docs we should work on. Meeting notes: https://www.notion.so/astronomerio/DAG-Factory-Docs-Recs-15a40290af6c803fbd6ed483b6eefb7a
We're considering this task done, given all the following issues were completed:
At the moment, the README is the only DAG Factory documentation.
We should collaborate with Astronomer's doc team and @cmarteepants to decide what content to cover and how to represent it.
Once this is done, we should write the documents in Markdown or reStructuredText (reST), depending on what is agreed upon.