Simplify pipeline configuration #739

Closed · 24 tasks done
donkirkby opened this issue Jul 26, 2018 · 2 comments

donkirkby commented Jul 26, 2018

Background

I've recently been adding support for optional arguments as part of #511 and thinking about switching from Docker to Singularity for #737. One of the main enhancements in the current milestone is #723, which makes it easier to create a new method by not requiring the outputs to be defined.

All of this has got me thinking about how complex the code is for configuring a new pipeline and executing it in a run. It took over a week to add support for optional arguments, and I still have to make changes to the user interface and API for launching runs. The complexity also makes Kive slow at run time, as it searches for results that it can reuse.

We have learned a lot about new tools since we started the Kive project: Slurm, GitHub/GitLab, and Docker/Singularity. Could we replace a lot of the complexity in Kive with features from these tools? We did this in the past when we moved from MPI to Slurm.

This issue is a proposal for the team to discuss - a plan to simplify Kive.

Features to Remove

These are the main sources of complexity. Are they worth it?

  • Reusing old results.
    • Because of purging, it doesn't get used that much.
    • Takes a lot of code!
    • The place it does get used is restarting a run batch. Maybe we could support that scenario with a search function that finds a run by its inputs.
  • Nested pipelines
    • More flexible to handle this outside of Kive with calls to the API.
  • Custom cables
    • Can handle this as another step in a pipeline script.
  • Compound data types and validation
    • Adds an extra step after the method runs.
    • Makes the pipeline configuration more complicated.

Features to Extract

  • Linking multiple methods into a pipeline
    • Upload a zip/tar.gz file with scripts in it.
    • Use the pipeline UI to wire inputs and outputs between scripts.
    • Save pipeline as a pipeline.json file in the tar.gz file with the scripts.
    • Launch the pipeline with a default Singularity image, plus the tar.gz file mounted on /mnt/bin, plus a driver script that reads the pipeline.json file and executes the steps (a rough sketch follows this list).
    • The advantage is that this all runs as a single Slurm job, and the accounting is much simpler.
    • You can also migrate a pipeline from development to test to production by downloading the tar.gz file that includes the pipeline.json, and uploading it to another Kive server. A Kive container can be either a Singularity image or a tar.gz file. The tar.gz file is a layer on top of the default Singularity image.
    • When you upload a new tar.gz container without a pipeline.json, it can copy the pipeline.json from the previous revision in the container family.
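
To make the driver idea concrete, here is a minimal sketch of a driver script that reads pipeline.json and runs each step in order. The JSON field names ("steps", "driver", "inputs", "outputs"), the sandbox path, and the file names in the docstring are assumptions for illustration, not a settled format:

      #!/usr/bin/env python
      """Driver sketch: run the steps listed in pipeline.json, in order.

      The pipeline.json layout assumed here is illustrative only, e.g.:
          {"steps": [{"driver": "trim.py",
                      "inputs": ["reads.fastq"],
                      "outputs": ["trimmed.fastq"]}]}
      """
      import json
      import subprocess
      from pathlib import Path

      BIN_DIR = Path('/mnt/bin')      # the uploaded tar.gz of scripts, mounted as proposed above
      SANDBOX = Path('/mnt/sandbox')  # hypothetical working directory for the run's files

      with (BIN_DIR / 'pipeline.json').open() as f:
          pipeline = json.load(f)

      for step in pipeline['steps']:
          # Each step names its driver script plus the files it reads and writes.
          command = [str(BIN_DIR / step['driver'])]
          command += [str(SANDBOX / name) for name in step['inputs']]
          command += [str(SANDBOX / name) for name in step['outputs']]
          subprocess.run(command, check=True)  # a failing step fails the whole run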

Features to Keep

These features are working well, and I think they're the main value of Kive.

  • Distribute work across compute nodes with Slurm.
  • Isolate sets of dependencies with Singularity. (Docker handles the dependencies well, but doesn't work on the cluster. Experiments with Singularity look very promising.)
  • Find results by MD5. This was the original vision for Kive: to trace a result file back to the version of the software and the data that produced it.
  • Reproduce an old result, typically to look at some related intermediate results.

Features to Add

  • Link to Git comparison of different pipeline versions.

  • Define pipeline configuration in Singularity labels. That way, you could migrate a pipeline from development to test to production by copying the image file and nothing else.

      KIVE_INPUTS=name1 name2 --option_name --multiple_option_name*
      KIVE_OUTPUTS=name1 name2 directory_name/ --output_option
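
As a sketch of how those labels could be baked into an image, the %labels section of a Singularity definition file could carry them; the base image below is a placeholder, and the label values are copied from the example above:

      Bootstrap: docker
      From: python:3.6

      %labels
          KIVE_INPUTS name1 name2 --option_name --multiple_option_name*
          KIVE_OUTPUTS name1 name2 directory_name/ --output_option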
    

Task List

  • Add ContainerArgument table, and configure on container page.
  • Add ContainerRun and related tables.
  • Copy analysis page to create a container run.
  • Display container runs.
  • Create inputs for container run.
  • Add priority to container run.
  • Launch a container run and record outputs.
  • View or download outputs.
  • Handle failed run, and record stderr/stdout.
  • Launch Singularity with an app name (see the command sketch after this list).
  • Set memory and CPU limits in Slurm.
  • View or edit batch.
  • Cancel a run.
  • Include outputs and other runs in a run's removal plan.
  • Don't remove active runs.
  • Find container runs by container and inputs.
  • Add missing filters to search pages.
  • Handle stdout/stderr longer than 2KB.
  • Document how to create a Singularity image.
  • Purge sandboxes. Janitor tasks for container runs #750
  • Purge log files. Janitor tasks for container runs #750
  • Purge output datasets. Janitor tasks for container runs #750
  • Container as an archive of scripts, with pipeline configuration in a JSON file. Run a pipeline in a container #751
  • Remove old features. Remove old runs and pipelines #752
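
For the "launch with app name" and Slurm memory/CPU tasks above, here is a rough sketch of the kind of commands involved; the image path, app name, and limits are placeholders, not Kive's actual invocation:

      # Submit one container run as a single Slurm job with memory and CPU limits,
      # running the named app inside the Singularity image.
      sbatch --mem=4G --cpus-per-task=2 --wrap \
          "singularity run --app main /path/to/container.simg"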

donkirkby commented Jul 26, 2018

I sketched out how the new container app might look:

[container diagram]

This would eventually replace the apps for archive, method, pipeline, transformation, some of librarian, and possibly metadata and datachecking.

donkirkby commented

After team discussion, the plan is to switch to Singularity first by completing #737, then tackle this simplification.
We could keep the pipeline assembly tools, but use them to generate a zip file or Singularity image that then runs like any other pipeline.

donkirkby added this to the Near future milestone Jul 26, 2018
donkirkby changed the title from "Should we simplify pipeline configuration?" to "Simplify pipeline configuration" Jul 31, 2018
donkirkby added a commit that referenced this issue Nov 23, 2018
donkirkby added a commit that referenced this issue Nov 27, 2018
Also break an import cycle by removing link from librarian to method.
donkirkby added a commit that referenced this issue Nov 27, 2018
Add container analysis page that lets you choose a container app and creates a container run. Doesn't select inputs or actually launch anything yet.
Generate migration for remaining tables.
donkirkby added a commit that referenced this issue Nov 27, 2018
Add more context to a test that might be flaky.
donkirkby added a commit that referenced this issue Nov 29, 2018
This should keep the original annotations in the new location.
donkirkby added a commit that referenced this issue Nov 29, 2018
Set favicon for Django REST framework API.
donkirkby added a commit that referenced this issue Dec 4, 2018
Also set Slurm options like memory and priority.
donkirkby added a commit that referenced this issue Dec 5, 2018
donkirkby added a commit that referenced this issue Dec 6, 2018
Add test coverage for API code.
Add some more API features for MiCall.
donkirkby self-assigned this Dec 6, 2018
donkirkby added a commit that referenced this issue Dec 7, 2018
Return JSON objects from endpoint calls.
Call API client tests on Travis.
Add some detail URLs and filters to the server API.
Add Dataset.is_purged, and include it in the API.
donkirkby added a commit that referenced this issue Dec 13, 2018
Catch exceptions in runcontainer command.
Handle external datasets in runcontainer command.
donkirkby added a commit that referenced this issue Dec 14, 2018
donkirkby added a commit that referenced this issue Dec 15, 2018
Also start adding some missing search filters.
donkirkby added a commit that referenced this issue Dec 18, 2018
Also copy permissions from run to outputs.