This repository has been archived by the owner on May 31, 2024. It is now read-only.
docs: miniwdl engine docs and example project for GATK best practices #158
Merged
wleepang
approved these changes
Nov 9, 2021
nbraid
approved these changes
Nov 9, 2021
tneely
added a commit
that referenced
this pull request
Nov 11, 2021
* Ahmaalba/Adapter Role Least Privilege Design (#57) * Reduce role privileges for adapter role * Added adapter role output bucket permissions * Adjusted Nextflow SubmitJobBatchPolicy props * CodeQL Security Analysis (#62) * Display message if no contexts deployed (#64) * Removed redundant command "agc version" (#67) * Pull request template (#65) Creates a PR template for github * Update issue templates (#63) Updates the issue templates to provide baseline guidance for customers * update dependencies (#69) Co-authored-by: Pang, Lee <pwyming@.amazon.com> * Updates the documentation for generating minimal permissions (#71) * Add run-id flag to workflow status command (#74) `agc workflow status` command was missing the `-r` flag to indicate that the string we are passing in is the run-id. * Update workflow documentation to include more info about the URL format * Update workflows.md Updates the documentation for workflows since we are parsing them when deploying workflows. * Update workflows.md * add attribution for GATK best practices workflows (#78) * add instructions to use AGC local CDK for bootstrapping (#79) * Passing environment variables to increase tolerance for metadata service endpoint timeouts (#80) * [Bug] Creating a cromwell Spot Context also creates an on demand Compute Env (#66) Avoid creating an on demand compute env for cromwell Spot Context * markjschreiber/better-account-not-activated-message (#75) * Better error message when attempting to deploy a context with no activated account. 
* Ensure context names are unique and sorted so deployment is in consistent order * LPD Full Managed Policy Permission Descope (#61) * LPD Batch permission descope * Ahmaalba/Adapter Role Least Privilege Design (#57) * Reduce role privileges for adapter role * Added adapter role output bucket permissions * Adjusted Nextflow SubmitJobBatchPolicy props * Prettier fix * Removal of BatchFullAccess managed policy * Code deduplication * LPD Full Implementation * Env removal from engine options * Regin and account parameter adjustment * Nextflow onspot instance bug fix * Usage of Arn.Format rather then custom ARN creation * Made roles retrieve account and region through props rather then ArnComponents * Removal of account and region to use default values * Added batch:ListJobs permissions to nextflow adapter * workflow engine documentation (#60) * Adds windows 10 as an OS option (#81) Tested AGC on a windows 10 machine running Ubuntu * Adds amd instance types (#84) * Corrected configuration of read workflow so that no MANIFEST is required (#86) * Fixing acg typo to agc (#91) Moving two instances of `acg` to `agc`. * rnaseq pipeline to use proper inputs.json file (#90) The rnaseq pipeline was referencing the inputs.json file from the atacseq example. This PR switches it to the proper inputs.json file. 
The old inputs.json file: ``` { "input": "s3://healthai-public-assets-us-east-1/agc-demo-data/atacseq/design.csv", "genome": "GRCh38", "single_end": true } ``` The new inputs.json file: ``` { "reads": "s3://1000genomes/phase3/data/HG00243/sequence_read/SRR*_{1,2}.filt.fastq.gz", "genome": "GRCh37", "skip_qc": true } ``` * Add workflow output command (#85) * workflow output command implementation * Workflow and Context autocomplete implementation (#82) * Workflow and Context autocomplete implementation * Support Max vCpu in project contexts (#89) feat: Configurable max vCpu for compute environments Add maxVCpu as a property of Context to control the maximum number of vCpus a compute environment can have at a given time Support a default Context values, set when a Context is unmarshalled, any value set in agc-project.yml will override the default. refs: #31 * Latest Release Link (#92) * Cleaned up Readme (#93) * Added version checker to AGC (#94) * Added version checker to AGC * Addressed feedback on the PR * Simplified version checker code * Go compiler version 1.16.0 -> 1.17.2 (#95) * Tabular text implementation and tests (#88) * Tabular text implementation and tests * Add Stale Issue Handling (#96) * added project validate command (#97) * markjschreiber/engine-in-contex-list (#99) * add engine name to context list command output * markjschreiber/clean-up-codebase (#98) * chore: Update Pull Request Template to Follow Conventional Commits (#100) Co-authored-by: Angela Li <dzl@amazon.com> * ci: Improved ci workflow (#102) * Builds the CDK project and validates eslint, also formats and fails if any formatting changes are detected. 
* Checks for format changes in the CLI project * ci: Add semantics behavior overrides (#106) * fix: Shows the relevant error if the workflow logs can't be retrieved (#103) * fix: workflows from demo-wdl-project should run without errors out of the box (#108) * test: use go 1.17 features to simplify unit tests (#110) * fix: show logs for workflows with more than 100 tasks (#114) * fix: use proper go tags for windows build (#117) * fix: use proper go tags for windows build * use nf-core for this workflow (#123) * feat: context destroy --force flag (#118) * context destroy --force flag * fix: Pass engine endpoint directly the wes adapter (#122) * chore: clean up project init code (#126) * ci: Add standard version, conventional changelog and bump script (#119) * ci: Add standard version, conventional changelog and bump script * fix: Fixes how users interact with the context commands (#115) Fixes how users interact with the context commands by allowing contexts to be passed in without the -c command * fix: invalid AWS Health url (#130) correctly point AWS health link to `aws.amazon.com/health` * build: Revamp build and release process (#127) We are updating our build pipeline to better automate the release process. This requires a few build related changes in our source code. * fix: Use correct context name (#132) the context name in `/examples/demo-wdl-project` is `myContext`, which is used by the examples here. * build: use latest build images (#134) * feat: Initial infrastructure for MiniWdl support (#125) Adds a MiniWdl stack which creates the appropriate batch resources and job definition to run MiniWdl jobs. * test: Added context deploy benchmarking script (#111) * Context deploy benchmark script * fix: Adds a message when new logs aren't shown to the user immediately (#131) * Adds a message when new logs aren't shown to the user immediately * fix: correctly link to core app (#133) * fix: temporary folder potential leak in some error scenarios. 
unit test for cdk command execution (#140) * fix: temporary folder potential leak in some error scenarios. unit test for cdk command execution * fixed typo in method name, updated implementation for channel waiter * fix: updates context describe to be consistent with context destroy (#143) * fix: updates context describe to be consistent with context destroy * Best practice is to avoid mutation of inputs. Therefore, copy instead of move input (#145) * build: move release files one folder down (#147) * fix: miniwdl interpolation workaround The gatk4-rnaseq-germline-snps-indels workflow revealed a possible bug in miniwdl where it doesn't correctly handle string interpolation of optional values used in a calculation. This change to the workflow works around the problem in miniwdl. * fix: updates how the logs are shown from cloudwatch (#142) fix: updates how the logs are shown from cloudwatch * fix: improve contrast in docs (#149) * docs: Add information about example inputs and runtimes (#146) * add information about example inputs and runtimes * fix: Asserts order deterministically (#153) * docs: ongoing cost details (#152) * added ongoing costs section to contexts.md * added cost estimate links * fix: Workflow status now ignores unqueryable stacks (#138) fix: Workflow status now ignores unqueryable stacks * docs: miniwdl engine docs and example project for GATK best practices (#158) * add engine docs * add miniwdl examples * feat: Introducing AWS Lambda based WES Adapter for running the workflows (#155) * Introducint AWS Lambda based WES Adapter for running the workflows * Addressing the comments from PR review * fix: Deregionalize min permissions (#128) * add route53:ListHostedZonesByName * de-regionalize resource arns * split out CDK specific s3 permissions * chore(release): 1.1.0 Co-authored-by: AhmadBassyiouni <30308260+abassyiouni@users.noreply.github.com> Co-authored-by: Guy Hawkins <2242982+ghawk1ns@users.noreply.github.com> Co-authored-by: Taylor 
<tneely@users.noreply.github.com> Co-authored-by: Illya Yalovyy <IllyaYalovyy@users.noreply.github.com> Co-authored-by: elliot-smith <elliotsm@amazon.com> Co-authored-by: W. Lee Pang, PhD <wleepang@gmail.com> Co-authored-by: Pang, Lee <pwyming@.amazon.com> Co-authored-by: Drew Dresser <andrewjdresser@gmail.com> Co-authored-by: Andrey Dovydenko <dovydenk@amazon.com> Co-authored-by: Mark Schreiber <mrschre@amazon.com> Co-authored-by: Sean Smith <seaam@amazon.com> Co-authored-by: a-li <7497012+a-li@users.noreply.github.com> Co-authored-by: Angela Li <dzl@amazon.com> Co-authored-by: nbraid <braidn@amazon.com>
tneely
pushed a commit
to tneely/amazon-genomics-cli
that referenced
this pull request
Nov 11, 2021
…aws#158) * add engine docs * add miniwdl examples
tneely
added a commit
to tneely/amazon-genomics-cli
that referenced
this pull request
Nov 11, 2021
* Ahmaalba/Adapter Role Least Privilege Design (aws#57) * Reduce role privileges for adapter role * Added adapter role output bucket permissions * Adjusted Nextflow SubmitJobBatchPolicy props * CodeQL Security Analysis (aws#62) * Display message if no contexts deployed (aws#64) * Removed redundant command "agc version" (aws#67) * Pull request template (aws#65) Creates a PR template for github * Update issue templates (aws#63) Updates the issue templates to provide baseline guidance for customers * update dependencies (aws#69) Co-authored-by: Pang, Lee <pwyming@.amazon.com> * Updates the documentation for generating minimal permissions (aws#71) * Add run-id flag to workflow status command (aws#74) `agc workflow status` command was missing the `-r` flag to indicate that the string we are passing in is the run-id. * Update workflow documentation to include more info about the URL format * Update workflows.md Updates the documentation for workflows since we are parsing them when deploying workflows. * Update workflows.md * add attribution for GATK best practices workflows (aws#78) * add instructions to use AGC local CDK for bootstrapping (aws#79) * Passing environment variables to increase tolerance for metadata service endpoint timeouts (aws#80) * [Bug] Creating a cromwell Spot Context also creates an on demand Compute Env (aws#66) Avoid creating an on demand compute env for cromwell Spot Context * markjschreiber/better-account-not-activated-message (aws#75) * Better error message when attempting to deploy a context with no activated account. 
* Ensure context names are unique and sorted so deployment is in consistent order * LPD Full Managed Policy Permission Descope (aws#61) * LPD Batch permission descope * Ahmaalba/Adapter Role Least Privilege Design (aws#57) * Reduce role privileges for adapter role * Added adapter role output bucket permissions * Adjusted Nextflow SubmitJobBatchPolicy props * Prettier fix * Removal of BatchFullAccess managed policy * Code deduplication * LPD Full Implementation * Env removal from engine options * Regin and account parameter adjustment * Nextflow onspot instance bug fix * Usage of Arn.Format rather then custom ARN creation * Made roles retrieve account and region through props rather then ArnComponents * Removal of account and region to use default values * Added batch:ListJobs permissions to nextflow adapter * workflow engine documentation (aws#60) * Adds windows 10 as an OS option (aws#81) Tested AGC on a windows 10 machine running Ubuntu * Adds amd instance types (aws#84) * Corrected configuration of read workflow so that no MANIFEST is required (aws#86) * Fixing acg typo to agc (aws#91) Moving two instances of `acg` to `agc`. * rnaseq pipeline to use proper inputs.json file (aws#90) The rnaseq pipeline was referencing the inputs.json file from the atacseq example. This PR switches it to the proper inputs.json file. 
The old inputs.json file: ``` { "input": "s3://healthai-public-assets-us-east-1/agc-demo-data/atacseq/design.csv", "genome": "GRCh38", "single_end": true } ``` The new inputs.json file: ``` { "reads": "s3://1000genomes/phase3/data/HG00243/sequence_read/SRR*_{1,2}.filt.fastq.gz", "genome": "GRCh37", "skip_qc": true } ``` * Add workflow output command (aws#85) * workflow output command implementation * Workflow and Context autocomplete implementation (aws#82) * Workflow and Context autocomplete implementation * Support Max vCpu in project contexts (aws#89) feat: Configurable max vCpu for compute environments Add maxVCpu as a property of Context to control the maximum number of vCpus a compute environment can have at a given time Support a default Context values, set when a Context is unmarshalled, any value set in agc-project.yml will override the default. refs: aws#31 * Latest Release Link (aws#92) * Cleaned up Readme (aws#93) * Added version checker to AGC (aws#94) * Added version checker to AGC * Addressed feedback on the PR * Simplified version checker code * Go compiler version 1.16.0 -> 1.17.2 (aws#95) * Tabular text implementation and tests (aws#88) * Tabular text implementation and tests * Add Stale Issue Handling (aws#96) * added project validate command (aws#97) * markjschreiber/engine-in-contex-list (aws#99) * add engine name to context list command output * markjschreiber/clean-up-codebase (aws#98) * chore: Update Pull Request Template to Follow Conventional Commits (aws#100) Co-authored-by: Angela Li <dzl@amazon.com> * ci: Improved ci workflow (aws#102) * Builds the CDK project and validates eslint, also formats and fails if any formatting changes are detected. 
* Checks for format changes in the CLI project * ci: Add semantics behavior overrides (aws#106) * fix: Shows the relevant error if the workflow logs can't be retrieved (aws#103) * fix: workflows from demo-wdl-project should run without errors out of the box (aws#108) * test: use go 1.17 features to simplify unit tests (aws#110) * fix: show logs for workflows with more than 100 tasks (aws#114) * fix: use proper go tags for windows build (aws#117) * fix: use proper go tags for windows build * use nf-core for this workflow (aws#123) * feat: context destroy --force flag (aws#118) * context destroy --force flag * fix: Pass engine endpoint directly the wes adapter (aws#122) * chore: clean up project init code (aws#126) * ci: Add standard version, conventional changelog and bump script (aws#119) * ci: Add standard version, conventional changelog and bump script * fix: Fixes how users interact with the context commands (aws#115) Fixes how users interact with the context commands by allowing contexts to be passed in without the -c command * fix: invalid AWS Health url (aws#130) correctly point AWS health link to `aws.amazon.com/health` * build: Revamp build and release process (aws#127) We are updating our build pipeline to better automate the release process. This requires a few build related changes in our source code. * fix: Use correct context name (aws#132) the context name in `/examples/demo-wdl-project` is `myContext`, which is used by the examples here. * build: use latest build images (aws#134) * feat: Initial infrastructure for MiniWdl support (aws#125) Adds a MiniWdl stack which creates the appropriate batch resources and job definition to run MiniWdl jobs. 
* test: Added context deploy benchmarking script (aws#111) * Context deploy benchmark script * fix: Adds a message when new logs aren't shown to the user immediately (aws#131) * Adds a message when new logs aren't shown to the user immediately * fix: correctly link to core app (aws#133) * fix: temporary folder potential leak in some error scenarios. unit test for cdk command execution (aws#140) * fix: temporary folder potential leak in some error scenarios. unit test for cdk command execution * fixed typo in method name, updated implementation for channel waiter * Move release files one folder down * fix: updates context describe to be consistent with context destroy (aws#143) * fix: updates context describe to be consistent with context destroy * Best practice is to avoid mutation of inputs. Therefore, copy instead of move input (aws#145) * fix: miniwdl interpolation workaround The gatk4-rnaseq-germline-snps-indels workflow revealed a possible bug in miniwdl where it doesn't correctly handle string interpolation of optional values used in a calculation. This change to the workflow works around the problem in miniwdl. 
* fix: updates how the logs are shown from cloudwatch (aws#142) fix: updates how the logs are shown from cloudwatch * fix: improve contrast in docs (aws#149) * docs: Add information about example inputs and runtimes (aws#146) * add information about example inputs and runtimes * fix: Asserts order deterministically (aws#153) * docs: ongoing cost details (aws#152) * added ongoing costs section to contexts.md * added cost estimate links * fix: Workflow status now ignores unqueryable stacks (aws#138) fix: Workflow status now ignores unqueryable stacks * docs: miniwdl engine docs and example project for GATK best practices (aws#158) * add engine docs * add miniwdl examples * feat: Introducing AWS Lambda based WES Adapter for running the workflows (aws#155) * Introducint AWS Lambda based WES Adapter for running the workflows * Addressing the comments from PR review * fix: Deregionalize min permissions (aws#128) * add route53:ListHostedZonesByName * de-regionalize resource arns * split out CDK specific s3 permissions * fix for installation.md (aws#161) * feat: Improved Workflow logs (aws#156) * feat: Improved Workflow logs By default, workflow logs for a run will log out run status and individual task status. Tasks logs can be emitted with `--task <taskId>` for a single task log, `--all-tasks` for all task logs, and `--failed-tasks` for failed task logs. * chore(release): 1.1.0 Co-authored-by: AhmadBassyiouni <30308260+abassyiouni@users.noreply.github.com> Co-authored-by: Guy Hawkins <2242982+ghawk1ns@users.noreply.github.com> Co-authored-by: Illya Yalovyy <IllyaYalovyy@users.noreply.github.com> Co-authored-by: elliot-smith <elliotsm@amazon.com> Co-authored-by: W. 
Lee Pang, PhD <wleepang@gmail.com> Co-authored-by: Pang, Lee <pwyming@.amazon.com> Co-authored-by: Drew Dresser <andrewjdresser@gmail.com> Co-authored-by: Andrey Dovydenko <dovydenk@amazon.com> Co-authored-by: Mark Schreiber <mrschre@amazon.com> Co-authored-by: Sean Smith <seaam@amazon.com> Co-authored-by: a-li <7497012+a-li@users.noreply.github.com> Co-authored-by: Angela Li <dzl@amazon.com> Co-authored-by: nbraid <braidn@amazon.com>
elliot-smith
pushed a commit
that referenced
this pull request
Nov 16, 2021
…#158) * add engine docs * add miniwdl examples
Description of Changes
Document the miniwdl engine and provide a GATK best practices example project, configured for miniwdl, which was used to validate the miniwdl engine.
Description of how you validated changes
make start-docs
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license