Merge pull request #24 from evotools/v183_hapmode
Support haplotype-resolved liftovers
RenzoTale88 authored Sep 6, 2024
2 parents de622f9 + 8ffcbe8 commit 4b175f0
Showing 15 changed files with 52 additions and 19 deletions.
10 changes: 6 additions & 4 deletions .github/workflows/CI-blat.yml
@@ -27,7 +27,7 @@ jobs:
strategy:
matrix:
# Nextflow versions: check pipeline minimum and current latest
nxf_ver: ["23.04.0"]
nxf_ver: ["24.04.4"]
steps:
- name: Check out pipeline code
uses: actions/checkout@v2
@@ -37,10 +37,12 @@ jobs:
wget -qO- get.nextflow.io | bash
sudo mv nextflow /usr/local/bin/
- name: Install Dependencies
- name: Install mamba
run: |
chmod a+x ./assets/install.sh
assets/install.sh
wget -nv -O Mambaforge.sh https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh
bash Mambaforge.sh -b -p ./mambaforge
./mambaforge/bin/mamba init
source ./mambaforge/etc/profile.d/conda.sh
- name: Run pipeline with test data and minimap2
run: |
10 changes: 6 additions & 4 deletions .github/workflows/CI-gsa.yml
@@ -27,7 +27,7 @@ jobs:
strategy:
matrix:
# Nextflow versions: check pipeline minimum and current latest
nxf_ver: ["23.04.0"]
nxf_ver: ["24.04.4"]
steps:
- name: Check out pipeline code
uses: actions/checkout@v2
@@ -37,10 +37,12 @@ jobs:
wget -qO- get.nextflow.io | bash
sudo mv nextflow /usr/local/bin/
- name: Install Dependencies
- name: Install mamba
run: |
chmod a+x ./assets/install.sh
assets/install.sh
wget -nv -O Mambaforge.sh https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh
bash Mambaforge.sh -b -p ./mambaforge
./mambaforge/bin/mamba init
source ./mambaforge/etc/profile.d/conda.sh
- name: Run pipeline with test data and minimap2
run: |
10 changes: 6 additions & 4 deletions .github/workflows/CI-lastz.yml
@@ -27,7 +27,7 @@ jobs:
strategy:
matrix:
# Nextflow versions: check pipeline minimum and current latest
nxf_ver: ["23.04.0"]
nxf_ver: ["24.04.4"]
steps:
- name: Check out pipeline code
uses: actions/checkout@v2
@@ -37,10 +37,12 @@ jobs:
wget -qO- get.nextflow.io | bash
sudo mv nextflow /usr/local/bin/
- name: Install Dependencies
- name: Install mamba
run: |
chmod a+x ./assets/install.sh
assets/install.sh
wget -nv -O Mambaforge.sh https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-$(uname)-$(uname -m).sh
bash Mambaforge.sh -b -p ./mambaforge
./mambaforge/bin/mamba init
source ./mambaforge/etc/profile.d/conda.sh
- name: Run pipeline with test data
run: |
3 changes: 2 additions & 1 deletion .github/workflows/CI-mm2.yml
@@ -9,6 +9,7 @@ on:
- 'main'
- 'dev'
- '*test'
- 'v*'
pull_request:
workflow_dispatch:
release:
@@ -27,7 +28,7 @@ jobs:
strategy:
matrix:
# Nextflow versions: check pipeline minimum and current latest
nxf_ver: ["23.04.0"]
nxf_ver: ["24.04.4"]
steps:
- name: Check out pipeline code
uses: actions/checkout@v2
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,8 @@
# Changelog

## [v1.8.3]
- Added `--haplotypes` mode, which creates liftover files between haplotypes of the same individual that follow the UCSC naming convention

## [v1.8.2]
- `--aligner minimap2` now runs on individual target sequences to reduce the memory footprint and improve performance in large distributed systems
- updated dependencies (minimap2 v2.26.0, last v1454, crossmap v0.6.4, bedtools v2.31.0, lastz v1.04.22, blat v445)
4 changes: 4 additions & 0 deletions README.md
@@ -144,6 +144,10 @@ This analysis will run using genome1 and genome2 as source and target, respectiv
whereas the target will be fragmented in 10Mb chunks overlapping 100Kb. It will use lastz as the aligner using the preset for closely related genomes (near).
The output files will be copied into the folder my_liftover.

## Frequently asked questions

+ _How do I lift over between two haplotypes of the same genome?_ You can lift over positions between haplotypes of the same individual (i.e. sequences named `*_hap*` or `*_alt*`) by providing the `--haplotypes` option.

## Citing nf-LO
To cite nf-LO, please refer to:
```
3 changes: 3 additions & 0 deletions docs/chain.md
@@ -36,3 +36,6 @@ nextflow run evotools/nf-LO --igenome_source GRCh37 \
--no_netsynt \
--chainCustom '-minScore=5000 -linearGap=medium'
```

## Haplotype-resolved assemblies
By default, the workflow does not lift over positions between haplotype sequences (i.e. sequences named `*_hap*` or `*_alt*`). As a result, it can generate empty liftovers when only sequences with such names are involved, for example when lifting `hap1` to `hap2` of the same individual. To generate liftovers in such cases, provide the `--haplotypes` option, which lets the workflow retain these alignments downstream.
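
As an illustration, a hap1-to-hap2 run might look like the sketch below. It only prints the command rather than launching it. The `--source` and `--target` options and the FASTA file names are assumptions that do not appear in this diff; `--haplotypes`, `--aligner minimap2` and the conda profile do. `hap_liftover_cmd` is a hypothetical helper, not part of nf-LO.

```shell
# Hypothetical helper: print an nf-LO command for a haplotype-to-haplotype liftover.
# Assumption: --source/--target take the two haplotype FASTA files.
hap_liftover_cmd() {
  src=$1
  tgt=$2
  echo "nextflow run evotools/nf-LO --source ${src} --target ${tgt} --aligner minimap2 --haplotypes -profile conda"
}

# Print the command for review before running it.
hap_liftover_cmd sample_hap1.fa sample_hap2.fa
```

Printing first makes it easy to confirm that `--haplotypes` is present before committing compute time to the alignment.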
2 changes: 2 additions & 0 deletions docs/changelog.md
@@ -1,4 +1,6 @@
# Changelog
## [v1.8.3]
- Added `--haplotypes` mode, which creates liftover files between haplotypes of the same individual that follow the UCSC naming convention

## [v1.8.2]
- `--aligner minimap2` now runs on individual target sequences to reduce the memory footprint and improve performance in large distributed systems
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -22,7 +22,7 @@
author = 'Andrea Talenti'

# The full version, including alpha/beta/rc tags
release = '1.6.0'
release = '1.8.3'


# -- General configuration ---------------------------------------------------
3 changes: 3 additions & 0 deletions docs/faq.md
@@ -0,0 +1,3 @@
# Frequently asked questions

+ _How do I lift over between two haplotypes of the same genome?_ You can lift over positions between haplotypes of the same individual (i.e. sequences named `*_hap*` or `*_alt*`) by providing the `--haplotypes` option.
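
To check whether an assembly uses this naming before enabling `--haplotypes`, one could count the matching FASTA headers. This is a minimal sketch assuming a plain-text FASTA input; `count_hap_names` is a hypothetical helper, not part of nf-LO, and the regular expression only approximates the `*_hap*` and `*_alt*` patterns.

```shell
# Hypothetical helper: count FASTA headers whose names contain _hap or _alt.
# $1 = path to an uncompressed FASTA file
count_hap_names() {
  # Keep header lines only, then count those matching either pattern.
  grep '^>' "$1" | grep -c -E '_(hap|alt)'
}

# Usage: count_hap_names assembly.fa
```

If the count equals the total number of sequences, a run without `--haplotypes` is likely to produce the empty liftover described in the documentation.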
1 change: 1 addition & 0 deletions docs/index.rst
@@ -55,6 +55,7 @@ User guide
liftover
output
reports
faq
changelog
citations

2 changes: 2 additions & 0 deletions docs/notes.md
@@ -1,4 +1,6 @@
# Changelog
## [v1.8.3]
- Added `--haplotypes` mode, which creates liftover files between haplotypes of the same individual that follow the UCSC naming convention

## [v1.8.2]
- `--aligner minimap2` now runs on individual target sequences to reduce the memory footprint and improve performance in large distributed systems
10 changes: 6 additions & 4 deletions modules/processes/postprocess.nf
@@ -150,9 +150,10 @@ process chainNet_old{

script:
if ( params.aligner != "blat" & params.aligner != "nucmer" & params.aligner != "GSAlign")
def haplotypes = params.haplotypes ? "-inclHap" : ""
"""
chainPreNet ${rawchain} ${twoBitsizeS} ${twoBitsizeT} stdout |
chainNet -verbose=0 stdin ${twoBitsizeS} ${twoBitsizeT} stdout /dev/null | netSyntenic stdin netfile.net
chainPreNet ${haplotypes} ${rawchain} ${twoBitsizeS} ${twoBitsizeT} stdout |
chainNet -verbose=0 ${haplotypes} stdin ${twoBitsizeS} ${twoBitsizeT} stdout /dev/null | netSyntenic stdin netfile.net
netChainSubset -verbose=0 netfile.net ${rawchain} stdout | chainStitchId stdin stdout > liftover.chain
"""
}
@@ -178,9 +179,10 @@ process chainNet{
"""

script:
def haplotypes = params.haplotypes ? "-inclHap" : ""
"""
chainPreNet ${rawchain} ${twoBitsizeS} ${twoBitsizeT} stdout |
chainNet -verbose=0 stdin ${twoBitsizeS} ${twoBitsizeT} netfile.net /dev/null
chainPreNet ${haplotypes} ${rawchain} ${twoBitsizeS} ${twoBitsizeT} stdout |
chainNet ${haplotypes} -verbose=0 stdin ${twoBitsizeS} ${twoBitsizeT} netfile.net /dev/null
"""
}
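
The ternary added to both processes inserts `-inclHap` into the `chainPreNet` and `chainNet` calls when `params.haplotypes` is true, and an empty string otherwise. The shell sketch below mimics that expansion for the `chainPreNet` call only. `hap_flag` and `prenet_cmd` are hypothetical helpers and the file names are placeholders; the command is printed, not executed.

```shell
# Hypothetical helper: mirrors `params.haplotypes ? "-inclHap" : ""`.
hap_flag() {
  if [ "$1" = "true" ]; then
    printf '%s' "-inclHap"
  fi
}

# Hypothetical helper: print the chainPreNet call the process would render.
# $1 = haplotypes (true/false), $2 = chain file, $3 = source sizes, $4 = target sizes
prenet_cmd() {
  flag=$(hap_flag "$1")
  # ${flag:+$flag } emits the flag plus a trailing space only when it is non-empty.
  echo "chainPreNet ${flag:+$flag }$2 $3 $4 stdout"
}

prenet_cmd true raw.chain source.sizes target.sizes
prenet_cmd false raw.chain source.sizes target.sizes
```

With the default `haplotypes = false`, the rendered command is unchanged from v1.8.2, so existing runs should behave as before.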

4 changes: 3 additions & 1 deletion nextflow.config
@@ -17,6 +17,7 @@ params {
igenomes_target = false
distance = 'medium'
aligner = 'lastz'
haplotypes = false
srcSize = 20000000
tgtSize = 10000000
tgtOvlp = 100000
@@ -99,6 +100,7 @@ profiles {
charliecloud.enabled = false
conda.createTimeout = '8 h'
includeConfig 'conf/conda.config'
process.conda = "$projectDir/environment.yml"
conda.useMamba = params.mamba ? true : false
}
docker {
@@ -188,7 +190,7 @@ manifest {
mainScript = 'main.nf'
nextflowVersion = '>=21.10.0'
defaultBranch = 'main'
version = '1.8.2'
version = '1.8.3'
}


4 changes: 4 additions & 0 deletions nextflow_schema.json
@@ -66,6 +66,10 @@
"default": "medium",
"enum": ["same", "near", "medium", "far", "balanced", "custom"]
},
"haplotypes": {
"type": "boolean",
"default": false
},
"aligner": {
"type": "string",
"default": "lastz",