by Laura Budurlean
This pipeline combines Bionano optical genome mapping with short-read whole genome sequencing (WGS) data. It was created to help integrate structural variant (SV) calls from these two technologies. The results were published in:
Budurlean L, Tukaramrao DB, Zhang L, Dovat S, Broach J. Integrating optical genome mapping and whole genome sequencing in somatic structural variant detection. Journal of Personalized Medicine. 2024;14(3):291.
Bionano optical genome maps are aligned to precise coordinates using WGS reads to confirm a heterozygous IKZF1 deletion.
You can use the makefile. To print the help text, run:
make -f Makefile.mak help
To run all the steps at once, run:
make -f Makefile.mak all
You will still need to set some file names and directories inside the main 1-, 2-, and 3- .sh scripts first, depending on your sample naming conventions.
Alternatively, you can run each of the scripts yourself in the order specified below.
You can run any SV calling pipeline you wish on your WGS files, as long as you end up with a VCF file containing your WGS SV calls. For our data, we opted to use LUMPY and DELLY. We use the SpeedSeq pipeline for SV calling, which already incorporates LUMPY, and we run DELLY separately. A separate repository for doing SV calling with SpeedSeq is available here: SV-calling-with-SpeedSeq. For copy number alterations, we use FreeCNV.
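Before integrating, it can help to confirm that the WGS VCF really contains SV records. The sketch below is not part of the pipeline; it assumes the caller writes an SVTYPE key in the INFO column (LUMPY and DELLY both do), and the function name count_sv_types is hypothetical.

```python
import gzip
from collections import Counter


def count_sv_types(vcf_path):
    """Count records per SVTYPE in a plain or gzipped VCF file."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    counts = Counter()
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # skip meta-information and header lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 8:
                continue  # skip malformed lines without an INFO column
            # INFO is a ';'-separated list of KEY=VALUE pairs and bare flags
            info = dict(
                item.split("=", 1) for item in fields[7].split(";") if "=" in item
            )
            counts[info.get("SVTYPE", "UNKNOWN")] += 1
    return counts
```

A file whose counts are empty, or entirely UNKNOWN, probably does not hold the SV calls the later scripts expect.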
From the Bionano data, obtain the VCF and .SMAP files for each sample from the pipeline you ran, and give each sample its own folder containing its VCF and .SMAP files. We used the Bionano Rare Variant Pipeline. You can use either filtered or unfiltered files. The scripts here have their own built-in filtering parameters that you can adjust, similar to the Bionano ones available online, but you may also pre-filter Bionano files before downloading them if you wish.
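The per-sample folder layout described above can be checked before running anything. This is a minimal sketch, assuming one subfolder per sample under a common root and uncompressed files ending in .vcf and .smap (case-insensitive); check_bionano_folders is a hypothetical helper, not a script from this repository.

```python
from pathlib import Path


def check_bionano_folders(root):
    """Return {sample_folder: [missing extensions]} for incomplete sample folders."""
    problems = {}
    for sample_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # Collect the lowercase extensions of the files directly in this folder
        suffixes = {f.suffix.lower() for f in sample_dir.iterdir() if f.is_file()}
        missing = [ext for ext in (".vcf", ".smap") if ext not in suffixes]
        if missing:
            problems[sample_dir.name] = missing
    return problems
```

An empty dictionary means every sample folder holds at least one VCF and one .SMAP file.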
Now we can run the scripts. Note: If you set up your directories differently, make sure to change the file paths in the scripts.
Load any required modules, including Python, Bedtools, and R.
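One way to confirm the modules are loaded is to look for their executables on the PATH. The sketch below assumes the executables are named bedtools and Rscript, which may differ on your system; missing_tools and python_ok are hypothetical helper names.

```python
import shutil
import sys


def missing_tools(tools=("bedtools", "Rscript")):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum=(3, 7)):
    """Return True if the running interpreter is at least version `minimum`."""
    return sys.version_info >= minimum
```

Calling missing_tools() after loading your modules should return an empty list.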
Adjust headers, parameters, and paths as required for your own project and run the following scripts in order:
1-Optical-Mapping-SV-part1.sh
Bionano-remove-recurrent.sh
1-Optical-Mapping-SV-part2.sh
Run filter-sv.sh in the /.../scripts/WGS/SV/ scripts folder.
Run merge-sv.sh in the /.../scripts/WGS/SV/ scripts folder.
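If you prefer not to use the makefile, the script order above can be wrapped in a small driver. This is only a sketch: run_in_order is a hypothetical helper, it assumes bash is available, and you must supply the script paths for your own directory layout.

```python
import subprocess


def run_in_order(scripts, dry_run=False):
    """Run each shell script with bash, in order, stopping at the first failure."""
    done = []
    for script in scripts:
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit status
            subprocess.run(["bash", str(script)], check=True)
        done.append(str(script))
    return done


# Order taken from the steps above; prefix each name with your own path.
PIPELINE_ORDER = [
    "1-Optical-Mapping-SV-part1.sh",
    "Bionano-remove-recurrent.sh",
    "1-Optical-Mapping-SV-part2.sh",
    "filter-sv.sh",
    "merge-sv.sh",
]
```

Calling run_in_order(PIPELINE_ORDER, dry_run=True) returns the planned order without executing anything.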
- Ensure your FreeCNV .bam_CNVs files have chr prefixes and no trailing tabs.
- If you encounter Python pandas errors, ensure your Python version is 3.7 or higher. We used Python 3.7.1.
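The .bam_CNVs cleanup can be scripted. The sketch below assumes the file is tab-delimited with the chromosome in the first column; normalize_cnv_line and normalize_cnv_file are hypothetical helpers, not part of this repository.

```python
def normalize_cnv_line(line):
    """Strip the line ending and trailing tabs, and add 'chr' to the first column."""
    line = line.rstrip("\r\n").rstrip("\t")
    if not line:
        return line
    chrom, sep, rest = line.partition("\t")
    if not chrom.startswith("chr"):
        chrom = "chr" + chrom
    return chrom + sep + rest


def normalize_cnv_file(src, dst):
    """Write a cleaned copy of `src` to `dst`, one normalized line per input line."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            fout.write(normalize_cnv_line(line) + "\n")
```

Writing to a new file rather than editing in place keeps the original output available for comparison.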