Home
Navigating the sections
This repository contains scripts, called handlers, that automate the process of converting raw FASTQ sequences into BAM files, and finally to a finished VCF file.
To set up sequence_handling, open a terminal and type:
git clone https://github.com/MorrellLab/sequence_handling.git
If you don't have Git installed, you can go to the repository on GitHub and select Download ZIP on the right-hand side. No GitHub account is required for downloading through either method.
To see usage information about sequence_handling, go into the sequence_handling directory:
cd sequence_handling
and run:
./sequence_handling
This repository depends heavily on GNU Parallel; most of the handlers use it to process multiple samples at once. While this speeds up processing, be aware that standard laptops, tablets, and desktop computers may not be appropriate. The handlers are best suited to supercomputers; this repository was designed with the Minnesota Supercomputing Institute (MSI) in mind.
MSI's resources make extensive use of a module system, in which software is installed and maintained by MSI and users load modules as needed. These handlers are designed to call upon modules whenever possible; however, some dependencies are only available through the Morrell Lab. To gain access to the Morrell Lab modules, please run the following command on the login host:
echo export MODULEPATH=/panfs/roc/groups/9/morrellp/public/Modules:'$MODULEPATH' >> ~/.bash_profile
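Running the command above more than once appends the same line again each time. As a minimal sketch (the function name add_module_path is hypothetical and not part of sequence_handling), the same line can be added only when it is not already present:

```shell
# Hypothetical helper, not part of sequence_handling.
# The single quotes keep $MODULEPATH literal, so the profile file receives
# the same text that the echo command above writes.
MODULE_LINE='export MODULEPATH=/panfs/roc/groups/9/morrellp/public/Modules:$MODULEPATH'

add_module_path() {
    profile="$1"
    # grep -F: fixed string, -x: match the whole line, -q: no output.
    # Errors are silenced so a profile that does not exist yet is simply created.
    if ! grep -qxF "$MODULE_LINE" "$profile" 2>/dev/null; then
        printf '%s\n' "$MODULE_LINE" >> "$profile"
    fi
}
```

Calling add_module_path ~/.bash_profile should then have the same effect as the command above on the first run and do nothing on later runs.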
Please check the dependencies page to see which programs are necessary for each handler.
Before beginning sequence_handling, make sure that your FASTQ samples have been merged (if individual samples are split across multiple files) and renamed. It will be much harder to merge and/or rename files later in the pipeline.
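One possible way to merge split samples is sketched below. It relies on the fact that gzip files concatenated with cat form a valid gzip file. The file naming (<sample>_L001.fastq.gz, <sample>_L002.fastq.gz, and so on) and the function name merge_lanes are assumptions for illustration only; adjust the pattern to your own data, and for paired-end reads merge forward and reverse files separately.

```shell
# Hypothetical helper, not part of sequence_handling.
# Assumes one gzipped FASTQ per lane, named <sample>_L<lane>.fastq.gz.
merge_lanes() {
    in_dir="$1"
    out_dir="$2"
    mkdir -p "$out_dir"
    for first in "$in_dir"/*_L001.fastq.gz; do
        # Skip the unexpanded pattern when no files match.
        [ -e "$first" ] || continue
        sample=$(basename "$first" _L001.fastq.gz)
        # The glob expands in sorted order, so lanes are joined in lane order.
        cat "$in_dir"/"$sample"_L*.fastq.gz > "$out_dir/$sample.fastq.gz"
    done
}
```

For example, merge_lanes raw_reads merged_reads would write one merged_reads/<sample>.fastq.gz per sample, where both directory names are placeholders.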
Take a look at the recommended workflow to familiarize yourself with the goals of this repository. To begin running the pipeline, you will need to:
- Install or locate the correct version of each of the dependencies. (MSI users simply need to have access to all of the required modules.)
- Create a list of your samples using sample_list_generator.sh.
- Fill out a configuration file for your project.
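To give a rough idea of what a sample list might look like, the sketch below builds one by hand with find. It is not sample_list_generator.sh and does not reproduce its interface; it assumes a list is a plain text file with one absolute file path per line, and the function name make_sample_list is hypothetical. Check the sample_list_generator.sh page for what the pipeline actually expects.

```shell
# Hypothetical stand-in for illustration; not the repository's script.
# Writes the absolute path of every file ending in SUFFIX under SEARCH_DIR
# to OUT_FILE, one path per line, sorted.
make_sample_list() {
    search_dir="$1"
    suffix="$2"
    out_file="$3"
    # Resolve the directory to an absolute path so the list works from any location.
    find "$(cd "$search_dir" && pwd)" -type f -name "*$suffix" | sort > "$out_file"
}
```

For example, make_sample_list merged_reads .fastq.gz my_samples.txt would list every .fastq.gz file under merged_reads; both names are placeholders.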
For the latest updates and to chat with our team, please join our Slack workspace: sequencehandling.slack.com.
For slides from a Does[0]Compute? discussion on validating files at each step in a sequencing pipeline and some of the commands/tools you may want to use, go here.
- Getting Started
- Recommended Workflow
- Configuration
- Dependencies
- sample_list_generator.sh
- Slurm specific options
- Common Problems and Errors