Read_Mapping
Read_Mapping starts a task array of QSub job submissions to the Portable Batch System job scheduler to map reads using the Burrows-Wheeler Aligner (BWA-MEM). It can also index a FastA file using BWA if the provided reference is not already indexed.
To run Read_Mapping, all common and handler-specific variables must be defined within the configuration file. Once the variables have been defined, Read_Mapping can be submitted to a job scheduler with the following command (assuming that you are in the directory containing sequence_handling):
./sequence_handling Read_Mapping Config
Where Config is the full file path to the configuration file.
The following is a list of variables that need to be defined within Config. In addition to the handler-specific variables, all common variables must be defined. The default parameters listed here are designed for cultivated barley. Parameters will need to be adjusted on a per-species basis.
Variable | Function | Default Value |
---|---|---|
RM_QSUB | QSub settings for batch submission. Recommended settings are "mem=22gb,nodes=1:ppn=16,walltime=24:00:00". Some samples may require more than the 24 hours allowed by lab, so the use of mesabi is necessary. For more information, see the FAQ. | |
TRIMMED_LIST | A list of adapter-trimmed or quality-trimmed samples to read map. This will be ${OUT_DIR}/Adapter_Trimming/${PROJECT}_trimmed_adapters.txt (Adapter_Trimming) or ${OUT_DIR}/Quality_Trimming/${PROJECT}_trimmed_quality.txt (Quality_Trimming). | |
FORWARD_TRIMMED | Shared suffix for forward reads. This will be _Forward_ScytheTrimmed.fastq.gz (Adapter_Trimming) or _R1_trimmed.fastq.gz (Quality_Trimming). | |
REVERSE_TRIMMED | Shared suffix for reverse reads. This will be _Reverse_ScytheTrimmed.fastq.gz (Adapter_Trimming) or _R2_trimmed.fastq.gz (Quality_Trimming). | |
SINGLES_TRIMMED | Shared suffix for single reads. This will be _Single_ScytheTrimmed.fastq.gz (Adapter_Trimming) or _single_trimmed.fastq.gz (Quality_Trimming). | |
THREADS | How many threads to use. | 8 |
SEED | Minimum seed length. | 8 |
WIDTH | Band width. | 100 |
DROPOFF | Off-diagonal x-dropoff (Z-dropoff). | 100 |
RE_SEED | Re-seed value. | 1.0 |
CUTOFF | Cutoff value. | 10000 |
MATCH | Matching score. | 1 |
MISMATCH | Mismatch penalty. | 4 |
GAP | Gap penalty. | 8 |
EXTENSION | Gap extension penalty. | 1 |
CLIP | Clipping penalty. | 6 |
UNPAIRED | Unpaired read penalty. | 9 |
RESCUE | Attempt to rescue missing hits in paired-end mode? Note: this means that reads may not be matched. | false |
INTERLEAVED | Is the first input query interleaved? | false |
RM_THRESHOLD | Minimum threshold. | 85 |
SECONDARY | Output all alignments and mark as secondary. | false |
APPEND | Append FastA/Q comments to SAM files. | false |
HARD | Use hard clipping. | false |
SPLIT | Mark split hits as secondary. | true |
VERBOSITY | Verbosity level. Choose from 'disabled', 'errors', 'warnings', 'all', or 'debug'. | 'all' |
Note: If running single-end samples, leave FORWARD_TRIMMED and REVERSE_TRIMMED filled with values that do not match your samples. If running paired-end samples, leave SINGLES_TRIMMED filled with values that do not match your samples.
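As a rough illustration, the Read_Mapping portion of Config for a paired-end project that went through Adapter_Trimming might look like the sketch below. It assumes that Config uses shell variable assignments. The values of OUT_DIR and PROJECT are hypothetical placeholders for common variables defined elsewhere in Config; the remaining values are the recommended settings and defaults listed in the table.

```shell
# Hypothetical placeholders for two common variables (defined elsewhere in Config)
OUT_DIR=/path/to/output
PROJECT=MyProject

# Scheduler settings recommended in the table
RM_QSUB="mem=22gb,nodes=1:ppn=16,walltime=24:00:00"

# Sample list and read suffixes as produced by Adapter_Trimming
TRIMMED_LIST="${OUT_DIR}/Adapter_Trimming/${PROJECT}_trimmed_adapters.txt"
FORWARD_TRIMMED=_Forward_ScytheTrimmed.fastq.gz
REVERSE_TRIMMED=_Reverse_ScytheTrimmed.fastq.gz
# Paired-end run: left at a suffix that matches none of the samples
SINGLES_TRIMMED=_Single_ScytheTrimmed.fastq.gz

# BWA-MEM parameters (defaults from the table, designed for cultivated barley)
THREADS=8
SEED=8
WIDTH=100
DROPOFF=100
RE_SEED=1.0
CUTOFF=10000
MATCH=1
MISMATCH=4
GAP=8
EXTENSION=1
CLIP=6
UNPAIRED=9
RESCUE=false
INTERLEAVED=false
RM_THRESHOLD=85
SECONDARY=false
APPEND=false
HARD=false
SPLIT=true
VERBOSITY='all'
```

For a single-end run, the same idea applies in reverse: SINGLES_TRIMMED would match the samples, while FORWARD_TRIMMED and REVERSE_TRIMMED would be left at values that do not.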
If your reference genome is not indexed, Read_Mapping generates the index files in the same directory as the reference genome. After indexing, Read_Mapping will exit, so you will need to run Read_Mapping a second time to map reads.
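To find out ahead of time whether the first run will only index, you can check for the index files yourself. The sketch below relies on the fact that `bwa index` writes files with the extensions .amb, .ann, .bwt, .pac, and .sa next to the reference. The function name has_bwa_index and the variable REFERENCE are hypothetical and are not part of sequence_handling; how Read_Mapping itself detects an index is not shown here.

```shell
# has_bwa_index: hypothetical helper, not part of sequence_handling.
# Returns 0 if all five BWA index files exist next to the reference, 1 otherwise.
has_bwa_index() {
    ref="$1"
    for ext in amb ann bwt pac sa; do
        # Each index file is named <reference>.<ext>
        [ -e "${ref}.${ext}" ] || return 1
    done
    return 0
}

# Usage sketch (REFERENCE is assumed to hold the path to the reference FastA):
# if has_bwa_index "${REFERENCE}"; then
#     echo "Index found; Read_Mapping should map reads on this run"
# else
#     echo "No index found; expect an indexing run first"
# fi
```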
Read_Mapping generates an aligned SAM file for each sample, located at ${OUT_DIR}/Read_Mapping/${SAMPLE}.sam, where ${OUT_DIR} is specified in the configuration file.
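The sketch below shows how a sample name, an output path, and a BWA-MEM command line could be put together for one paired-end sample. It is only an illustration: the mapping from configuration variables to `bwa mem` flags is an assumption based on the BWA manual rather than taken from the Read_Mapping script, variables without an unambiguous flag are left out, and the function names, paths, and reference file are hypothetical. The command is printed, not run.

```shell
# Hypothetical values; in practice these come from Config
OUT_DIR=/path/to/output
FORWARD_TRIMMED=_Forward_ScytheTrimmed.fastq.gz
REVERSE_TRIMMED=_Reverse_ScytheTrimmed.fastq.gz
THREADS=8; SEED=8; WIDTH=100; DROPOFF=100; RE_SEED=1.0; CUTOFF=10000
MATCH=1; MISMATCH=4; GAP=8; EXTENSION=1; CLIP=6; UNPAIRED=9; RM_THRESHOLD=85

# Derive the sample name by stripping the shared forward suffix from the file name
sample_name() {
    basename "$1" "${FORWARD_TRIMMED}"
}

# Print (rather than run) a bwa mem command for one paired-end sample.
# $1: forward read file, $2: indexed reference FastA
print_bwa_command() {
    forward="$1"
    reference="$2"
    sample="$(sample_name "${forward}")"
    # The reverse file is assumed to sit beside the forward file
    reverse="$(dirname "${forward}")/${sample}${REVERSE_TRIMMED}"
    # -M marks shorter split hits as secondary (SPLIT=true in the table)
    echo "bwa mem -t ${THREADS} -k ${SEED} -w ${WIDTH} -d ${DROPOFF} -r ${RE_SEED} -c ${CUTOFF}" \
         "-A ${MATCH} -B ${MISMATCH} -O ${GAP} -E ${EXTENSION} -L ${CLIP} -U ${UNPAIRED}" \
         "-T ${RM_THRESHOLD} -M ${reference} ${forward} ${reverse}" \
         "> ${OUT_DIR}/Read_Mapping/${sample}.sam"
}

# Usage sketch with hypothetical paths:
# print_bwa_command /data/Sample1_Forward_ScytheTrimmed.fastq.gz /ref/genome.fasta
```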
A list of files is not generated from Read_Mapping. However, you can generate one using sample_list_generator.sh.
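If you would rather build the list by hand, a plain `find` does a similar job. The sketch below is a stand-in using standard tools and does not reproduce how sample_list_generator.sh works; the function name list_sam_files and the output file name are hypothetical.

```shell
# list_sam_files: hypothetical helper, not sample_list_generator.sh.
# $1: directory to search. Prints the path of every *.sam file, sorted.
list_sam_files() {
    find "$1" -type f -name '*.sam' | sort
}

# Usage sketch (OUT_DIR and PROJECT come from Config; the list name is arbitrary):
# list_sam_files "${OUT_DIR}/Read_Mapping" > "${OUT_DIR}/Read_Mapping/${PROJECT}_sam_list.txt"
```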
Read_Mapping depends on the Burrows-Wheeler Aligner and the Portable Batch System to run. If you want to use a different job scheduler or read mapper, you will need to modify this script extensively. Future implementations of Read_Mapping using Bowtie 2 are under consideration. Please check the dependencies page to ensure that you are using the required version of each dependency.
Next: SAM_Processing