SAM_Processing
There are two scripts for processing SAM files: SAM_Processing_SAMTools.sh and SAM_Processing_Picard.sh. Currently, only SAM_Processing_SAMTools.sh is operational, so it is the only one demonstrated here.
The SAM_Processing_SAMTools.sh script sorts, deduplicates, adds read groups to, and merges the SAM files produced by read_mapping_start.sh into one finished BAM file. The script uses SAMTools for all processing of the SAM files and also creates alignment statistics using the samstat function of SAMTools. To run SAM_Processing_SAMTools.sh, all variables must be defined within the file itself: open SAM_Processing_SAMTools.sh in your favorite text editor and follow the instructions in the usage information section. Once the variables have been defined, SAM_Processing_SAMTools.sh needs to be submitted to a job scheduler. The script is set up for PBS and is submitted with the following command:
qsub SAM_Processing_SAMTools.sh
After the job has run, a list of sorted, deduplicated, and read-grouped BAM files is generated in addition to the merged BAM file. Read the output file generated from the job submission to get the path to the list.
The following variables need to be defined within SAM_Processing_SAMTools.sh:
Variable | Line | Function |
---|---|---|
Email | 5 | Sets an email address for notifications of job status |
SAMTools Definition | 43-45 | Define the path to the SAMTools installation or load it from a cluster |
SAMPLE_INFO | 48 | A list of SAM files to process |
REF_GEN | 51 | A reference sequence used in the sorting process |
SCRATCH | 54 | A directory that will hold results |
PROJECT | 57 | A name that describes the project you are working on |
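As a rough illustration of the table above, the definitions inside the script might look something like the following. This is only a sketch: every path and the project name are hypothetical placeholders, and the `check_vars` helper is not part of the script. It is added here to show one way of confirming that nothing was left empty before submitting the job.

```shell
#!/bin/sh
# Hypothetical values; replace each one with a path that exists on your system.
SAMPLE_INFO="/path/to/sam_list.txt"       # a list of SAM files to process
REF_GEN="/path/to/reference.fasta"        # reference sequence used when sorting
SCRATCH="/path/to/scratch"                # directory that will hold results
PROJECT="my_project"                      # name that describes the project

# Hypothetical helper (not part of the script): report any variable that is
# still empty. Returns non-zero if at least one variable is undefined.
check_vars() {
    missing=0
    for name in SAMPLE_INFO REF_GEN SCRATCH PROJECT; do
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "Variable ${name} is not defined" >&2
            missing=1
        fi
    done
    return "${missing}"
}

check_vars && echo "All variables defined for ${PROJECT}"
```

The email address and the SAMTools definition are left out of this sketch because their exact form depends on the scheduler and cluster setup.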
SAM_Processing_SAMTools.sh creates sorted, deduplicated BAM files that have read groups marked. It also generates a merged BAM file for other tasks such as variant calling. Alignment statistics are generated for all input SAM files and for the finished, but unmerged, BAM files. Finally, a list of all finished, but unmerged, BAM files is generated.
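To make the sequence of steps more concrete, the dry-run sketch below prints one plausible set of SAMTools commands for a single sample and for the final merge. It does not reproduce the script itself. The subcommands shown (`flagstat`, `sort`, `rmdup`, `addreplacerg`, `merge`) are an assumption about how the steps could be carried out with SAMTools 1.x, with `flagstat` standing in for the statistics step. The file-name suffixes, read group fields, and function names are hypothetical.

```shell
#!/bin/sh
# Dry run: print the commands for one sample instead of executing them.
# Assumptions: SAMTools 1.x syntax; suffixes and read group fields are hypothetical.
print_sample_steps() {
    sam="$1"                          # input SAM file
    out="$2"                          # directory for results
    name=$(basename "${sam}" .sam)    # sample name taken from the file name
    echo "samtools flagstat ${sam} > ${out}/${name}_raw_stats.txt"
    echo "samtools sort -o ${out}/${name}_sorted.bam ${sam}"
    echo "samtools rmdup ${out}/${name}_sorted.bam ${out}/${name}_dedup.bam"
    echo "samtools addreplacerg -r ID:${name} -r SM:${name} -o ${out}/${name}_finished.bam ${out}/${name}_dedup.bam"
    echo "samtools flagstat ${out}/${name}_finished.bam > ${out}/${name}_finished_stats.txt"
}

# Print the merge command that combines all finished BAM files of a project.
print_merge_step() {
    out="$1"; project="$2"; shift 2
    echo "samtools merge ${out}/${project}_merged.bam $*"
}

print_sample_steps /path/to/sampleA.sam /path/to/scratch
print_merge_step /path/to/scratch my_project \
    /path/to/scratch/sampleA_finished.bam /path/to/scratch/sampleB_finished.bam
```

Printing the commands first makes it easy to check file names and ordering before any real data is touched.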
SAM_Processing_SAMTools.sh depends on SAMTools for all processing and for generating the alignment statistics. In addition, PBS and GNU Parallel are required to run the script.
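Before submitting the job, it can help to confirm that these tools are reachable. The snippet below is a minimal sketch that assumes the executables are named `samtools`, `parallel`, and `qsub` and are expected on `PATH`. On clusters that provide software through modules, the check would need to run after the relevant modules are loaded. The `missing_tools` function is a hypothetical helper, not part of the script.

```shell
#!/bin/sh
# Hypothetical helper: print the name of each required tool not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "${tool}" >/dev/null 2>&1 || echo "${tool}"
    done
}

missing=$(missing_tools samtools parallel qsub)
if [ -n "${missing}" ]; then
    # One line per missing tool
    echo "Missing dependencies:" >&2
    echo "${missing}" >&2
else
    echo "All dependencies found"
fi
```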
Next: Coverage_Map.sh