Adapter_Trimming
The Quality_Trimming.sh script trims samples based on quality to remove low-quality regions. It uses Sickle, Scythe, and Seqqs to perform the trimming, and it currently works only on paired-end data. To run Quality_Trimming.sh, all variables must be defined within the file itself: open Quality_Trimming.sh in a text editor and follow the instructions in the usage information section. Once the variables have been defined, Quality_Trimming.sh must be submitted to a job scheduler. The script is set up for PBS, so it is submitted with the following command:
qsub Quality_Trimming.sh
After the job has run, a list of trimmed FastQ files will be generated for use with read_mapping_start.sh. View the output file from the job submission to obtain the path to the list.
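
As a rough sketch of how the job's output file might be located: PBS normally writes standard output to a file named after the job, followed by `.o` and the job number, in the directory the job was submitted from. The helper name `latest_job_output` is hypothetical, and the job name `Quality_Trimming.sh` is an assumption (it holds only if the script does not set a different name with a `#PBS -N` directive).

```shell
#!/bin/bash
# Hypothetical helper: print the newest PBS output file for a given job name.
# Usage: latest_job_output <job_name> [directory]
latest_job_output() {
    local job_name="$1"
    local dir="${2:-.}"
    # ls -t lists the newest file first; the error is silenced when nothing matches
    ls -t "${dir}/${job_name}".o* 2>/dev/null | head -n 1
}

# Assumed job name; adjust if the script sets its own name with '#PBS -N'
output_file="$(latest_job_output Quality_Trimming.sh)"
if [ -n "${output_file}" ]; then
    cat "${output_file}"
else
    echo "No job output file found in the current directory" >&2
fi
```

Reading the printed file should reveal the path to the list of trimmed files.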
The following variables need to be defined within Quality_Trimming.sh:
Variable | Line | Function |
---|---|---|
| | 5 | Sets an email address for notifications of job status |
SEQUENCE_HANDLING | 66 | The full path to the directory in which sequence_handling is stored |
SAMPLE_INFO | 69 | A list of samples to trim |
FORWARD_NAMING | 75 | Extension for forward files |
REVERSE_NAMING | 76 | Extension for reverse files |
PROJECT | 79 | A name that describes the project you are working on |
SCRATCH | 82 | A directory that will hold results |
ADAPTERS | 85 | A plain text or fasta file with the adapter sequences |
PRIOR | 89 | A prior value for Scythe |
THRESHOLD | 94 | The threshold for quality trimming in Sickle |
PLATFORM | 97 | The platform used for sequencing. This can be found in the output files from Assess_Quality.sh |
R Definition | 100-102 | Define the path to an R installation or load it from a cluster |
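
For illustration, a filled-in subset of these variables might look like the sketch below. Every value is a placeholder chosen for the example (the paths, file extensions, project name, and numeric settings are assumptions, not defaults taken from Quality_Trimming.sh), and the `check_vars` helper is hypothetical. It simply reports any variable that was left empty before the script is submitted.

```shell
#!/bin/bash
# Placeholder values for illustration only; none are defaults from Quality_Trimming.sh
SAMPLE_INFO="${HOME}/project/raw_samples.txt"   # list of samples to trim
FORWARD_NAMING="_R1.fastq.gz"                   # extension for forward files
REVERSE_NAMING="_R2.fastq.gz"                   # extension for reverse files
PROJECT="example_project"                       # descriptive project name
SCRATCH="${HOME}/scratch"                       # directory that will hold results
ADAPTERS="${HOME}/project/adapters.fasta"       # adapter sequences
PRIOR="0.3"                                     # prior value for Scythe
THRESHOLD="20"                                  # quality threshold for Sickle
PLATFORM="sanger"                               # sequencing platform (assumed value)

# Hypothetical helper: report every named variable that is unset or empty
check_vars() {
    local missing=0
    local name
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable called $name
        if [ -z "${!name:-}" ]; then
            echo "Variable ${name} is not defined" >&2
            missing=1
        fi
    done
    return "${missing}"
}

check_vars SAMPLE_INFO FORWARD_NAMING REVERSE_NAMING PROJECT SCRATCH \
    ADAPTERS PRIOR THRESHOLD PLATFORM \
    && echo "All listed variables are defined"
```
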
Quality_Trimming.sh creates trimmed FastQ files for each sample. It also generates trimming statistics to help assess quality both before and after trimming, although running Assess_Quality.sh is still recommended for more complete quality assurance. In addition, a list of all trimmed files is output for use with other scripts.
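
Before passing the list of trimmed files to another script, it can be worth confirming that every file it names exists and is not empty. The sketch below assumes the list contains one file path per line, which is not confirmed here; the helper name `count_missing_files` and the example list path are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print how many files named in a list are missing or empty.
# Assumes one file path per line; blank lines are skipped.
count_missing_files() {
    local list="$1"
    local missing=0
    local path
    # The '|| [ -n ... ]' clause handles a final line without a trailing newline
    while IFS= read -r path || [ -n "${path}" ]; do
        [ -z "${path}" ] && continue
        if [ ! -s "${path}" ]; then
            echo "Missing or empty: ${path}" >&2
            missing=$((missing + 1))
        fi
    done < "${list}"
    echo "${missing}"
}

# Example call (hypothetical path):
# count_missing_files "${HOME}/scratch/example_project/trimmed_list.txt"
```

A result of 0 means every listed file was found with a non-zero size.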
Quality_Trimming.sh depends on Sickle, Scythe, and Seqqs to perform the trimming. These are not installed on MSI and must be installed separately. The installer.sh script can download and install these three programs from GitHub when passed the install argument. Furthermore, PBS and GNU Parallel are required to run the script at all. Finally, R is required for plotting trimming statistics.
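
A quick way to confirm that these programs are reachable before submitting the job is to test for each one on the `PATH`. The sketch below is only an illustration: the helper name `check_dependencies` is hypothetical, and the executable names (`sickle`, `scythe`, `seqqs`, `parallel`, `Rscript`, `qsub`) are assumptions that may differ on your system.

```shell
#!/bin/bash
# Hypothetical helper: report each named program that cannot be found on PATH.
# Returns non-zero if at least one program is missing.
check_dependencies() {
    local status=0
    local program
    for program in "$@"; do
        if ! command -v "${program}" > /dev/null 2>&1; then
            echo "Missing dependency: ${program}" >&2
            status=1
        fi
    done
    return "${status}"
}

# Executable names are assumptions; adjust them to match your installation
check_dependencies sickle scythe seqqs parallel Rscript qsub \
    || echo "Install or load the missing programs before submitting the job" >&2
```
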
Quality_Trimming.sh uses two helper scripts in the trimming process. fix_quality.sh uses Awk to adjust the quality scores given by sequencing centers to a more realistic value. plot_seqqs.R creates quality comparison plots for each sample; these plots are stored in PDF documents.
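
The sketch below does not reproduce fix_quality.sh, whose exact adjustment is not described here. It only illustrates the general idea of handling quality scores with Awk: it reads the quality line of each four-line FastQ record and prints the mean score per read. The helper name `mean_read_quality` is hypothetical, and the default offset of 33 (Phred+33 encoding) is an assumption that depends on the sequencing platform.

```shell
#!/bin/bash
# Hypothetical helper: print the mean quality score of each read in a FastQ file.
# Usage: mean_read_quality <fastq_file> [offset]   (offset defaults to 33)
mean_read_quality() {
    awk -v offset="${2:-33}" '
        # Build a lookup table from printable ASCII characters to their codes
        BEGIN { for (i = 33; i < 127; i++) ord[sprintf("%c", i)] = i }
        # Every fourth line of a FastQ record holds the quality string
        NR % 4 == 0 {
            total = 0
            n = length($0)
            for (j = 1; j <= n; j++) total += ord[substr($0, j, 1)] - offset
            if (n > 0) printf "%.2f\n", total / n
        }
    ' "$1"
}

# Example call (hypothetical file name):
# mean_read_quality sample_R1.fastq
```
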
Next: read_mapping_start.sh