CopyDetective is a novel algorithm for detection threshold aware CNV calling in matched whole-exome sequencing data.
Sarah Sandmann, Marius Wöste, Aniek O de Graaf, Birgit Burkhardt, Joop H Jansen, Martin Dugas, CopyDetective: Detection threshold–aware copy number variant calling in whole-exome sequencing data, GigaScience, Volume 9, Issue 11, November 2020, giaa118, https://doi.org/10.1093/gigascience/giaa118.
To run CopyDetective, you need R (Version 3.5.2 or higher) and shiny.
To install CopyDetective, just download the R-scripts. All required packages are automatically installed.
CopyDetective is available as a shiny GUI. You can run CopyDetective either by RStudio or by using shiny::runApp()
in an R console.
On the left, there are two tabs available for analysis:
- Perform analysis: actual CNV calling is performed
- Display results: display list of CNV calls and output plots for a selected sample
On the right, four output tabs are available:
- Log: progress of the analysis
- Detection thresholds: individual detection thresholds for every sample, including minimum CNV size and minimum cell fraction for deletions and duplications
- CNV calls: merged CNV calls; if filtration by means of quality is selected, the calls are filted based on the selected filtration threshold
- Plots: dependent on the selected options, up to four plots are displayed: 1) Raw CNV calls (all), 2) Raw CNV calls (sig), 3) Merged CNV calls, and 4) Filtered CNV calls
For the actual analysis with CopyDetective, several parameters are available.
Category | Parameter | Details |
---|---|---|
Input | Input folder | Path to input files; input files must cover information on heterozygous polymorphisms called for every sample, including VAF and coverage of these polymorphisms in tumor and matching germline sample |
Sample names file | A 2-column txt-file is required; the germline sample names must be defined in the first column, the matching tumor sample names in the seocnd column (no header) | |
Output | Output folder | Path to which output files will be written |
Output files | The following can be selected: Raw CNV calls, Merged CNV calls, Filtered CNV calls | |
Output plots | The following can be selected: Raw CNV calls (all), Raw CNV calls (sig), Merged CNV calls, Filtered CNV calls | |
1. Quality Analysis | Detection thresholds available? | Select either yes or no |
yes: Upload detection thresholds file | A txt-file containing the columns: Sample, Window_del, CF_del, SNPs_del, Window_dup, CF_dup and SNPs_dup; an example is available in the example folder | |
no: How shall Quality Analysis be performed? | Select either Exact or Simulation | |
no and Exact: VAF for polymorphisms in the control sample | Select either Exact (variant calling results) or Simulated (expected value 0.5) | |
no and Simulation: Number of simulations | 10-99999; default: 500 | |
no: Evaluate Cell Fractions: Fractions to consider | 1-100; default: 5-100 | |
no: Evaluate Cell Fractions: In steps of... | 1-100; default: 5 | |
no: Evaluate Window sizes: Percentile to consider for window selection | 0-100; default: 95 | |
no: Strategy for detection threshold optimization | Select one of the following: Compromize (cell fraction and window size), Force (cell fraction), Force (window size) | |
no: Force (cell fraction) | Minimum (possible) cell fraction to consider | |
no: Force (window size) | Minimum (possible) window size to consider | |
2. CNV calling | Desired sensitivity | 0-1; default: 0.95 |
3. Merging | Maximum allowed distance between raw calls | 100,000-59128,983; default: 20,000,000 |
4. Filtration | Perform final filtration? | Select either yes or no |
yes: Quality threshold | 0-600; default: 10.76 |
An example, containing all necessary input for two patients can be found in the Example folder. Corresponding output files are provided in the subfolder output.
If you should experience any errors using CopyDetective, or you are looking for some additional features, feel free to open an issue or to contact Sarah Sandmann ( sarah.sandmann@uni-muenster.de ).
Sarah Sandmann, Marius Wöste, Aniek O de Graaf, Birgit Burkhardt, Joop H Jansen, Martin Dugas, CopyDetective: Detection threshold–aware copy number variant calling in whole-exome sequencing data, GigaScience, Volume 9, Issue 11, November 2020, giaa118, https://doi.org/10.1093/gigascience/giaa118.