pVACFuse: KeyError: '25.GCN1-MSI1.ENST00000300648.7-ENST00000257552.7.inframe_fusion.32' #1125

Closed
MaxMichaeler opened this issue Jul 11, 2024 · 1 comment · Fixed by #1130

MaxMichaeler commented Jul 11, 2024

Installation Type

Standalone

pVACtools Version / Docker Image

4.2.1

Python Version

3.9.18

Operating System

Linux

Describe the bug

I'm trying to run pVACFuse in a Nextflow pipeline using a Singularity image and each time I run it I keep getting the error:

Traceback (most recent call last):
  File "/opt/conda/bin/pvacfuse", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.9/site-packages/pvactools/tools/pvacfuse/main.py", line 108, in main
    args[0].func.main(args[1])
  File "/opt/conda/lib/python3.9/site-packages/pvactools/tools/pvacfuse/run.py", line 245, in main
    create_net_class_report(output_files, all_epitopes_file, filtered_file, args, run_arguments)
  File "/opt/conda/lib/python3.9/site-packages/pvactools/tools/pvacfuse/run.py", line 42, in create_net_class_report
    PostProcessor(**post_processing_params).execute()
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/post_processor.py", line 65, in execute
    self.calculate_reference_proteome_similarity()
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/post_processor.py", line 252, in calculate_reference_proteome_similarity
    CalculateReferenceProteomeSimilarity(
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 595, in execute
    unique_peptides = pymp.shared.list(self._get_unique_peptides(mt_records_dict, wt_records_dict))
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 575, in _get_unique_peptides
    peptide, full_peptide = self._get_peptide(line, mt_records_dict, wt_records_dict)
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 283, in _get_peptide
    peptide = mt_records_dict[line['ID']]
KeyError: '25.GCN1-MSI1.ENST00000300648.7-ENST00000257552.7.inframe_fusion.32'

I've tried looking at similar issues, but they all seem to relate to some bug. My input is an Arriba .tsv file.
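
For anyone hitting a similar failure: the last frame of the traceback is a plain dictionary lookup keyed on the report's ID column, so the error means a row ID has no matching entry in the records dictionary. Below is a minimal sketch of how one might list such IDs before the lookup fails. It is not pVACtools code. The helper names `read_fasta_ids` and `find_missing_ids`, the assumption that the FASTA header line is the full ID, and the toy data are all hypothetical; only the column name `ID` comes from the traceback.

```python
import csv
import io


def read_fasta_ids(fasta_text):
    """Return the set of record IDs (header text after '>') in FASTA text.

    Assumption: the whole header line is the ID.
    """
    return {
        line[1:].strip()
        for line in fasta_text.splitlines()
        if line.startswith(">")
    }


def find_missing_ids(report_tsv_text, fasta_ids, id_column="ID"):
    """List report IDs, in first-seen order, that have no FASTA record."""
    reader = csv.DictReader(io.StringIO(report_tsv_text), delimiter="\t")
    missing = []
    for row in reader:
        record_id = row[id_column]
        if record_id not in fasta_ids and record_id not in missing:
            missing.append(record_id)
    return missing


# Toy data, invented for illustration only.
fasta = ">1.GENEA-GENEB.inframe_fusion.10\nMKTAYIAK\n"
report = (
    "ID\tEpitope Seq\n"
    "1.GENEA-GENEB.inframe_fusion.10\tMKTAYIAK\n"
    "2.GENEC-GENED.inframe_fusion.5\tAAAAAAAA\n"
)
print(find_missing_ids(report, read_fasta_ids(fasta)))
```

Any ID this prints for real files would be one that raises the same KeyError in a direct `dict[...]` lookup.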

How to reproduce this bug

export BLASTDB=/opt/resourceDir/blast_db
export MHCFLURRY_DATA_DIR=/opt/pvacfuse/mhcflurry_data
blastp_path=`which blastp`

pvacfuse run \
    -e1 8,9,10,11 \
    -e2 15,16,17,18,19,20,21,22,23,24,25 \
    --iedb-install-directory /opt/pvacfuse/iedb_data \
    --run-reference-proteome-similarity \
    -t 6 \
    --blastp-path $blastp_path \
    --blastp-db refseq_select_prot \
    -s 100 \
    -d 200 \
     \
    SRR1107833.fusions.tsv \
    SRR1107833 \
    DQB1*06:02 \
    DeepImmuno MHCflurry MHCflurryEL MHCnuggetsI MHCnuggetsII NNalign NetMHC NetMHCIIpan NetMHCIIpanEL NetMHCpan SMM SMMPMBEC \
    ./

Input files

GitHub doesn't allow .tsv attachments, so the file is uploaded as .txt; it should be renamed to .tsv before running.

SRR1107833.fusions.txt

Log output

All prerequisites found!
Copying the standalone-specific netMHCcons template into place
IEDB MHC class I binding prediction tools successfully installed!
Use the command 'python src/predict_binding.py' to get started
/data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5
No MHC class I alleles chosen. Skipping MHC class II predictions.
Executing MHC Class II predictions
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-97
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 15 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/15/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.15.tsv_1-97
Making binding predictions on Allele DQB1*06:02 and Epitope Length 15 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/15/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.15.tsv_1-97 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 15 - Entries 1-97
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 15 - Entries 1-97 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-97
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 16 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/16/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.16.tsv_1-97
Making binding predictions on Allele DQB1*06:02 and Epitope Length 16 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/16/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.16.tsv_1-97 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 16 - Entries 1-97
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 16 - Entries 1-97 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-97
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 17 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/17/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.17.tsv_1-97
Making binding predictions on Allele DQB1*06:02 and Epitope Length 17 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/17/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.17.tsv_1-97 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 17 - Entries 1-97
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 17 - Entries 1-97 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-97
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 18 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/18/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.18.tsv_1-97
Making binding predictions on Allele DQB1*06:02 and Epitope Length 18 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/18/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.18.tsv_1-97 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 18 - Entries 1-97
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 18 - Entries 1-97 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-98
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 19 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/19/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.19.tsv_1-98
Making binding predictions on Allele DQB1*06:02 and Epitope Length 19 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/19/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.19.tsv_1-98 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 19 - Entries 1-98
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 19 - Entries 1-98 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-98
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 20 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/20/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.20.tsv_1-98
Making binding predictions on Allele DQB1*06:02 and Epitope Length 20 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/20/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.20.tsv_1-98 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 20 - Entries 1-98
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 20 - Entries 1-98 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-98
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 21 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/21/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.21.tsv_1-98
Making binding predictions on Allele DQB1*06:02 and Epitope Length 21 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/21/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.21.tsv_1-98 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 21 - Entries 1-98
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 21 - Entries 1-98 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-98
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 22 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/22/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.22.tsv_1-98
Making binding predictions on Allele DQB1*06:02 and Epitope Length 22 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/22/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.22.tsv_1-98 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 22 - Entries 1-98
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 22 - Entries 1-98 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-98
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 23 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/23/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.23.tsv_1-98
Making binding predictions on Allele DQB1*06:02 and Epitope Length 23 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/23/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.23.tsv_1-98 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 23 - Entries 1-98
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 23 - Entries 1-98 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-97
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 24 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/24/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.24.tsv_1-97
Making binding predictions on Allele DQB1*06:02 and Epitope Length 24 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/24/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.24.tsv_1-97 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 24 - Entries 1-97
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 24 - Entries 1-97 - Completed
Combining Parsed Prediction Files
Completed
Converting Fusion file to TSV
Completed
Generating Variant Peptide FASTA and Key File
Completed
Parsing the Variant Peptide FASTA and Key File
Completed
Calculating Manufacturability Metrics
Completed
Splitting FASTA into smaller chunks
Splitting FASTA into smaller chunks - Entries 1-97
Completed
Allele DQB1*06:02 not valid for Method NNalign. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpan. Skipping.
Allele DQB1*06:02 not valid for Method NetMHCIIpanEL. Skipping.
Making binding predictions on Allele DQB1*06:02 and Epitope Length 25 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/25/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.25.tsv_1-97
Making binding predictions on Allele DQB1*06:02 and Epitope Length 25 with Method MHCnuggetsII - File /data/scratch/work/8c/7131152b3ed15e50a0d492e43f39b5/MHC_Class_II/25/tmp/SRR1107833.MHCnuggetsII.DQB1*06:02.25.tsv_1-97 - Completed
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 25 - Entries 1-97
Parsing prediction file for Allele DQB1*06:02 and Epitope Length 25 - Entries 1-97 - Completed
Combining Parsed Prediction Files
Completed
Creating aggregated report
Completed
Calculating Manufacturability Metrics
Completed
Running Binding Filters
Completed
Running Coverage Filters
Completed
Running Top Score Filter
Completed
Calculating Reference Proteome Similarity
Traceback (most recent call last):
  File "/opt/conda/bin/pvacfuse", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.9/site-packages/pvactools/tools/pvacfuse/main.py", line 108, in main
    args[0].func.main(args[1])
  File "/opt/conda/lib/python3.9/site-packages/pvactools/tools/pvacfuse/run.py", line 245, in main
    create_net_class_report(output_files, all_epitopes_file, filtered_file, args, run_arguments)
  File "/opt/conda/lib/python3.9/site-packages/pvactools/tools/pvacfuse/run.py", line 42, in create_net_class_report
    PostProcessor(**post_processing_params).execute()
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/post_processor.py", line 65, in execute
    self.calculate_reference_proteome_similarity()
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/post_processor.py", line 252, in calculate_reference_proteome_similarity
    CalculateReferenceProteomeSimilarity(
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 595, in execute
    unique_peptides = pymp.shared.list(self._get_unique_peptides(mt_records_dict, wt_records_dict))
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 575, in _get_unique_peptides
    peptide, full_peptide = self._get_peptide(line, mt_records_dict, wt_records_dict)
  File "/opt/conda/lib/python3.9/site-packages/pvactools/lib/calculate_reference_proteome_similarity.py", line 283, in _get_peptide
    peptide = mt_records_dict[line['ID']]
KeyError: '25.GCN1-MSI1.ENST00000300648.7-ENST00000257552.7.inframe_fusion.32'

Output files

No response

@susannasiebert
Contributor

This issue should be fixed in version 4.3.0. Please give it a try and reopen this issue if you're still getting this error in version 4.3.0. You will need to run from scratch in order for the fix to take effect.
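
Before rerunning, it may help to confirm which release is actually installed inside the image. The following is a rough sketch, assuming the distribution is registered under the name `pvactools` and uses plain numeric release numbers; the helper names are made up for illustration, and pre-release suffixes are ignored rather than ordered properly.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn a release string such as '4.2.1' into a 3-tuple of ints."""
    numbers = [int(n) for n in re.findall(r"\d+", text)[:3]]
    numbers += [0] * (3 - len(numbers))  # pad '4.3' to (4, 3, 0)
    return tuple(numbers)


def has_fix(installed, fixed_in="4.3.0"):
    """True if the installed release is at least the release with the fix."""
    return parse_version(installed) >= parse_version(fixed_in)


def installed_pvactools_version():
    """Return the installed version string, or None if it is not installed.

    Assumption: the distribution name is 'pvactools'.
    """
    try:
        return metadata.version("pvactools")
    except metadata.PackageNotFoundError:
        return None


version = installed_pvactools_version()
if version is None:
    print("pvactools is not installed in this environment")
elif has_fix(version):
    print(f"{version} is 4.3.0 or later")
else:
    print(f"{version} predates 4.3.0")
```

Since the fix only takes effect on a fresh run, the check is most useful when executed inside the same container the pipeline uses, followed by a run into an empty output directory.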
