Merge remote-tracking branch 'origin/hotfix'
susannasiebert committed Jun 8, 2023
2 parents f3df76d + 07f57b2 commit 927fd52
Showing 12 changed files with 228 additions and 8 deletions.
1 change: 1 addition & 0 deletions .github/workflows/tests.yml
@@ -67,6 +67,7 @@ jobs:
         sudo apt-get install -y ghostscript
         sudo apt-get install -y gcc
         sudo apt-get install -y pandoc
+        pip install setuptools==57
     - name: Install Python dependencies
       run: |
         pip install -r requirements.txt
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -69,7 +69,7 @@
 # The short X.Y version.
 version = '3.1'
 # The full version, including alpha/beta/rc tags.
-release = '3.1.2'
+release = '3.1.3'


 # The language for content autogenerated by Sphinx. Refer to documentation
6 changes: 2 additions & 4 deletions docs/index.rst
@@ -56,10 +56,8 @@ New in Release |release|

 This is a bugfix release. It fixes the following problem(s):

-- It fixes an issue with parsing class II IEDB output files when running
-  pVACfuse or pVACbind, which resulted in the wrong binding prediction scores
-  being associated with certain epitopes.
-- It adds missing import statements to pVACvector.
+- It fixes an issue with the reference proteome match step where stop lost
+  mutations would throw a fatal error.

 New in Version |version|
 ------------------------
8 changes: 8 additions & 0 deletions docs/releases/3_1.rst
@@ -45,3 +45,11 @@ This is a bugfix release. It fixes the following problem(s):
   pVACfuse or pVACbind, which resulted in the wrong binding prediction scores
   being associated with certain epitopes.
 - It adds missing import statements to pVACvector.
+
+Version 3.1.3
+-------------
+
+This is a bugfix release. It fixes the following problem(s):
+
+- It fixes an issue with the reference proteome match step where stop lost
+  mutations would throw a fatal error.
9 changes: 7 additions & 2 deletions pvactools/lib/calculate_reference_proteome_similarity.py
@@ -201,9 +201,14 @@ def extract_n_mer_from_fs(self, full_peptide, wt_peptide, epitope, subpeptide_po
         #This catches cases where the start position would cause too many leading wildtype amino acids, which would result
         #in false-positive reference matches
         if len(full_peptide) > len(wt_peptide):
-            diff_position = [i for i in range(len(wt_peptide)) if wt_peptide[i] != full_peptide[i]][0]
+            diffs = [i for i in range(len(wt_peptide)) if wt_peptide[i] != full_peptide[i]]
+            if diffs == []:
+                diffs = [len(wt_peptide)]
         else:
-            diff_position = [i for i in range(len(full_peptide)) if wt_peptide[i] != full_peptide[i]][0]
+            diffs = [i for i in range(len(full_peptide)) if wt_peptide[i] != full_peptide[i]]
+            if diffs == []:
+                diffs = [len(full_peptide)]
+        diff_position = diffs[0]
         min_start = diff_position - self.match_length + 1
         if min_start > start:
             start = min_start
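The hunk above can be read as follows: when the wildtype peptide is a complete prefix of the mutant peptide (as in the `input_wt_in_mt.fasta` fixture in this commit), the list of differing positions is empty, so the old `[...][0]` lookup raised an `IndexError`. The fix falls back to the length of the shorter peptide. Below is a minimal standalone sketch of that logic, not the project's actual method: the function names `first_diff_position` and `clamp_start` are hypothetical, and the `match_length` value of 8 is an assumption chosen only for illustration.

```python
def first_diff_position(wt_peptide: str, full_peptide: str) -> int:
    """Index of the first residue at which the two peptides differ.

    If the shorter peptide is a prefix of the longer one, there is no
    mismatch inside the overlap, so fall back to the overlap length
    (the first position past the shared prefix).
    """
    overlap = min(len(wt_peptide), len(full_peptide))
    diffs = [i for i in range(overlap) if wt_peptide[i] != full_peptide[i]]
    if not diffs:
        diffs = [overlap]
    return diffs[0]


def clamp_start(start: int, diff_position: int, match_length: int) -> int:
    """Move the start forward so that a window of match_length residues
    beginning there reaches at least up to diff_position."""
    min_start = diff_position - match_length + 1
    return max(start, min_start)


# Sequences taken from the FASTA fixture added in this commit.
wt = "DVTKPVPHLRL"
mt = "DVTKPVPHLRLLIAFLKPPALISPSPPVWA"

pos = first_diff_position(wt, mt)
print(pos)                      # 11: WT is a full prefix of MT
print(clamp_start(0, pos, 8))   # 4, with the assumed match_length of 8
```

Using `min(len(...), len(...))` collapses the two branches of the original `if`/`else` into one expression; the behavior is the same because each branch iterates over the shorter of the two peptides.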
2 changes: 1 addition & 1 deletion setup.py
@@ -44,7 +44,7 @@

 setup(
     name="pvactools",
-    version="3.1.2",
+    version="3.1.3",
     packages=[
         "pvactools.tools",
         "pvactools.tools.pvacbind",
18 changes: 18 additions & 0 deletions tests/test_calculate_reference_proteome_similarity.py
@@ -39,6 +39,24 @@ def test_calculate_self_similarity(self):
             os.remove(metric_file)
             close_mock_fhs()

+    def test_wt_peptide_fully_in_mt_peptide(self):
+        with unittest.mock.patch('Bio.Blast.NCBIWWW.qblast', side_effect=mock_ncbiwww_qblast):
+            input_file = os.path.join(self.test_data_dir, 'input_wt_in_mt.tsv')
+            input_fasta = os.path.join(self.test_data_dir, 'input_wt_in_mt.fasta')
+            output_file = tempfile.NamedTemporaryFile()
+            metric_file = "{}.reference_matches".format(output_file.name)
+            self.assertFalse(CalculateReferenceProteomeSimilarity(input_file, input_fasta, output_file.name).execute())
+            self.assertTrue(cmp(
+                output_file.name,
+                os.path.join(self.test_data_dir, "output_wt_in_mt.tsv"),
+            ))
+            self.assertTrue(cmp(
+                metric_file,
+                os.path.join(self.test_data_dir, "output_wt_in_mt.tsv.reference_matches"),
+            ))
+            os.remove(metric_file)
+            close_mock_fhs()
+
     def test_blastp_db_incompatible_with_species(self):
         with self.assertRaises(Exception) as context:
             input_file = os.path.join(self.test_data_dir, 'input.tsv')
@@ -0,0 +1,181 @@
<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">
<BlastOutput>
<BlastOutput_program>blastp</BlastOutput_program>
<BlastOutput_version>BLASTP 2.14.1+</BlastOutput_version>
<BlastOutput_reference>Stephen F. Altschul, Thomas L. Madden, Alejandro A. Sch&amp;auml;ffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), &quot;Gapped BLAST and PSI-BLAST: a new generation of protein database search programs&quot;, Nucleic Acids Res. 25:3389-3402.</BlastOutput_reference>
<BlastOutput_db>refseq_select_prot</BlastOutput_db>
<BlastOutput_query-ID>Query_62973</BlastOutput_query-ID>
<BlastOutput_query-def>unnamed protein product</BlastOutput_query-def>
<BlastOutput_query-len>30</BlastOutput_query-len>
<BlastOutput_param>
<Parameters>
<Parameters_matrix>BLOSUM62</Parameters_matrix>
<Parameters_expect>10</Parameters_expect>
<Parameters_gap-open>32767</Parameters_gap-open>
<Parameters_gap-extend>32767</Parameters_gap-extend>
<Parameters_filter>F</Parameters_filter>
</Parameters>
</BlastOutput_param>
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_query-ID>Query_62973</Iteration_query-ID>
<Iteration_query-def>unnamed protein product</Iteration_query-def>
<Iteration_query-len>30</Iteration_query-len>
<Iteration_hits>
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>ref|NP_258260.1|</Hit_id>
<Hit_def>F-BAR and double SH3 domains protein 1 [Homo sapiens]</Hit_def>
<Hit_accession>NP_258260</Hit_accession>
<Hit_len>690</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>23.9769</Hsp_bit-score>
<Hsp_score>46</Hsp_score>
<Hsp_evalue>2.06221</Hsp_evalue>
<Hsp_query-from>15</Hsp_query-from>
<Hsp_query-to>27</Hsp_query-to>
<Hsp_hit-from>286</Hsp_hit-from>
<Hsp_hit-to>298</Hsp_hit-to>
<Hsp_query-frame>0</Hsp_query-frame>
<Hsp_hit-frame>0</Hsp_hit-frame>
<Hsp_identity>7</Hsp_identity>
<Hsp_positive>10</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>13</Hsp_align-len>
<Hsp_qseq>FLKPPALISPSPP</Hsp_qseq>
<Hsp_hseq>FLQEPGVFSPTPP</Hsp_hseq>
<Hsp_midline>FL+ P + SP+PP</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
<Hit>
<Hit_num>2</Hit_num>
<Hit_id>ref|NP_001277116.1|</Hit_id>
<Hit_def>protein KRBA1 isoform 2 [Homo sapiens]</Hit_def>
<Hit_accession>NP_001277116</Hit_accession>
<Hit_len>1064</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>23.5187</Hsp_bit-score>
<Hsp_score>45</Hsp_score>
<Hsp_evalue>3.98271</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from>
<Hsp_query-to>26</Hsp_query-to>
<Hsp_hit-from>695</Hsp_hit-from>
<Hsp_hit-to>720</Hsp_hit-to>
<Hsp_query-frame>0</Hsp_query-frame>
<Hsp_hit-frame>0</Hsp_hit-frame>
<Hsp_identity>11</Hsp_identity>
<Hsp_positive>15</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>26</Hsp_align-len>
<Hsp_qseq>DVTKPVPHLRLLIAFLKPPALISPSP</Hsp_qseq>
<Hsp_hseq>DLWKPLPQERDRLPSCKPPVPLSPCP</Hsp_hseq>
<Hsp_midline>D+ KP+P R + KPP +SP P</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
<Hit>
<Hit_num>3</Hit_num>
<Hit_id>ref|NP_055858.2|</Hit_id>
<Hit_def>TBC1 domain family member 9B isoform b [Homo sapiens]</Hit_def>
<Hit_accession>NP_055858</Hit_accession>
<Hit_len>1233</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>23.0605</Hsp_bit-score>
<Hsp_score>44</Hsp_score>
<Hsp_evalue>4.03719</Hsp_evalue>
<Hsp_query-from>2</Hsp_query-from>
<Hsp_query-to>19</Hsp_query-to>
<Hsp_hit-from>740</Hsp_hit-from>
<Hsp_hit-to>757</Hsp_hit-to>
<Hsp_query-frame>0</Hsp_query-frame>
<Hsp_hit-frame>0</Hsp_hit-frame>
<Hsp_identity>8</Hsp_identity>
<Hsp_positive>12</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>18</Hsp_align-len>
<Hsp_qseq>VTKPVPHLRLLIAFLKPP</Hsp_qseq>
<Hsp_hseq>VSPPIPHLRALLSSSDDP</Hsp_hseq>
<Hsp_midline>V+ P+PHLR L++ P</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
<Hit>
<Hit_num>4</Hit_num>
<Hit_id>ref|NP_112235.2|</Hit_id>
<Hit_def>mediator of RNA polymerase II transcription subunit 25 isoform 1 [Homo sapiens]</Hit_def>
<Hit_accession>NP_112235</Hit_accession>
<Hit_len>747</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>22.6023</Hsp_bit-score>
<Hsp_score>43</Hsp_score>
<Hsp_evalue>6.13634</Hsp_evalue>
<Hsp_query-from>4</Hsp_query-from>
<Hsp_query-to>27</Hsp_query-to>
<Hsp_hit-from>186</Hsp_hit-from>
<Hsp_hit-to>209</Hsp_hit-to>
<Hsp_query-frame>0</Hsp_query-frame>
<Hsp_hit-frame>0</Hsp_hit-frame>
<Hsp_identity>11</Hsp_identity>
<Hsp_positive>14</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>24</Hsp_align-len>
<Hsp_qseq>KPVPHLRLLIAFLKPPALISPSPP</Hsp_qseq>
<Hsp_hseq>RKLPALRLLFEKAAPPALLEPLQP</Hsp_hseq>
<Hsp_midline>+ +P LRLL PPAL+ P P</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
<Hit>
<Hit_num>5</Hit_num>
<Hit_id>ref|NP_055977.3|</Hit_id>
<Hit_def>long-chain-fatty-acid--CoA ligase ACSBG1 isoform 1 [Homo sapiens]</Hit_def>
<Hit_accession>NP_055977</Hit_accession>
<Hit_len>724</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>22.6023</Hsp_bit-score>
<Hsp_score>43</Hsp_score>
<Hsp_evalue>6.58373</Hsp_evalue>
<Hsp_query-from>2</Hsp_query-from>
<Hsp_query-to>19</Hsp_query-to>
<Hsp_hit-from>222</Hsp_hit-from>
<Hsp_hit-to>239</Hsp_hit-to>
<Hsp_query-frame>0</Hsp_query-frame>
<Hsp_hit-frame>0</Hsp_hit-frame>
<Hsp_identity>6</Hsp_identity>
<Hsp_positive>13</Hsp_positive>
<Hsp_gaps>0</Hsp_gaps>
<Hsp_align-len>18</Hsp_align-len>
<Hsp_qseq>VTKPVPHLRLLIAFLKPP</Hsp_qseq>
<Hsp_hseq>IWKQLPHLKAVVIYKEPP</Hsp_hseq>
<Hsp_midline>+ K +PHL+ ++ + +PP</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
</Iteration_hits>
<Iteration_stat>
<Statistics>
<Statistics_db-num>19390</Statistics_db-num>
<Statistics_db-len>11240616</Statistics_db-len>
<Statistics_hsp-len>0</Statistics_hsp-len>
<Statistics_eff-space>0</Statistics_eff-space>
<Statistics_kappa>0.134</Statistics_kappa>
<Statistics_lambda>0.3176</Statistics_lambda>
<Statistics_entropy>0.4012</Statistics_entropy>
</Statistics>
</Iteration_stat>
</Iteration>
</BlastOutput_iterations>
</BlastOutput>
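For readers who want to inspect a BLAST XML fixture like the one above, here is a minimal sketch using only the Python standard library. It is an illustration and not how pVACtools processes BLAST results: the helper names `longest_identical_run` and `summarize_hits` are hypothetical, and the embedded sample is a trimmed copy of the first hit from the fixture. It reports, per HSP, the longest stretch of consecutive identical aligned residues.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the first <Hit> from the fixture above (illustrative only).
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>ref|NP_258260.1|</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_identity>7</Hsp_identity>
              <Hsp_qseq>FLKPPALISPSPP</Hsp_qseq>
              <Hsp_hseq>FLQEPGVFSPTPP</Hsp_hseq>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def longest_identical_run(qseq: str, hseq: str) -> int:
    """Length of the longest run of consecutive identical aligned residues."""
    best = current = 0
    for q, h in zip(qseq, hseq):
        current = current + 1 if q == h else 0
        best = max(best, current)
    return best


def summarize_hits(xml_text: str) -> list:
    """Return (hit id, longest identical run) for every HSP in the XML."""
    root = ET.fromstring(xml_text)
    summary = []
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        for hsp in hit.iter("Hsp"):
            qseq = hsp.findtext("Hsp_qseq")
            hseq = hsp.findtext("Hsp_hseq")
            summary.append((hit_id, longest_identical_run(qseq, hseq)))
    return summary


print(summarize_hits(SAMPLE_XML))  # [('ref|NP_258260.1|', 2)]
```

For this hit, the 7 identities are scattered across the alignment with no run longer than two residues.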
@@ -0,0 +1,4 @@
>WT.1.CDK2.ENST00000266970.FS.298-299C/CT
DVTKPVPHLRL
>MT.1.CDK2.ENST00000266970.FS.298-299C/CT
DVTKPVPHLRLLIAFLKPPALISPSPPVWA
@@ -0,0 +1,2 @@
Chromosome Start Stop Reference Variant Transcript Transcript Support Level Ensembl Gene ID Variant Type Mutation Protein Position Gene Name HGVSc HGVSp HLA Allele Peptide Length Sub-peptide Position Mutation Position MT Epitope Seq WT Epitope Seq Best MT Score Method Best MT Score Corresponding WT Score Corresponding Fold Change Best MT Percentile Method Best MT Percentile Corresponding WT Percentile Tumor DNA Depth Tumor DNA VAF Tumor RNA Depth Tumor RNA VAF Normal Depth Normal VAF Gene Expression Transcript Expression Median MT Score Median WT Score Median Fold Change Median MT Percentile Median WT Percentile MHCflurry WT Score MHCflurry MT Score MHCflurry WT Percentile MHCflurry MT Percentile MHCnuggetsI WT Score MHCnuggetsI MT Score MHCnuggetsI WT Percentile MHCnuggetsI MT Percentile NetMHC WT Score NetMHC MT Score NetMHC WT Percentile NetMHC MT Percentile NetMHCcons WT Score NetMHCcons MT Score NetMHCcons WT Percentile NetMHCcons MT Percentile NetMHCpan WT Score NetMHCpan MT Score NetMHCpan WT Percentile NetMHCpan MT Percentile PickPocket WT Score PickPocket MT Score PickPocket WT Percentile PickPocket MT Percentile SMM WT Score SMM MT Score SMM WT Percentile SMM MT Percentile SMMPMBEC WT Score SMMPMBEC MT Score SMMPMBEC WT Percentile SMMPMBEC MT Percentile Index cterm_7mer_gravy_score max_7mer_gravy_score difficult_n_terminal_residue c_terminal_cysteine c_terminal_proline cysteine_count n_terminal_asparagine asparagine_proline_bond_count
chr12 55971622 55971622 C CT ENST00000266970 1 ENSG00000123374 FS -/X 298-299 CDK2 ENST00000266970.9:c.895dup ENSP00000266970.4:p.Ter299LeufsTer20 HLA-C*12:03 9 11 NA IAFLKPPAL NA NetMHCpan 14.32 NA NA MHCflurry 0.003 NA 268 0.267 NA NA 136 0.008 NA NA 47.332 NA NA 0.5 NA NA 28.543855693594235 NA 0.002625 NA 333.37 NA NA NA 364.63 NA 0.44 NA 64.78 NA 1.1 NA 14.32 NA 0.04 NA 1073.583066484683 NA 0.5 NA 16.077184658902514 NA 19.0 NA 29.883399446491428 NA 43.0 1.CDK2.ENST00000266970.FS.298-299C/CT 0.7285714285714285 0.8285714285714284 False False False 0 False 0
@@ -0,0 +1,2 @@
Chromosome Start Stop Reference Variant Transcript Transcript Support Level Ensembl Gene ID Variant Type Mutation Protein Position Gene Name HGVSc HGVSp HLA Allele Peptide Length Sub-peptide Position Mutation Position MT Epitope Seq WT Epitope Seq Best MT Score Method Best MT Score Corresponding WT Score Corresponding Fold Change Best MT Percentile Method Best MT Percentile Corresponding WT Percentile Tumor DNA Depth Tumor DNA VAF Tumor RNA Depth Tumor RNA VAF Normal Depth Normal VAF Gene Expression Transcript Expression Median MT Score Median WT Score Median Fold Change Median MT Percentile Median WT Percentile MHCflurry WT Score MHCflurry MT Score MHCflurry WT Percentile MHCflurry MT Percentile MHCnuggetsI WT Score MHCnuggetsI MT Score MHCnuggetsI WT Percentile MHCnuggetsI MT Percentile NetMHC WT Score NetMHC MT Score NetMHC WT Percentile NetMHC MT Percentile NetMHCcons WT Score NetMHCcons MT Score NetMHCcons WT Percentile NetMHCcons MT Percentile NetMHCpan WT Score NetMHCpan MT Score NetMHCpan WT Percentile NetMHCpan MT Percentile PickPocket WT Score PickPocket MT Score PickPocket WT Percentile PickPocket MT Percentile SMM WT Score SMM MT Score SMM WT Percentile SMM MT Percentile SMMPMBEC WT Score SMMPMBEC MT Score SMMPMBEC WT Percentile SMMPMBEC MT Percentile Index cterm_7mer_gravy_score max_7mer_gravy_score difficult_n_terminal_residue c_terminal_cysteine c_terminal_proline cysteine_count n_terminal_asparagine asparagine_proline_bond_count Reference Match
chr12 55971622 55971622 C CT ENST00000266970 1 ENSG00000123374 FS -/X 298-299 CDK2 ENST00000266970.9:c.895dup ENSP00000266970.4:p.Ter299LeufsTer20 HLA-C*12:03 9 11 NA IAFLKPPAL NA NetMHCpan 14.32 NA NA MHCflurry 0.003 NA 268 0.267 NA NA 136 0.008 NA NA 47.332 NA NA 0.5 NA NA 28.543855693594235 NA 0.002625 NA 333.37 NA NA NA 364.63 NA 0.44 NA 64.78 NA 1.1 NA 14.32 NA 0.04 NA 1073.583066484683 NA 0.5 NA 16.077184658902514 NA 19.0 NA 29.883399446491428 NA 43.0 1.CDK2.ENST00000266970.FS.298-299C/CT 0.7285714285714285 0.8285714285714284 False False False 0 False 0 False
@@ -0,0 +1 @@
Chromosome Start Stop Reference Variant Transcript MT Epitope Seq Peptide Hit ID Hit Definition Query Sequence Query Window Match Sequence Match Start Match Stop
