USI reported by quantms for DIANN experiments. #350
Comments
In non-tims data the apex RT reported by DIA-NN should uniquely identify the scan. |
Thanks, @vdemichev, for responding. We had a look at some of them and did not see many fragments, for example https://www.ebi.ac.uk/pride/archive/usi?usi=mzspec%3APXD019909%3A20180914_QE8_nLC0_BDA_SA_DIA_Keratinocytes_NN002%3Ascan%3A7150%3AGKQEEEKPGEEK%2F2&resultType=FULL. If you set the mass type to mono and the tolerance to 40 ppm in the viewer, you will see only 4 b-ions and 2 y-ions; the proline, for example, is not even identified. I'm not used to seeing DIA spectra in viz tools; my question is:
USIs and spectrum visualization are quite common in DDA; see https://www.ebi.ac.uk/pride/archive/usi?usi=mzspec:PXD000561:Adult_Frontalcortex_bRP_Elite_85_f09:scan:17555:VLHPLEGAVVIIFK/2. We are exploring whether this makes sense for DIA, with the aim of writing some guidelines about DIA USIs. Your feedback is more than welcome. BTW, we are the only ones generating USIs for DIA & DIANN experiments. |
Hi all, I am currently looking into similar visualizations from quantms DIA-NN output. DDA-like PSM visualization at the apex can be a supplementary view but might be very convoluted. |
Here is the logic for how we match the DIA-NN report to ms_info.tsv: Lines 838 to 858 in fd20331
Basically, we use 'RT.Start' (in seconds) as the retention time in the report and convert it from seconds to minutes; finally, we merge the report and ms_info on column 'RT.start' so that the nearest retention times are matched.
|
I would suggest to use RT instead of RT.Start |
@jpfeuffer can you provide some ideas and example visualizations? I don't have access to Skyline. |
Hi @jpfeuffer @vdemichev, in the discussion above, you mentioned using |
I think there might be a confusion between "scan start time" of a spectrum and "elution start time" of a peptide or peptide fragment. I think you have to compare "RT" (= apex of peptide elution) from DIA-NN with "getRT()" (="scan [start] time") from pyopenms. |
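The comparison described in the previous comment can be sketched as follows. This is a minimal sketch, not the quantms implementation: it assumes the DIA-NN "RT" column is in minutes and that scan times (as returned by getRT() in pyopenms) are in seconds and sorted in ascending order. The function name `nearest_scan` and the example values are hypothetical.

```python
from bisect import bisect_left


def nearest_scan(scan_times_sec, apex_rt_min):
    """Return the index of the scan whose time is closest to the apex RT.

    scan_times_sec: scan times in seconds, sorted ascending (assumed unit).
    apex_rt_min: apex retention time in minutes (assumed unit of "RT").
    """
    if not scan_times_sec:
        raise ValueError("no scans to match against")
    target = apex_rt_min * 60.0  # convert minutes to seconds
    i = bisect_left(scan_times_sec, target)
    if i == 0:
        return 0
    if i == len(scan_times_sec):
        return len(scan_times_sec) - 1
    before, after = scan_times_sec[i - 1], scan_times_sec[i]
    # Pick whichever neighbour is closer; ties go to the earlier scan.
    return i if after - target < target - before else i - 1


# Hypothetical scan times (seconds) and an apex RT of 1.02 min (61.2 s).
scan_times = [59.0, 60.5, 62.0, 63.5]
print(nearest_scan(scan_times, 1.02))
```

Because the lookup is a binary search, it only needs the scan times to be sorted once per run.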
Hi @jpfeuffer, as we discussed before, we might need to pick the closest MS2 spectrum that has the precursor mass in its precursor range. Are you suggesting that we filter the MS2 spectra for every ion in the DIA-NN report and then match the RT? I used |
You can use p.getMZ() - p.getIsolationWindowLowerOffset(). I would use that as a test and see if it is really necessary. I would think that the closest spectrum to the RT apex that DIANN reports always has the correct isolation window. |
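The window test suggested above can be written as a small pure-Python check. This is a sketch under assumptions: with pyopenms the three inputs would presumably come from p.getMZ(), p.getIsolationWindowLowerOffset(), and the corresponding upper-offset accessor of the precursor; here they are passed in as plain numbers so the logic can be tested without an mzML file. The function names, the optional `tol` parameter, and the example values are hypothetical.

```python
def isolation_window(target_mz, lower_offset, upper_offset):
    """Return the (low, high) m/z bounds of an isolation window."""
    return target_mz - lower_offset, target_mz + upper_offset


def window_contains(precursor_mz, target_mz, lower_offset, upper_offset, tol=0.0):
    """Check whether a precursor m/z lies inside the isolation window.

    tol widens the window on both sides (in m/z units); it is an
    illustrative allowance for uncalibrated data, not a DIA-NN setting.
    """
    low, high = isolation_window(target_mz, lower_offset, upper_offset)
    return low - tol <= precursor_mz <= high + tol


# Hypothetical 25 m/z wide window centred at 512.5.
print(isolation_window(512.5, 12.5, 12.5))        # (500.0, 525.0)
print(window_contains(500.3, 512.5, 12.5, 12.5))  # True
print(window_contains(499.9, 512.5, 12.5, 12.5))  # False
```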
@jpfeuffer @ypriverol After testing in PXD026600, there are 49 of 26874 ions outside their precursor m/z range when using |
Interesting. Maybe some precursor correction going on? |
Their distance from their precursor mz window is distributed between |
I would take the RT from the DIA-NN report and match it to the scan RT in the mzML, as suggested by @jpfeuffer. This can be done regardless of anything in the MS1 data and regardless of m/z tolerances - I would ignore those and only use them for extracting the apex spectrum - but in this case I would use relatively wide tolerances, as raw mzML is not mass calibrated. |
Hi @vdemichev, thanks for your comment and suggestion above all! When matching |
What is expected is an ideal match (RT in minutes). What should also be the case is that the corresponding isolation window contains the precursor mass. |
Hi @jpfeuffer @vdemichev, here we use the following method to obtain the PSM: in general, for each ion in the report, we find the MS2 spectra whose precursor isolation window contains the ion's precursor m/z, and then perform RT matching. Such a traversal is extremely time-consuming, and we would appreciate it if you could suggest a more suitable method. @ypriverol We have added |
@WangHong007 what if you just create an RT index for each run (RT bin -> file position mapping), and then query the spectrum corresponding to the closest RT? |
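One way to read the RT index suggestion above is a dictionary from RT bin to the scans that fall in that bin, queried by looking at the target bin and its neighbours. The sketch below is only an illustration of that idea: the class name `RTIndex`, the 1-second bin width, the `max_bins` search limit, and the file positions are all hypothetical.

```python
from collections import defaultdict


class RTIndex:
    """Map RT bins to (scan RT, file position) pairs for one run."""

    def __init__(self, bin_width_sec=1.0):
        self.bin_width = bin_width_sec
        self.bins = defaultdict(list)

    def add(self, rt_sec, file_pos):
        self.bins[int(rt_sec // self.bin_width)].append((rt_sec, file_pos))

    def closest(self, rt_sec, max_bins=5):
        """Return the (scan RT, file position) closest to rt_sec, or None."""
        center = int(rt_sec // self.bin_width)
        best = None
        for d in range(max_bins + 1):
            # Visit the bins at distance d on both sides of the target bin.
            for b in {center - d, center + d}:
                for scan_rt, pos in self.bins.get(b, ()):
                    if best is None or abs(scan_rt - rt_sec) < abs(best[0] - rt_sec):
                        best = (scan_rt, pos)
            # Bins further out are more than d bin widths away, so stop
            # as soon as the best hit is at least that close.
            if best is not None and abs(best[0] - rt_sec) <= d * self.bin_width:
                break
        return best


index = RTIndex(bin_width_sec=1.0)
for rt, pos in [(10.2, 100), (11.7, 200), (15.0, 300)]:
    index.add(rt, pos)
print(index.closest(11.0))  # (11.7, 200)
print(index.closest(13.9))  # (15.0, 300)
```

Building the index is a single pass over the scans of a run, after which each query touches only a handful of bins instead of traversing every MS2 spectrum.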
@WangHong007 can you generate one example DIA output of USIs for all of us to test and visualize how they look? |
@ypriverol mztab and USIs are here: /~https://github.com/WangHong007/Data-Folder/tree/main/quantms/DIA-NN_USIs/PXD026600 |
@WangHong007 I have searched in PRIDE Archive USI two of the USIs from the file:
Neither of these spectrum and peptide combinations has a single ion annotated. Can you look up how those USIs were created? Basically, look for the scan in the mzML, the RT, and the DIANN output, and check how each USI was created. |
Hi all! Here are several important indicators of these two USIs:
|
Hi all, there are still very few ion matches in some PSMs when taking the RT from the DIA-NN report and matching it to the closest scan RT in the mzML, regardless of anything in the MS1 data. Fragment tolerances of 20 ppm and 40 ppm were used, respectively. The PXD026600 ion annotations follow, sorted by Q.Value and PEP. Neutral losses are considered. PXD026600_USI_ions_matched_20PPM.zip More ions are matched in PSMs when increasing to 40 ppm, but there are still very few ion matches. A few examples:
So is this number of fragment annotations normal, and is the MS2 spectrum we matched the one it is supposed to be? @ypriverol @vdemichev @jpfeuffer |
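For reference, the effect of the fragment tolerance on the number of annotated ions can be illustrated with a plain ppm comparison. This is a minimal sketch, not the annotation code used by quantms or the PRIDE viewer; the function names and the m/z values are hypothetical and chosen only to show how a 20 ppm and a 40 ppm tolerance can differ.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed peak in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def count_matches(theoretical_mzs, observed_mzs, tol_ppm):
    """Count theoretical fragments with at least one peak within tol_ppm."""
    matched = 0
    for theo in theoretical_mzs:
        if any(abs(ppm_error(obs, theo)) <= tol_ppm for obs in observed_mzs):
            matched += 1
    return matched


# Hypothetical fragments and peaks: the errors are about 24 ppm and 33 ppm.
theoretical = [500.0, 600.0, 700.0]
observed = [500.012, 600.020, 650.0]
print(count_matches(theoretical, observed, 20))  # 0
print(count_matches(theoretical, observed, 40))  # 2
```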
Can you share an example project as small as possible, with the mzmls, diann_report, and whatever goes into your scripts (e.g. your mzml_statistics/summary files)? |
One possibility: ions can be missing from specific scans. DIA-NN looks at the entire XICs around the apex, not just at the apex spectrum, and hence the latter does not necessarily include all ions, although this is most often the case. Also of course any issues with mass calibration can play a role here. You can always verify what exactly DIA-NN is matching to identify the peptide using the --vis command. |
What you are saying is that you can probably have more than one MS2 for a given Peptide, where some ions are in one MS2 and others in another one? |
Done in #419 |
Description of the bug
@WangHong007 @zprobot @daichengxin:
I have recently been testing multiple USIs generated by quantms for DIA experiments. A little background about the problem: USIs are a way to reference directly the scan and spectrum that was used to identify a peptide, and in some ways they are the fundamental evidence for a DDA identification; here is an example.
All DDA search engines keep track of the scan that was used to identify the spectrum. However, in DIA experiments other features are also relevant, and DIANN does not record in its output files the scan number that was used to identify the peptide. In quantms, we have implemented logic to "find", for every peptide, the scan number used to identify it. @zprobot provided all the USIs for the reanalysis of PXD019909; however, when we visualized multiple USIs they looked random, meaning the spectrum does not appear to correspond to the given peptide; see example.
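For readers unfamiliar with the format, the USIs linked in this thread follow the pattern mzspec:<project>:<run>:scan:<scan number>:<peptide>/<charge>. The sketch below assembles such a string and the corresponding PRIDE Archive viewer link; the helper names `build_usi` and `pride_usi_url` are hypothetical, and the URL layout is simply copied from the links posted in the comments.

```python
from urllib.parse import quote


def build_usi(project, run, scan, peptide, charge):
    """Assemble a scan-based USI from its parts."""
    return f"mzspec:{project}:{run}:scan:{scan}:{peptide}/{charge}"


def pride_usi_url(usi):
    """Build a PRIDE Archive viewer link for a USI (layout taken from this thread)."""
    return (
        "https://www.ebi.ac.uk/pride/archive/usi?usi="
        + quote(usi, safe="")  # percent-encode ':' and '/'
        + "&resultType=FULL"
    )


usi = build_usi(
    "PXD019909",
    "20180914_QE8_nLC0_BDA_SA_DIA_Keratinocytes_NN002",
    7150,
    "GKQEEEKPGEEK",
    2,
)
print(usi)
print(pride_usi_url(usi))
```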
I propose the following follow up:
Please let me know if you need more discussion.
Additional examples that look wrong:
Command used and terminal output
No response
Relevant files
No response
System information
No response