Protocol for Community-created Public MS/MS Reference Library Within the GNPS Infrastructure

By Fernando Vargas, Kelly C. Weldon, Nicole Sikora, Mingxun Wang, Zheng Zhang, Emily C. Gentry, Morgan W. Panitchpakdi, Mauricio Caraballo, Pieter C. Dorrestein, Alan K. Jarmusch

Posted 15 Oct 2019
bioRxiv DOI: 10.1101/804401

Rationale: A major hurdle in identifying chemicals in mass spectrometry experiments is the availability of MS/MS reference spectra in public databases. Currently, scientists purchase databases or use public databases such as GNPS. The MSMS-Chooser workflow empowers the creation of MS/MS reference spectra directly in the GNPS infrastructure.

Methods: An MSMS-Chooser sample template was completed with the required information, and sequence tables were generated programmatically. Standards in methanol-water (1:1) solution (1 µM) were placed into wells individually. Data were acquired on an LC-MS/MS system using data-dependent acquisition in positive and negative ionization modes. Species that may be generated under typical ESI conditions were chosen. The MS/MS spectra and MSMS-Chooser sample template were subsequently uploaded to MSMS-Chooser in GNPS for automatic MS/MS spectral annotation.

Results: Data acquisition quickly and effectively collected MS/MS spectra. MSMS-Chooser accurately annotated 99.2% of the manually validated MS/MS scans that were generated from the chemical standards. The output of MSMS-Chooser includes a table ready for inclusion in the GNPS library (after inspection) as well as the ability to directly launch searches via MASST. Altogether, the data acquisition, processing, and upload to GNPS took ~2 hours for our proof-of-concept results.

Conclusions: The MSMS-Chooser workflow enables the rapid data acquisition, analysis, and annotation of chemical standards, and uploads the MS/MS spectra to the community-driven GNPS. MSMS-Chooser democratizes the creation of MS/MS reference spectra in GNPS, which will improve annotation and strengthen the tools that use the annotation information.
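The abstract mentions that sequence tables were generated programmatically from the sample template but does not show how. The following is only a minimal sketch of what such a step could look like, assuming a 96-well plate filled row by row and one injection per standard per ionization mode. The column names (`well`, `compound`, `mode`, `file_name`), the function names, and the placeholder standards are all hypothetical and are not taken from the actual MSMS-Chooser template or the authors' scripts.

```python
import csv
import io
from string import ascii_uppercase


def well_positions(rows=8, cols=12):
    """Yield well labels A1..H12 for a 96-well plate, row by row (assumed layout)."""
    for row in ascii_uppercase[:rows]:
        for col in range(1, cols + 1):
            yield f"{row}{col}"


def build_sequence(compounds, modes=("positive", "negative")):
    """Return one injection entry per (standard, ionization mode) pair.

    The field names are illustrative placeholders, not the real template columns.
    """
    table = []
    for well, name in zip(well_positions(), compounds):
        for mode in modes:
            table.append(
                {
                    "well": well,
                    "compound": name,
                    "mode": mode,
                    "file_name": f"{well}_{mode}",
                }
            )
    return table


def to_csv(table):
    """Serialize the sequence table to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["well", "compound", "mode", "file_name"]
    )
    writer.writeheader()
    writer.writerows(table)
    return buffer.getvalue()


# Placeholder standard names, used only to exercise the sketch.
sequence = build_sequence(["standard_1", "standard_2"])
csv_text = to_csv(sequence)
```

Each standard occupies its own well, matching the description of standards being placed into wells individually, and is listed once per polarity so that positive and negative mode acquisitions appear as separate injections.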

Download data

  • Downloaded 359 times
  • Download rankings, all-time:
    • Site-wide: 52,411 out of 101,046
    • In bioinformatics: 5,938 out of 9,276
  • Year to date:
    • Site-wide: 27,621 out of 101,046
  • Since beginning of last month:
    • Site-wide: 51,241 out of 101,046
