Chemical cross-linking coupled with mass spectrometry is a powerful tool to study protein-protein interactions and protein conformations. Two linked peptides are ionized and fragmented to produce a tandem mass spectrum. In such an experiment, a tandem mass spectrum contains ions from two peptides. The peptide identification problem becomes a peptide-peptide pair identification problem. Currently, most existing tools don't search all possible pairs due to the quadratic time complexity. Consequently, a significant percentage of linked peptides are missed. In our earlier work, we developed a tool named ECL to search all pairs of peptides exhaustively. While ECL does not miss any linked peptides, it is very slow due to the quadratic computational complexity, especially when the database is large. Furthermore, ECL uses a score function without statistical calibration, while researchers have demonstrated that using a statistical calibrated score function can achieve a higher sensitivity than using an uncalibrated one. Here, we propose an advanced version of ECL, named ECL 2.0. It achieves a linear time and space complexity by taking advantage of the additive property of a score function. It can analyze a typical data set containing tens of thousands of spectra using a large-scale database containing thousands of proteins in a few hours. Comparison with other five state-of-the-art tools shows that ECL 2.0 is much faster than pLink, StavroX, ProteinProspector, and ECL. Kojak is the only one tool that is faster than ECL 2.0. But Kojak does not exhaustively search all possible peptide pairs. We also adopt an e-value estimation method to calibrate the original score. Comparison shows that ECL 2.0 has the highest sensitivity among the state-of-the-art tools. The experiment using a large-scale in vivo cross-linking data set demonstrates that ECL 2.0 is the only tool that can find PSMs passing the false discovery rate threshold. The result illustrates that exhaustive search and well calibrated score function are useful to find PSMs from a huge search space.
- Downloaded 675 times
- Download rankings, all-time:
- Site-wide: 32,571
- In bioinformatics: 3,723
- Year to date:
- Site-wide: 108,316
- Since beginning of last month:
- Site-wide: 82,613
Downloads over time
Distribution of downloads per paper, site-wide
- 27 Nov 2020: The website and API now include results pulled from medRxiv as well as bioRxiv.
- 18 Dec 2019: We're pleased to announce PanLingua, a new tool that enables you to search for machine-translated bioRxiv preprints using more than 100 different languages.
- 21 May 2019: PLOS Biology has published a community page about Rxivist.org and its design.
- 10 May 2019: The paper analyzing the Rxivist dataset has been published at eLife.
- 1 Mar 2019: We now have summary statistics about bioRxiv downloads and submissions.
- 8 Feb 2019: Data from Altmetric is now available on the Rxivist details page for every preprint. Look for the "donut" under the download metrics.
- 30 Jan 2019: preLights has featured the Rxivist preprint and written about our findings.
- 22 Jan 2019: Nature just published an article about Rxivist and our data.
- 13 Jan 2019: The Rxivist preprint is live!