Network-based prediction of protein interactions

By István A. Kovács, Katja Luck, Kerstin Spirohn, Yang Wang, Carl Pollis, Sadie Schlabach, Wenting Bian, Dae-Kyum Kim, Nishka Kishore, Tong Hao, Michael A. Calderwood, Marc Vidal, Albert-László Barabási

Posted 02 Mar 2018
bioRxiv DOI: 10.1101/275529 (published DOI: 10.1038/s41467-019-09177-y)

As biological function emerges through interactions between a cell's molecular constituents, understanding cellular mechanisms requires us to catalogue all physical interactions between proteins. Despite spectacular advances in high-throughput mapping, the number of missing human protein-protein interactions (PPIs) continues to exceed the number of experimentally documented interactions. Computational tools that exploit structural, sequence or network topology information are increasingly used to fill the gap, using the patterns of the already known interactome to predict undetected, yet biologically relevant interactions. Such network-based link prediction tools rely on the Triadic Closure Principle (TCP), which states that two proteins likely interact if they share multiple interaction partners. TCP is rooted in social network analysis, namely the observation that the more common friends two individuals have, the more likely it is that they know each other. Here, we offer direct empirical evidence across multiple datasets and organisms that, despite its dominant use in biological link prediction, TCP is not valid for most protein pairs. We show that this failure is fundamental: TCP violates both structural constraints and evolutionary processes. This understanding allows us to propose a link prediction principle, consistent with both structural and evolutionary arguments, that predicts as yet undiscovered protein interactions based on paths of length three (L3). A systematic computational cross-validation shows that the L3 principle significantly outperforms existing link prediction methods. To experimentally test the L3 predictions, we perform both large-scale high-throughput and pairwise tests, finding that the predicted links test positively at the same rate as previously known interactions, suggesting that most (if not all) predicted interactions are real. Combining L3 predictions with experimental tests provided new interaction partners of FAM161A, a protein linked to retinitis pigmentosa, offering novel insights into the molecular mechanisms that lead to the disease. Because L3 is rooted in a fundamental biological principle, we expect it to have broad applicability, enabling us to better understand the emergence of biological function under both healthy and pathological conditions.
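
To make the contrast between the two principles concrete, the following is a minimal sketch, not the authors' implementation. It scores unlinked pairs in a small, entirely hypothetical network in two ways: by the number of shared interaction partners (the quantity TCP relies on) and by the number of paths of length three connecting the pair (the quantity L3 relies on). The node names, the edge list and the function names are assumptions made for illustration. The raw path count is also a simplification, since it omits any degree normalization that a practical scoring scheme might apply.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def common_neighbors(adj, x, y):
    """TCP-style score: number of interaction partners shared by x and y."""
    return len(adj[x] & adj[y])


def l3_paths(adj, x, y):
    """L3-style score: number of simple paths x - u - v - y of length three."""
    count = 0
    for u in adj[x]:
        if u == y:
            continue
        for v in adj[u]:
            if v == x or v == y:
                continue
            if y in adj[v]:
                count += 1
    return count


def score_unlinked_pairs(adj):
    """Return {(x, y): (common_neighbors, l3_paths)} for every unlinked pair."""
    scores = {}
    for x, y in combinations(sorted(adj), 2):
        if y not in adj[x]:
            scores[(x, y)] = (common_neighbors(adj, x, y), l3_paths(adj, x, y))
    return scores


# Hypothetical toy network: X and V share the partners U1 and U2,
# and V additionally interacts with Y.
toy_edges = [("X", "U1"), ("X", "U2"), ("V", "U1"), ("V", "U2"), ("V", "Y")]
toy_adj = build_adjacency(toy_edges)
toy_scores = score_unlinked_pairs(toy_adj)

for pair, (cn, l3) in sorted(toy_scores.items()):
    print(pair, "common neighbors:", cn, "paths of length three:", l3)
```

In this toy network, X and V share two partners, so a common-neighbor score ranks the pair (V, X) highest, while the pair (X, Y) shares no partner at all. Counting paths of length three reverses the ranking: X reaches Y through X - U1 - V - Y and X - U2 - V - Y, whereas no path of length three connects X and V.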

Download data

  • Downloaded 2,091 times
  • Download rankings, all-time:
    • Site-wide: 3,397 out of 89,238
    • In systems biology: 86 out of 2,308
  • Year to date:
    • Site-wide: 26,634 out of 89,238
  • Since beginning of last month:
    • Site-wide: 24,232 out of 89,238
