Rxivist logo

Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 50,135 bioRxiv papers from 233,681 authors.

End-to-end differentiable learning of protein structure

By Mohammed AlQuraishi

Posted 14 Feb 2018
bioRxiv DOI: 10.1101/265231 (published DOI: 10.1016/j.cels.2019.03.006)

Accurate prediction of protein structure is one of the central challenges of biochemistry. Despite significant progress made by co-evolution methods to predict protein structure from signatures of residue-residue coupling found in the evolutionary record, a direct and explicit mapping between protein sequence and structure remains elusive, with no substantial recent progress. Meanwhile, rapid developments in deep learning, which have found remarkable success in computer vision, natural language processing, and quantum chemistry raise the question of whether a deep learning based approach to protein structure could yield similar advancements. A key ingredient of the success of deep learning is the reformulation of complex, human-designed, multi-stage pipelines with differentiable models that can be jointly optimized end-to-end. We report the development of such a model, which reformulates the entire structure prediction pipeline using differentiable primitives. Achieving this required combining four technical ideas: (1) the adoption of a recurrent neural architecture to encode the internal representation of protein sequence, (2) the parameterization of (local) protein structure by torsional angles, which provides a way to reason over protein conformations without violating the covalent chemistry of protein chains, (3) the coupling of local protein structure to its global representation via recurrent geometric units, and (4) the use of a differentiable loss function to capture deviations between predicted and experimental structures. To our knowledge this is the first end-to-end differentiable model for learning of protein structure. We test the effectiveness of this approach using two challenging tasks: the prediction of novel protein folds without the use of co-evolutionary information, and the prediction of known protein folds without the use of structural templates. On the first task the model achieves state-of-the-art performance, even when compared to methods that rely on co-evolutionary data. On the second task the model is competitive with methods that use experimental protein structures as templates, achieving 3-7Å accuracy despite being template-free. Beyond protein structure prediction, end-to-end differentiable models of proteins represent a new paradigm for learning and modeling protein structure, with potential applications in docking, molecular dynamics, and protein design.

Download data

  • Downloaded 12,102 times
  • Download rankings, all-time:
    • Site-wide: 59 out of 50,135
    • In bioinformatics: 9 out of 5,263
  • Year to date:
    • Site-wide: 27 out of 50,135
  • Since beginning of last month:
    • Site-wide: 6 out of 50,135

Altmetric data

Downloads over time

Distribution of downloads per paper, site-wide

Sign up for the Rxivist weekly newsletter! (Click here for more details.)