
Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 65,445 bioRxiv papers from 289,895 authors.

Unified rational protein engineering with sequence-only deep representation learning

By Ethan C Alley, Grigory Khimulya, Surojit Biswas, Mohammed AlQuraishi, George M Church

Posted 26 Mar 2019
bioRxiv DOI: 10.1101/589333

Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabelled amino acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily, and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach reaches near state-of-the-art or superior performance predicting stability of natural and de novo designed proteins as well as quantitative function of molecularly diverse mutants. UniRep further enables two orders of magnitude cost savings in a protein engineering task. We conclude UniRep is a versatile protein summary that can be applied across protein engineering informatics.

Download data

  • Downloaded 6,404 times
  • Download rankings, all-time:
    • Site-wide: 271 out of 65,445
    • In synthetic biology: 6 out of 620
  • Year to date:
    • Site-wide: 60 out of 65,445
  • Since beginning of last month:
    • Site-wide: 32 out of 65,445


[Figures: downloads over time; distribution of downloads per paper, site-wide]

