
Frameshift and wild-type proteins are highly similar because the genetic code and genomes were optimized for frameshift tolerance

By Xiaolong Wang, Quanjiang Dong, Gang Chen, Jianye Zhang, Yongqiang Liu, Yujia Cai

Posted 25 Aug 2016
bioRxiv DOI: 10.1101/067736

Frameshift protein sequences, encoded by the alternative reading frames of coding genes, have been considered meaningless, and frameshift mutations have been considered of little importance for the molecular evolution of coding genes and proteins. However, functional frameshifts have been found to be widespread. It has been puzzling how a frameshift protein can keep its structure and function when its amino-acid sequence is changed substantially. Here we show that the similarities between frameshift and wild-type protein sequences are higher than random similarities and are defined at the genetic code, gene, and genome levels. In the standard genetic code, frameshift codon substitutions are more conservative than random substitutions. The frameshift tolerability of the standard genetic code ranks in the top 2.0-3.5% of alternative genetic codes, showing that the genetic code is nearly optimal for frameshift tolerance. Furthermore, frameshift-resistant codons (codon pairs) appear more frequently than expected in many genes and certain genomes. Frameshift optimality is thus reflected not only in the genetic code itself but, more importantly, in the room it leaves for further optimizing the frameshift tolerance of a particular gene or genome, which sheds light on the role of frameshift mutations in molecular and genomic evolution.
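To make the comparison in the abstract concrete, the following minimal sketch translates a coding sequence in its original frame and in a shifted frame using the standard genetic code, then reports the fraction of aligned positions that carry the same amino acid. It is an illustration only, not the authors' method: the function names `translate` and `frameshift_identity` are hypothetical, and plain identity is a crude stand-in for the similarity measure used in the paper, which would also credit conservative substitutions.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, offset=0):
    """Translate a DNA string starting at `offset`; '*' marks a stop codon.

    Trailing bases that do not fill a complete codon are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def frameshift_identity(dna, offset=1):
    """Fraction of aligned positions where the shifted-frame translation
    matches the wild-type translation (hypothetical helper; identity only,
    an assumption made here for simplicity)."""
    wild_type = translate(dna, 0)
    shifted = translate(dna, offset)
    pairs = list(zip(wild_type, shifted))
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)
```

Calling `frameshift_identity(seq, 1)` and `frameshift_identity(seq, 2)` on a coding sequence gives a rough per-gene figure that could be compared against the same statistic for shuffled sequences, in the spirit of the random baseline the abstract describes.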

Download data

  • Downloaded 2,438 times
  • Download rankings, all-time:
    • Site-wide: 8,824
    • In genetics: 346
  • Year to date:
    • Site-wide: 41,441
  • Since beginning of last month:
    • Site-wide: 40,401
