
Analysis of next- and third-generation RNA-Seq data reveals the structures of alternative transcription units in bacterial genomes

By Qi Wang, Zhaoqian Liu, Bo Yan, Wen-Chi Chou, Laurence Ettwiller, Qin Ma, Bingqiang Liu

Posted 04 Jan 2021
bioRxiv DOI: 10.1101/2021.01.02.425006

Alternative transcription units (ATUs) are dynamically encoded under different conditions or environmental stimuli in bacterial genomes, and genome-scale identification of ATUs is essential for studying the emergence of human diseases caused by bacteria. However, the complexity and dynamic nature of ATUs make it unrealistic to identify all of them using experimental techniques. Here we present a first-of-its-kind computational framework, named SeqATU, for genome-scale ATU prediction based on next-generation RNA-Seq data. The framework uses a convex quadratic programming model to seek an optimal combination of expression levels across all of the ATUs to be identified. Compared with benchmark ATUs derived from third-generation RNA-Seq data, the ATUs predicted in E. coli reached precisions of 0.77 and 0.74 and recalls of 0.75 and 0.76 on the two RNA-Seq datasets, respectively. We believe that the ATUs identified by SeqATU can provide fundamental knowledge to guide the reconstruction of transcriptional regulatory networks in bacterial genomes.
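To give a rough feel for what a convex quadratic program over ATU expression levels can look like, here is a minimal sketch. It is not SeqATU's actual model, which the abstract does not spell out. It assumes a toy setup in which every run of consecutive genes in a cluster is a candidate ATU, each gene's observed coverage is the sum of the expression levels of the ATUs that contain it, and the levels are fitted by non-negative least squares solved with projected gradient descent. The names `candidate_atus`, `predicted_coverage` and `fit_expression`, and the coverage values, are hypothetical.

```python
# Illustrative only: a toy non-negative least-squares problem (a convex QP),
# NOT the formulation used by SeqATU.

def candidate_atus(n_genes):
    """Enumerate every run of consecutive genes as (start, end), inclusive."""
    return [(s, e) for s in range(n_genes) for e in range(s, n_genes)]


def predicted_coverage(x, atus, n_genes):
    """Per-gene coverage implied by ATU expression levels x."""
    return [
        sum(x[k] for k, (s, e) in enumerate(atus) if s <= g <= e)
        for g in range(n_genes)
    ]


def fit_expression(coverage, atus, steps=5000):
    """Minimise 0.5 * ||A x - coverage||^2 subject to x >= 0.

    A[g][k] is 1 when gene g lies inside candidate ATU k, else 0.
    Solved by projected gradient descent with a fixed step size.
    """
    n, m = len(coverage), len(atus)
    A = [[1.0 if s <= g <= e else 0.0 for (s, e) in atus] for g in range(n)]
    # The squared Frobenius norm bounds the largest eigenvalue of A^T A,
    # so 1 / that value is a safe step size.
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * m
    for _ in range(steps):
        residual = [
            sum(A[g][k] * x[k] for k in range(m)) - coverage[g] for g in range(n)
        ]
        grad = [sum(A[g][k] * residual[g] for g in range(n)) for k in range(m)]
        # Gradient step followed by projection onto the non-negative orthant.
        x = [max(0.0, x[k] - step * grad[k]) for k in range(m)]
    return x


# Hypothetical three-gene cluster: gene 0 is covered less than genes 1 and 2,
# which hints at an extra transcript starting at gene 1.
coverage = [10.0, 30.0, 30.0]
atus = candidate_atus(len(coverage))
x = fit_expression(coverage, atus)
fitted = predicted_coverage(x, atus, len(coverage))
```

In this toy problem there are more candidate ATUs than genes, so many different expression vectors reproduce the coverage equally well. A practical model would need additional constraints or penalty terms to pick among them.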

Download data

  • Downloaded 304 times
  • Download rankings, all-time:
    • Site-wide: 91,586
    • In bioinformatics: 8,065
  • Year to date:
    • Site-wide: 12,570
  • Since beginning of last month:
    • Site-wide: 74,333
