T-Gene: Improved target gene prediction

By Timothy O’Connor, Charles E Grant, Mikael Boden, Timothy L Bailey

Posted 15 Oct 2019
bioRxiv DOI: 10.1101/803221 (published DOI: 10.1093/bioinformatics/btaa227)

Motivation: Identifying the genes regulated by a given transcription factor (its "target genes") is a key step in developing a comprehensive understanding of gene regulation. Previously we developed a method for predicting the target genes of a transcription factor (TF) based solely on the correlation between a histone modification at the TF's binding site and the expression of the gene across a set of tissues. That approach is limited to organisms for which extensive histone and expression data is available, and does not explicitly incorporate the genomic distance between the TF and the gene.

Results: We present the T-Gene algorithm, which overcomes these limitations. T-Gene can be used to predict which genes are most likely to be regulated by a TF, and which of the TF's binding sites are most likely involved in regulating particular genes. T-Gene calculates a novel score that combines distance and histone/expression correlation, and we show that this score accurately predicts when a regulatory element bound by a TF is in contact with a gene's promoter, achieving median positive predictive value (PPV) above 50%. T-Gene is easy to use via its web server or as a command-line tool, and can also make accurate predictions (median PPV above 40%) based on distance alone when histone/expression data is not available. T-Gene provides an estimate of the statistical significance of each of its predictions.

Availability: The T-Gene web server, source code, histone/expression data and genome annotation files are provided at http://meme-suite.org.
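For readers unfamiliar with the metric, positive predictive value is the fraction of predicted links that turn out to be true: PPV = TP / (TP + FP). The sketch below only illustrates that standard definition. It is not T-Gene's scoring code, and the function name, the (binding site, gene) labels and the "confirmed" set are all hypothetical values made up for the example.

```python
def positive_predictive_value(predicted, confirmed):
    """Return TP / (TP + FP): the share of predicted links that are confirmed."""
    predicted = set(predicted)
    if not predicted:
        raise ValueError("no predictions; PPV is undefined")
    true_positives = len(predicted & set(confirmed))
    return true_positives / len(predicted)


# Hypothetical (binding site, gene) links, invented purely for illustration.
predicted_links = {
    ("site1", "GENE_A"),
    ("site2", "GENE_B"),
    ("site3", "GENE_C"),
    ("site4", "GENE_D"),
}
# Hypothetical links assumed to be supported by some independent contact data.
confirmed_links = {
    ("site1", "GENE_A"),
    ("site3", "GENE_C"),
    ("site9", "GENE_Z"),
}

ppv = positive_predictive_value(predicted_links, confirmed_links)
print(f"PPV = {ppv:.2f}")  # 2 of the 4 predicted links are confirmed
```

Under this definition, a median PPV above 50% means that in the median case more than half of the predicted links are confirmed.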

Download data

  • Downloaded 506 times
  • Download rankings, all-time:
    • Site-wide: 71,550
    • In bioinformatics: 6,705
  • Year to date:
    • Site-wide: 82,804
  • Since beginning of last month:
    • Site-wide: 42,075
