Identification of putative causal loci in whole-genome sequencing data via knockoff statistics

By Zihuai He, Linxi Liu, Chen Wang, Yann Le Guen, Justin Lee, Stephanie Gogarten, Fred Lu, Stephen Montgomery, Hua Tang, Edwin K Silverman, Michael Cho, Michael Greicius, Iuliana Ionita-Laza

Posted 09 Mar 2021
bioRxiv DOI: 10.1101/2021.03.08.434451

The analysis of whole-genome sequencing studies is challenging due to the large number of rare variants in noncoding regions and the lack of natural units for testing. We propose a statistical method to detect and localize rare and common risk variants in whole-genome sequencing studies based on a recently developed knockoff framework. The method can (1) prioritize causal variants over associations due to linkage disequilibrium, thereby improving interpretability; (2) help distinguish the signal due to rare variants from shadow effects of significant common variants nearby; (3) integrate multiple knockoffs for improved power, stability and reproducibility; and (4) flexibly incorporate state-of-the-art and future association tests, so that they also gain these benefits. In applications to whole-genome sequencing data from the Alzheimer's Disease Sequencing Project (ADSP) and COPDGene samples from the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program, we show that our method, compared with conventional association tests, can lead to substantially more discoveries.

Download data

  • Downloaded 422 times
  • Download rankings, all-time:
    • Site-wide: 73,772
    • In genetics: 3,227
  • Year to date:
    • Site-wide: 11,394
  • Since beginning of last month:
    • Site-wide: 16,361
