The weighting is the hardest part: on the behavior of the likelihood ratio test and score test under weight misspecification in rare variant association studies

By Camelia C. Minică, Giulio Genovese, Christina M Hultman, René Pool, Jacqueline M. Vink, Conor V. Dolan, Benjamin M Neale

Posted 02 Jun 2015
bioRxiv DOI: 10.1101/020198

Rare variant association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Correct weighting is expected to boost power, and yet the correct weights are generally unknown. It is therefore important to assess the effect of weight misspecification in SKAT. We evaluated the behavior of the score test and the likelihood ratio test (LRT) under weight misspecification. Simulation and empirical results revealed that the LRT is generally more robust and more powerful than the score test under such misspecification. For instance, when the simulated betas were larger for rarer than for more common variants, (incorrectly) assigning equal weights reduced the power of the LRT by ~5%, while the power of the score test dropped by ~30%. To optimize weighting, we proposed a data-driven weighting scheme. With this scheme and the LRT, we detected significant enrichment of rare case mutations (MAF<5%; P-value=7E-04) in a set of constrained genes in the Swedish schizophrenia case-control cohort with exome-sequencing data. The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances the score test is the most powerful test. However, the LRT has the compelling qualities of being generally more powerful and more robust to misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
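
To make the role of the weights concrete, here is a minimal sketch of a weighted SKAT-style score statistic computed under two weighting schemes. It is not the authors' implementation, and it does not cover the LRT. The function names, the toy data, the intercept-only null model, and the Beta(1, 25) weighting are all assumptions made for illustration; the p-value is obtained by permutation, which is swapped in here in place of the analytic null distribution that SKAT software normally uses.

```python
import random

def beta_1_25_density(maf):
    """Beta(1, 25) density at a minor allele frequency (assumed weighting).

    For Beta(1, b) the density is b * (1 - x)^(b - 1), so rarer variants
    receive larger weights.
    """
    return 25.0 * (1.0 - maf) ** 24

def skat_q(genotypes, phenotype, sqrt_weights):
    """Weighted score-type statistic: Q = sum_j (w_j * g_j' r)^2.

    genotypes: list of per-individual lists of minor-allele counts (0/1/2)
    phenotype: list of trait values, one per individual
    sqrt_weights: one weight w_j per variant
    r is the residual from an intercept-only null model (an assumption).
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    resid = [y - mean_y for y in phenotype]
    q = 0.0
    for j, w in enumerate(sqrt_weights):
        score_j = sum(genotypes[i][j] * resid[i] for i in range(n))
        q += (w * score_j) ** 2
    return q

def permutation_pvalue(genotypes, phenotype, sqrt_weights, n_perm=999, seed=0):
    """Permutation p-value for Q (a substitute for the analytic null)."""
    rng = random.Random(seed)
    observed = skat_q(genotypes, phenotype, sqrt_weights)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_q(genotypes, shuffled, sqrt_weights) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 4 individuals, 2 variants, binary trait.
toy_genotypes = [[0, 1], [1, 0], [2, 0], [0, 0]]
toy_phenotype = [1, 0, 1, 0]
n_individuals = len(toy_phenotype)
mafs = [
    sum(row[j] for row in toy_genotypes) / (2 * n_individuals)
    for j in range(2)
]

flat = [1.0, 1.0]                               # equal weights
by_maf = [beta_1_25_density(m) for m in mafs]   # up-weight rarer variants

print("Q, flat weights:", skat_q(toy_genotypes, toy_phenotype, flat))
print("Q, MAF weights: ", skat_q(toy_genotypes, toy_phenotype, by_maf))
print("perm. p, flat:  ", permutation_pvalue(toy_genotypes, toy_phenotype, flat))
```

The same data yield different statistics depending only on the weights supplied, which is why a misspecified weighting scheme can change the power of the test.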

Download data

  • Downloaded 1,114 times
  • Download rankings, all-time:
    • Site-wide: 22,525
    • In genetics: 1,003
  • Year to date:
    • Site-wide: 105,811
  • Since beginning of last month:
    • Site-wide: 112,044
