benchmarkR: an R package for benchmarking genome-scale methods

By Xiaobei Zhou, Charity W Law, Mark D Robinson

Posted 17 Apr 2015
bioRxiv DOI: 10.1101/018200

benchmarkR is an R package designed to assess and visualize the performance of statistical methods on datasets that have an independent truth (e.g., simulations or datasets with large-scale validation), in particular for methods that claim to control false discovery rates (FDR). We augment some of the standard performance plots (e.g., receiver operating characteristic, or ROC, curves) with information about how well the methods are calibrated (i.e., whether they achieve their expected FDR control). For example, performance plots are extended with a point to highlight the power or FDR at a user-set threshold (e.g., at a method's estimated 5% FDR). The package contains general containers to store simulation results (SimResults) and methods to create graphical summaries, such as ROC curves (rocX), false discovery plots (fdX) and power-to-achieved FDR plots (powerFDR); each plot is augmented with some form of calibration information. We find these plots to be an improved way to interpret the relative performance of statistical methods for genomic datasets where many hypothesis tests are performed. The strategies, however, are general and will find applications in other domains.
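
The calibration point described above (power and achieved FDR at a user-set threshold) can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use the benchmarkR API. The function name, argument names and toy numbers are all hypothetical, chosen only to show the arithmetic: features whose adjusted p-value falls at or below the threshold are "called", and the calls are compared against the known truth.

```python
from typing import Sequence, Tuple


def power_and_fdr_at_threshold(
    adj_pvalues: Sequence[float],
    is_true_positive: Sequence[bool],
    threshold: float = 0.05,
) -> Tuple[float, float]:
    """Return (power, achieved FDR) for calls made at adj. p <= threshold.

    Hypothetical helper for illustration; not part of benchmarkR.
    """
    if len(adj_pvalues) != len(is_true_positive):
        raise ValueError("adj_pvalues and is_true_positive must have equal length")

    # A feature is "called" when its adjusted p-value is at or below the threshold.
    called = [p <= threshold for p in adj_pvalues]

    tp = sum(c and t for c, t in zip(called, is_true_positive))      # true calls
    fp = sum(c and not t for c, t in zip(called, is_true_positive))  # false calls
    n_true = sum(is_true_positive)

    # Power: fraction of truly non-null features that were called.
    power = tp / n_true if n_true else float("nan")
    # Achieved FDR: fraction of calls that are false (0 when nothing is called).
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return power, fdr


# Toy data (made-up numbers): adjusted p-values and the independent truth.
adj_p = [0.001, 0.01, 0.03, 0.04, 0.20, 0.60, 0.02, 0.90]
truth = [True, True, True, False, True, False, False, False]

power, fdr = power_and_fdr_at_threshold(adj_p, truth, threshold=0.05)

# The method is calibrated at this threshold only if the achieved FDR
# does not exceed the nominal level it claims to control.
calibrated = fdr <= 0.05
print(f"power={power:.2f}, achieved FDR={fdr:.2f}, calibrated={calibrated}")
```

In this toy case the method calls five features at its nominal 5% FDR, two of which are false, so the achieved FDR is well above the nominal level. A calibration point on a performance plot is meant to expose this kind of discrepancy.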

Download data

  • Downloaded 480 times
  • Download rankings, all-time:
    • Site-wide: 69,181
    • In bioinformatics: 6,564
  • Year to date:
    • Site-wide: 156,614
  • Since beginning of last month:
    • Site-wide: 134,152
