
Model selection for biological crystallography

By Nathan S. Babcock, Daniel A. Keedy, James S. Fraser, David A. Sivak

Posted 20 Oct 2018
bioRxiv DOI: 10.1101/448795

Structural biologists have fit increasingly complex model types to protein X-ray crystallographic data, motivated by higher-resolving crystals, greater computational power, and a growing appreciation for protein dynamics. A more complex model will generally fit the experimental data better, but it also has greater capacity to overfit experimental noise. While refinement progress is normally monitored for a given model type with a fixed number of parameters, comparatively little attention has been paid to selecting among distinct model types whose number of parameters can vary. Using metrics derived in the statistical field of model comparison, we develop a framework for statistically rigorous inference of model complexity. From analysis of simulated data, we find that the resulting information criteria are less likely than the crystallographic cross-validation criterion Rfree to prefer an erroneously complex model type, and are less sensitive to noise. Moreover, these information criteria suggest caution both in using complex model types and in inferring protein conformational heterogeneity from experimental scattering data.
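
As a rough illustration of the trade-off described in the abstract, the sketch below scores two hypothetical model types with two widely used information criteria, AIC and BIC. The abstract does not say which criteria the paper uses, so the choice of AIC and BIC is an assumption, as are the Gaussian noise model, the function names, and every number in the example.

```python
import math


def gaussian_log_likelihood(rss: float, n: int) -> float:
    """Maximized log-likelihood of a least-squares fit.

    Assumes independent Gaussian errors whose variance is estimated
    from the data as rss / n (an assumption made for this sketch).
    """
    return -0.5 * n * (math.log(2.0 * math.pi * rss / n) + 1.0)


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion; lower is better."""
    return 2.0 * k - 2.0 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion; lower is better."""
    return k * math.log(n) - 2.0 * log_likelihood


# Hypothetical numbers, not taken from the paper: the complex model has
# more parameters and a slightly smaller residual sum of squares.
n_obs = 100
simple = {"k": 3, "rss": 10.0}
complex_ = {"k": 10, "rss": 9.5}

ll_simple = gaussian_log_likelihood(simple["rss"], n_obs)
ll_complex = gaussian_log_likelihood(complex_["rss"], n_obs)

aic_simple = aic(ll_simple, simple["k"])
aic_complex = aic(ll_complex, complex_["k"])
bic_simple = bic(ll_simple, simple["k"], n_obs)
bic_complex = bic(ll_complex, complex_["k"], n_obs)

print(f"log-likelihood: simple={ll_simple:.2f}, complex={ll_complex:.2f}")
print(f"AIC:            simple={aic_simple:.2f}, complex={aic_complex:.2f}")
print(f"BIC:            simple={bic_simple:.2f}, complex={bic_complex:.2f}")
```

With these made-up inputs the complex model attains the higher likelihood, yet both criteria favor the simpler model because the small gain in fit does not offset the penalty for seven extra parameters.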

Download data

  • Downloaded 519 times
  • Download rankings, all-time:
    • Site-wide: 31,871 out of 94,912
    • In biophysics: 1,241 out of 4,144
  • Year to date:
    • Site-wide: 47,299 out of 94,912
  • Since beginning of last month:
    • Site-wide: 68,150 out of 94,912
