Participants in epidemiological and genetic studies are rarely true random samples of the populations they are intended to represent, and both known and unknown factors can influence participation in a study (also known as selection into a study). The circumstances in which selection causes bias in an instrumental variable (IV) analysis are not well understood. We use directed acyclic graphs (DAGs) to depict assumptions about the selection mechanism (i.e., the factors affecting selection into the study), and show how DAGs can be used to determine when a two-stage least squares (2SLS) IV analysis is biased by selection. For a range of selection mechanisms we explain the structure of the selection bias and, via simulations, illustrate the potential bias caused by selection in an IV analysis. We show that selection can result in a biased 2SLS estimate of the causal exposure effect and substantial undercoverage of its confidence interval, and can lead to an incorrect conclusion about the causal exposure effect. We consider whether the bias caused by selection differs according to instrument strength, between a linear and a nonlinear exposure-instrument association, and between a causal and a non-causal exposure effect. In addition, we present the results of a real data example in which nonrandom selection into the study was suspected. We conclude that selection bias can have a major effect on an IV analysis and that statistical methods for estimating causal effects using data from nonrandom samples are needed.
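To make the kind of bias described above concrete, the following is a minimal simulation sketch and not the authors' simulation design. It assumes one particular selection mechanism (the probability of selection depends on the exposure alone), a single continuous instrument, an unmeasured confounder, and a true causal effect of zero. All coefficients, the sample size, and the function names `two_stage_least_squares` and `simulate_selection_bias` are hypothetical choices made for illustration. Because the exposure is a common effect of the instrument and the confounder, restricting to selected participants induces an association between the two, which the 2SLS estimate then picks up.

```python
import numpy as np


def two_stage_least_squares(z, x, y):
    """2SLS with one instrument: regress x on z, then y on the fitted x."""
    Z = np.column_stack([np.ones_like(z), z])
    x_hat = Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    X_hat = np.column_stack([np.ones_like(x_hat), x_hat])
    # Return the slope on the fitted exposure (index 0 is the intercept).
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][1]


def simulate_selection_bias(n=200_000, beta=0.0, seed=0):
    """Compare 2SLS in the full sample and in a nonrandomly selected sample.

    Assumed (hypothetical) data-generating model:
      z: instrument, u: unmeasured confounder,
      x = 0.5*z + u + noise, y = beta*x + u + noise,
      P(selected) = expit(x), i.e. selection depends on the exposure only.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    u = rng.normal(size=n)
    x = 0.5 * z + u + rng.normal(size=n)
    y = beta * x + u + rng.normal(size=n)

    # Selection on the exposure: x is a collider of z and u, so conditioning
    # on selection induces a z-u association among the selected.
    selected = rng.random(n) < 1.0 / (1.0 + np.exp(-x))

    full = two_stage_least_squares(z, x, y)
    restricted = two_stage_least_squares(z[selected], x[selected], y[selected])
    return full, restricted


if __name__ == "__main__":
    full, restricted = simulate_selection_bias()
    print(f"2SLS estimate, full sample:     {full:.3f}")
    print(f"2SLS estimate, selected sample: {restricted:.3f}")
```

Under these assumptions the full-sample estimate should sit close to the true value of zero, while the estimate from the selected sample is pulled away from it. Changing the selection rule (for example, making it depend only on the instrument) is a simple way to explore which mechanisms do and do not produce bias in this toy setting.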