DNA methylation-based sex classifier to predict sex and identify sex chromosome aneuploidy
Olivia A Grant,
Klaus D McDonald-Maier,
Leonard C Schalkwyk
Posted 19 Oct 2020
bioRxiv DOI: 10.1101/2020.10.19.345090
Sex is an important covariate in epigenome-wide association studies because it strongly influences DNA methylation patterns at numerous genomic positions. Nevertheless, many samples on the Gene Expression Omnibus (GEO) lack a sex annotation or are incorrectly labelled. Given this influence, methods for filtering poor samples and checking sex assignment need to be accurate and widely applicable. In this paper, we present a novel method to predict sex using only DNA methylation density signals, which can be readily applied to almost all DNA methylation datasets uploaded to GEO, whatever their format (raw IDATs or text files with only density signals). We identified 4345 significantly (p < 0.01) sex-associated CpG sites present on both the 450K and EPIC arrays, and constructed a sex classifier based on the first two components of PCAs of the two sex chromosomes. The proposed method was constructed using whole blood samples and exhibits good performance across a wide range of tissues. We further demonstrate that our method can be used to identify samples with sex chromosome aneuploidy; this function was validated on five Turner syndrome cases and one Klinefelter syndrome case. The proposed method has been integrated into the wateRmelon Bioconductor package.
Competing Interest Statement: The authors have declared no competing interest.
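To make the idea of classifying samples from principal components of the two sex chromosomes more concrete, here is a minimal Python sketch on simulated toy data. It is not the wateRmelon implementation, and everything in it is an assumption made for illustration: the function names, the signal levels, the use of only the first principal component per chromosome, and the largest-gap threshold. The sketch treats a sample with a high X score and low Y score as XX, the reverse as XY, low on both as resembling 45,X (Turner), and high on both as resembling 47,XXY (Klinefelter).

```python
import numpy as np


def oriented_pc1(values):
    """Score each sample (row) on the first principal component of its CpG columns."""
    centered = values - values.mean(axis=0)
    # Rows of vt are the principal axes of the centered matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    # The sign of a principal component is arbitrary, so flip it if needed
    # to make higher scores correspond to higher average signal.
    if np.corrcoef(scores, values.mean(axis=1))[0, 1] < 0:
        scores = -scores
    return scores


def split_by_largest_gap(scores):
    """Return True for samples above the midpoint of the widest gap in the sorted scores."""
    ordered = np.sort(scores)
    widest = np.diff(ordered).argmax()
    cut = (ordered[widest] + ordered[widest + 1]) / 2
    return scores > cut


def classify(x_values, y_values):
    """Combine the X and Y calls into a toy karyotype label per sample."""
    x_high = split_by_largest_gap(oriented_pc1(x_values))
    y_high = split_by_largest_gap(oriented_pc1(y_values))
    labels = []
    for xh, yh in zip(x_high, y_high):
        if xh and not yh:
            labels.append("XX")
        elif yh and not xh:
            labels.append("XY")
        elif not xh and not yh:
            labels.append("X0-like")
        else:
            labels.append("XXY-like")
    return labels


def simulate(rng, n_cpg=50):
    """Simulate toy X-linked and Y-linked signals; all levels are invented for illustration."""
    # name -> (mean X signal, mean Y signal, number of samples)
    groups = {
        "XX": (0.5, 0.1, 20),
        "XY": (0.1, 0.6, 20),
        "X0-like": (0.1, 0.1, 2),
        "XXY-like": (0.5, 0.6, 1),
    }
    x_rows, y_rows, truth = [], [], []
    for name, (x_level, y_level, count) in groups.items():
        x_rows.append(rng.normal(x_level, 0.03, size=(count, n_cpg)))
        y_rows.append(rng.normal(y_level, 0.03, size=(count, n_cpg)))
        truth += [name] * count
    return np.vstack(x_rows), np.vstack(y_rows), truth


rng = np.random.default_rng(0)
x_values, y_values, truth = simulate(rng)
predicted = classify(x_values, y_values)
```

Real array data would be noisier than this simulation, and a fixed reference trained on known samples would likely be more robust than a per-dataset gap threshold; the sketch only shows why two chromosome-specific scores are enough to separate the four groups.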