Cancers are composed of genetically distinct subpopulations of malignant cells. By sequencing DNA from cancer tissue samples, we can characterize the somatic mutations specific to each population and build clone trees describing the evolutionary ancestry of populations relative to one another. These trees reveal critical points in disease development and inform treatment. Pairtree constructs clone trees using DNA sequencing data from one or more bulk samples of an individual cancer. It uses Bayesian inference to compute posterior distributions over the evolutionary relationships between every pair of identified subpopulations, then uses these distributions in a Markov Chain Monte Carlo algorithm to perform efficient inference of the posterior distribution over clone trees. Pairtree also uses the pairwise relationships to detect mutations that violate the infinite sites assumption. Unlike previous methods, Pairtree can perform clone tree reconstructions using as many as 100 samples per cancer that reveal 30 or more cell subpopulations. On simulated data, Pairtree is the only method whose performance reliably improves when provided with additional bulk samples from a cancer. On 14 B-progenitor acute lymphoblastic leukemias with up to 90 samples from each cancer, Pairtree was the only method that could reproduce or improve upon expert-derived clone tree reconstructions. By scaling to more challenging problems, Pairtree supports new biomedical research applications that can improve our understanding of the natural history of cancer, as well as better illustrate the interplay between cancer, host, and therapeutic interventions. The Pairtree method, along with an interactive visual interface for exploring the clone tree posterior, is available at https://github.com/morrislab/pairtree.
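The pairwise step described above can be illustrated with a minimal sketch. It assumes that a log-likelihood has already been computed from the sequencing data for each candidate relationship between two subpopulations, and it turns those into a posterior with Bayes' rule. The relation labels, the uniform prior, the function name `pairwise_posterior`, and the numeric values are all hypothetical and chosen for illustration; this is not Pairtree's API or its likelihood model.

```python
import math


def pairwise_posterior(log_likelihoods, log_prior=None):
    """Normalize per-relation log-likelihoods into a posterior over relations.

    log_likelihoods: dict mapping a relation label to log P(data | relation).
    log_prior: optional dict of log P(relation); uniform if omitted
               (the uniform prior is an assumption of this sketch).
    """
    if log_prior is None:
        uniform = -math.log(len(log_likelihoods))
        log_prior = {rel: uniform for rel in log_likelihoods}

    # Unnormalized log-posterior: log-likelihood plus log-prior.
    joint = {rel: log_likelihoods[rel] + log_prior[rel] for rel in log_likelihoods}

    # Log-sum-exp normalization, shifted by the maximum for numerical stability.
    peak = max(joint.values())
    log_norm = peak + math.log(sum(math.exp(v - peak) for v in joint.values()))
    return {rel: math.exp(v - log_norm) for rel, v in joint.items()}


# Made-up log-likelihoods for one pair of subpopulations (A, B).
posterior = pairwise_posterior({
    "A_ancestor_of_B": -10.0,
    "B_ancestor_of_A": -14.0,
    "different_branches": -12.0,
})
most_likely = max(posterior, key=posterior.get)
```

In a full method, a table of such posteriors for every pair of subpopulations would then guide the search over candidate trees; that tree-sampling step is not sketched here.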