The arrangement of hypotheses in a hierarchical structure (e.g., phylogenies, cell types) appears in many research fields and indicates different resolutions at which data can be interpreted. A common goal is to find a representative resolution that gives high sensitivity to identify relevant entities (e.g., microbial taxa or cell subpopulations) that are related to a phenotypic outcome (e.g., disease status) while controlling false detections, thereby providing a more compact view of detected entities and summarizing characteristics shared among them. Current methods either perform hypothesis tests at an arbitrary resolution or test hypotheses at all possible resolutions, which leads to nested results; both approaches are suboptimal. Moreover, they are not flexible enough to work in situations where each entity has multiple features to consider and different resolutions might be required for different features. For example, in single-cell RNA-seq data, an increasing focus is to find differential state genes that change expression within a cell subpopulation in response to an external stimulus. Such differential expression might occur at different resolutions (e.g., all cells or a small set of cells) for different genes. Our new algorithm, treeclimbR, is designed to fill this gap by exploiting a hierarchical tree of entities, proposing multiple candidates that capture the latent signal, and pinpointing branches or leaves that contain features of interest in a data-driven way. It outperforms currently available methods on synthetic data, and we highlight the approach on various applications, including microbiome and microRNA surveys as well as single-cell cytometry and RNA-seq datasets. With the emergence of various multi-resolution genomic datasets, treeclimbR provides a thorough inspection of entities across resolutions and gives additional flexibility to uncover biological associations.

### Competing Interest Statement

The authors have declared no competing interest.

- Downloaded 855 times
- Download rankings, all-time:
  - Site-wide: 32,909
  - In bioinformatics: 3,612
- Year to date:
  - Site-wide: 28,713
- Since beginning of last month:
  - Site-wide: 117,343