Unsupervised clustering of missense variants in the HNF1A gene using multidimensional functional data aids clinical interpretation
Laeya A. Najmi,
Amanda J Bennett,
Jana K. Rundle,
Timme van der Lugt,
Michael N Weedon,
Mark I McCarthy,
Pål Rasmus Njølstad,
Anna L. Gloyn
Posted 02 Nov 2019
medRxiv DOI: 10.1101/19010900
Background
Exome sequencing in diabetes presents a diagnostic challenge because, depending on frequency, functional impact, and genomic and environmental context, HNF1A variants can cause Maturity-onset Diabetes of the Young (MODY), increase type 2 diabetes risk, or be benign. A correct diagnosis matters because it informs treatment, progression, and family risk. We describe a multi-dimensional functional dataset of 73 HNF1A missense variants identified in the exomes of 12,940 individuals. Our aim was to develop an analytical framework for stratifying variants along the HNF1A phenotypic continuum to facilitate diagnostic interpretation.
Methods
HNF1A variant function was determined by four different molecular assays. The structure of the multi-dimensional dataset was explored using principal component analysis, k-means, and hierarchical clustering. Weights for tissue-specific isoform expression and functional domain were integrated. Functionally annotated variant subgroups were used to re-evaluate genetic diagnoses in national MODY diagnostic registries.
Findings
HNF1A variants demonstrated a range of behaviours across the assays. The structure of the multi-parametric data was shaped primarily by transactivation. Using unsupervised learning methods, we obtained high-resolution functional clusters of the variants, which separated known causal MODY variants from benign and type 2 diabetes risk variants and led to reclassification of 4% and 9% of HNF1A variants identified in the UK and Norway MODY diagnostic registries, respectively.
Interpretation
Our proof-of-principle analyses facilitated informative stratification of HNF1A variants along the continuum, allowing improved evaluation of clinical significance, management, and precision medicine in diabetes clinics. Transcriptional activity appears to be a superior readout, supporting the pursuit of transactivation-centric experimental designs for high-throughput functional screens.
Funding
Wellcome Trust, National Institute for Health Research (NIHR) Oxford Biomedical Research Centre (BRC), European Research Council, Norwegian Research Council, Stiftelsen Kristian Gerhard Jebsen, Western Norway Regional Health Authority, Novo Nordisk Fonden, Royal Norwegian Diabetes Foundation.
Research in context
Evidence before the study
Molecular characterisation pipelines for studying the function of transcription factors consist primarily of in vitro cellular assays that interrogate transcriptional activity, protein abundance, localisation of the transcription factor to the nucleus, and binding to relevant DNA recognition sequences. The experimental techniques used to explore these mechanisms in vitro vary in robustness and reliability. The literature reports a wide variety of functional consequences for variants in HNF1A, the gene that causes the most common form of Maturity-onset Diabetes of the Young (HNF1A-MODY). The standard approach for analysing multi-tiered functional datasets has been to evaluate each functional parameter independently. Data from functional characterisation of the HNF-1A protein, encoded by the HNF1A gene, indicate that the degree of HNF-1A disruption tends to correlate positively with phenotypic severity: MODY-causing protein-altering variants impair HNF-1A transcriptional activity more severely (≤30% vs. wild-type) than HNF1A variants associated with increased risk of developing type 2 diabetes in population-specific contexts (40-60% vs. wild-type). Rare variants that demonstrated intermediate function (between MODY-causal and wild-type) in transactivation and nuclear localisation assays were shown to be associated with a 6-fold increase in type 2 diabetes predisposition.
Added value of this study
We have developed a proof-of-principle analytical framework for robust and unbiased variant stratification using multi-dimensional functional follow-up data from a large number of exome-identified missense variants in HNF1A. Through our analytical approach we were able to perform a comprehensive assessment of molecular function by utilising data from as many mechanistic dimensions as possible, avoiding arbitrarily determined cut-offs based on one-dimensional functional data. Our method facilitated informative spatial organisation of variants along the HNF1A molecular-phenotypic spectrum and an exploration of the contribution of each in vitro molecular mechanism to meaningful functional, and therefore clinical, stratification. Further, we were able to perform sensitive mapping of variant effects on molecular function to phenotypic outcome using clinical and genetic data from the national MODY diagnostic registries of the UK and Norway. This effort allowed us to annotate functional clusters with clinical knowledge and identify discordant classifications between functional genotype and clinical phenotype.
Implications of all the available evidence
Our novel approach to analysing large functional datasets enables sensitive variant-phenotype mapping and multi-layered variant annotation. It also assists in the prioritisation of functional elements and signatures for Multiplexed Assays of Variant Effects (MAVEs) while they largely remain limited to a single functional readout. Comprehensively annotated HNF1A variant clusters can aid in the interpretation and clinical classification of variants, and can also be used to calibrate supervised variant classification models built with high-throughput-derived experimental data.
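As a rough illustration of the unsupervised clustering step named in the Methods, the sketch below standardises four assay readouts per variant and groups the variants with a plain k-means (Lloyd's algorithm). It is a minimal sketch under stated assumptions, not the authors' pipeline: the variant names, the assay values, the choice of k = 3, and the farthest-point initialisation are all hypothetical, and the principal component analysis, hierarchical clustering, and isoform/domain weighting described in the abstract are omitted.

```python
from statistics import mean, pstdev

# Synthetic, made-up readouts (% of wild-type); NOT data from the study.
# Assumed columns: transactivation, protein abundance,
# nuclear localisation, DNA binding.
VARIANTS = {
    "var_A": [12.0, 70.0, 45.0, 20.0],
    "var_B": [20.0, 75.0, 50.0, 25.0],
    "var_C": [50.0, 90.0, 80.0, 70.0],
    "var_D": [55.0, 92.0, 85.0, 75.0],
    "var_E": [98.0, 100.0, 98.0, 100.0],
    "var_F": [105.0, 104.0, 100.0, 96.0],
}


def standardise(rows):
    """Z-score each assay (column) so no readout dominates by scale alone."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(x - m) / s for x, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest existing centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [mean(col) for col in zip(*members)]
    return labels


names = list(VARIANTS)
labels = kmeans(standardise(list(VARIANTS.values())), k=3)

# Group variant names by assigned cluster label.
clusters = {}
for name, label in zip(names, labels):
    clusters.setdefault(label, []).append(name)

for label, members in sorted(clusters.items()):
    print(label, members)
```

Clustering on all four standardised readouts at once, rather than thresholding a single assay, is the point of the sketch. Cluster labels are arbitrary identifiers and would still need clinical annotation, as the abstract describes.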