Electronic Health Record and Genome-wide Genetic Data in Generation Scotland Participants

By Shona M. Kerr, Archie Campbell, Jonathan Marten, Veronique Vitart, Andrew McIntosh, David J Porteous, Caroline Hayward

Posted 23 Jun 2017
bioRxiv DOI: 10.1101/154609 (published DOI: 10.12688/wellcomeopenres.12600.1)

This paper provides the first detailed demonstration of the research value of the Electronic Health Record (EHR) linked to research data in Generation Scotland: Scottish Family Health Study (GS:SFHS) participants, and describes how to access these data. The structured, coded variables in the routine biochemistry, prescribing and morbidity records in particular represent highly valuable phenotypic data for a genomics research resource. Access to a wealth of other specialized datasets, including cancer, mental health and maternity inpatient information, is also possible through the same straightforward and transparent application process. The EHR-linked dataset is a key component of GS:SFHS, a biobank conceived in 1999 to study the genetics of health areas of current and projected public health importance. Over 24,000 adults were recruited from 2006 to 2011, with broad and enduring written informed consent for biomedical research. Consent was obtained from 23,603 participants for GS:SFHS study data to be linked to their Scottish National Health Service (NHS) records, using their Community Health Index (CHI) number. This identifying number is used for NHS Scotland procedures (registrations, attendances, samples, prescribing and investigations) and allows healthcare records for individuals to be linked across time and location. Here, we describe the NHS EHR dataset on the sub-cohort of 20,032 GS:SFHS participants who have both consent and a mechanism for record linkage, plus extensive genetic data. Together with existing study phenotypes, including family history and environmental exposures such as smoking, the EHR is a rich resource of real-world data that can be used in research to characterise the health trajectory of participants. It is available at low cost and with a high degree of timeliness, and is matched to DNA, urine and serum samples and genome-wide genetic information.

Download data

  • Downloaded 663 times
  • Download rankings, all-time:
    • Site-wide: 37,109
    • In genomics: 3,162
  • Year to date:
    • Site-wide: 80,596
  • Since beginning of last month:
    • Site-wide: 78,891


