Unsupervised Extraction of Epidemic Syndromes from Participatory Influenza Surveillance Self-reported Symptoms
W John Edmunds,
Posted 04 May 2018
bioRxiv DOI: 10.1101/314591 (published DOI: 10.1371/journal.pcbi.1006173)
Seasonal influenza surveillance is usually carried out by sentinel general practitioners, who compile weekly reports based on the number of influenza-like illness (ILI) clinical cases observed among the patients they see. This practice is affected by two main issues: i) reports are usually released with a lag of about one week or more, and ii) the definition of an ILI case based on patients' symptoms varies from one surveillance system to another, i.e. from one country to another. The availability of novel data streams for disease surveillance can alleviate these issues; in this paper, we employed data from Influenzanet, a participatory web-based surveillance project that collects symptoms directly from the general population in real time. We developed an unsupervised probabilistic framework that combines time series analysis of symptom counts with the algorithmic detection of groups of symptoms, hereafter called syndromes. Symptom counts were collected through the participatory web-based surveillance platforms of the Influenzanet consortium, whose data have been found to correlate with ILI incidence as detected by sentinel doctors. Our aim is to suggest how web-based surveillance data can provide an epidemiological signal capable of detecting the temporal trends of ILI without relying on a specific case definition. We evaluated the performance of our framework by showing that the temporal trends of the detected syndromes closely follow the ILI incidence reported by traditional surveillance, and that the syndromes consist of combinations of symptoms compatible with the ILI definition. The proposed framework was able to predict quite accurately the ILI trend of the forthcoming influenza season based only on the information available from previous years. Moreover, we assessed the generalisability of the approach by evaluating its potential for the detection of gastrointestinal syndromes.
We evaluated the approach against traditional surveillance data and, despite the limited amount of data, the gastrointestinal trend was successfully detected. The result is a flexible, real-time surveillance and prediction tool that is not constrained by any disease case definition.
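The abstract does not say which decomposition the framework uses, so the following is only a minimal sketch of the general idea of extracting groups of symptoms from weekly symptom counts. It uses non-negative matrix factorisation with multiplicative updates as a stand-in technique, which is not necessarily the authors' method. The symptom names, the weekly activity profiles, and every function name are illustrative assumptions, and the counts are synthetic.

```python
import random

# Synthetic, made-up data (assumption): two latent syndromes drive the counts.
SYMPTOMS = ["fever", "cough", "sore throat", "nausea", "diarrhoea"]
RESP_PATTERN = [2, 3, 2, 0, 0]             # symptom weights, respiratory-like
GI_PATTERN = [1, 0, 0, 2, 2]               # symptom weights, gastrointestinal-like
RESP_ACTIVITY = [1, 2, 4, 8, 6, 3, 1, 1]   # weekly intensity, 8 weeks
GI_ACTIVITY = [3, 3, 2, 1, 1, 2, 4, 5]

# V[t][s] = count of symptom s reported in week t (weeks x symptoms).
V = [[r * rp + g * gp for rp, gp in zip(RESP_PATTERN, GI_PATTERN)]
     for r, g in zip(RESP_ACTIVITY, GI_ACTIVITY)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def frobenius_error(V, W, H):
    """Squared reconstruction error ||V - W H||^2."""
    WH = matmul(W, H)
    return sum((v - r) ** 2 for rv, rr in zip(V, WH) for v, r in zip(rv, rr))


def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate V (weeks x symptoms) as W H with non-negative factors.

    W is weeks x k (weekly activity of each syndrome) and H is k x symptoms
    (symptom weights of each syndrome). Uses the classic multiplicative
    updates for the squared-error objective.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


W, H = nmf(V, k=2)
for idx, weights in enumerate(H):
    ranked = sorted(zip(SYMPTOMS, weights), key=lambda p: -p[1])
    print(f"syndrome {idx}:", [(name, round(w, 2)) for name, w in ranked])
```

In this sketch each row of `H` plays the role of a syndrome (a weighted combination of symptoms), and the matching column of `W` is its weekly time series, which could then be compared with ILI incidence from sentinel surveillance. The number of components `k` is fixed by hand here; how the actual framework selects or interprets components is not stated in the abstract.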
- Downloaded 579 times
- Download rankings, all-time:
  - Site-wide: 56,538
  - In epidemiology: 2,631
- Year to date:
  - Site-wide: 138,904
- Since beginning of last month:
  - Site-wide: 123,684