Rxivist API documentation
The Rxivist API offers free, programmatic access to all of our preprint metadata in a JSON interface. It's open to all—no keys or authentication here, at least for now. We do ask that you go easy on the requests, as this is a small project with limited funding for server infrastructure.
If you are looking for data to use offline somewhere, there's also no need to send 200,000 API requests to get all of it: We generate weekly database dumps that contain all Rxivist information, and you're welcome to download them. Not only is that easier for our servers to handle, but it may be much easier for you to process. The PostgreSQL dumps are available for direct download.
If you are planning to use the Rxivist data within a web application, it would be much appreciated if you link to Rxivist on any page that displays a non-trivial amount of data pulled from this API. Also, let us know what you're up to! We'd love to find out how this data is being used.
While we're talking about third-party web applications, we should note here that we can't guarantee this API will be around forever. We plan to keep it running (and updated!) for the foreseeable future, but if you're going to build something using the Rxivist API that requires a strong uptime commitment or a concrete promise of long-term functionality, consider deploying your own version of the software. We'll provide as much guidance as we can.
Etiquette
We want to provide a free API for all, and we don't want to unnecessarily burden developers (or ourselves) with cumbersome API tokens or registration processes. For that to work, we ask that you be polite and try not to do anything that will take the API down or otherwise make it unusable for others. Specifically, we encourage the following polite behaviour:
- Cache data so you don't request the same information over and over again.
- Minimize the number of parallel requests being made. If you start noticing increased response times or start getting timeout errors, consider adding pauses between requests.
- Specify a User-Agent header that properly identifies your script or tool and that provides a means of contacting you via email using "mailto:". This way we can contact you if we see a problem. For example:
  GroovyBib/1.1 (https://example.org/GroovyBib/; mailto:GroovyBib@example.org) BasedOnFunkyLib/1.4
- Report problems and/or ask questions on our issue tracker.
New, December 2020: Requests to the Rxivist website and API that do not specify a User-Agent may be rejected.
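The guidelines above could be followed with something like the sketch below. It uses only Python's standard library; the class name PoliteClient, the tool name and contact address in USER_AGENT, and the one-second default pause are illustrative assumptions, not part of the Rxivist API.

```python
import json
import time
import urllib.request

# Hypothetical identity: replace with your own tool name and contact address.
USER_AGENT = "ExampleTool/0.1 (https://example.org/tool; mailto:you@example.org)"


class PoliteClient:
    """Caches responses in memory and pauses between uncached requests."""

    def __init__(self, pause_seconds=1.0, fetch=None):
        self.pause_seconds = pause_seconds
        self._cache = {}
        self._last_request = None
        # `fetch` can be swapped out so the class can be tried without network access.
        self._fetch = fetch or self._http_get

    def _http_get(self, url):
        # Send the identifying User-Agent header with every request.
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(request, timeout=30) as response:
            return json.loads(response.read().decode("utf-8"))

    def get(self, url):
        # Serve repeated requests from the cache instead of hitting the API again.
        if url in self._cache:
            return self._cache[url]
        # Requests are made one at a time, with a pause between them.
        if self._last_request is not None:
            wait = self.pause_seconds - (time.monotonic() - self._last_request)
            if wait > 0:
                time.sleep(wait)
        data = self._fetch(url)
        self._last_request = time.monotonic()
        self._cache[url] = data
        return data
```

A script would create one PoliteClient and call its get method with any of the endpoint URLs listed in this document. Increasing pause_seconds is a simple response to slow replies or timeouts.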
Alas, not all people are polite, and for this reason we reserve the right to impose rate limits and/or to block clients that are disrupting the public service.
"Etiquette" section based on the Crossref API documentation, available via Creative Commons license.
How to cite Rxivist
If you use Rxivist data in your research, please cite our paper, which is now available at eLife:
Table of contents
- Preprints: Search
https://api.rxivist.org/v1/papers
- Preprints: Details
https://api.rxivist.org/v1/papers/<id>
- Preprints: Download data
https://api.rxivist.org/v1/downloads/<id>
- Authors: Rankings
https://api.rxivist.org/v1/authors
- Authors: Details
https://api.rxivist.org/v1/authors/<id>
- API details: Category list
https://api.rxivist.org/v1/data/categories
- API details: Total entities
https://api.rxivist.org/v1/data/stats
- API details: Site-wide metric distributions
https://api.rxivist.org/v1/data/distributions/<entity>/<metric>
Preprints
Endpoint: Search
Retrieve a list of papers matching the given criteria.
https://api.rxivist.org/v1/papers
Arguments
- q – A search string to filter results based on their titles, abstracts and authors. Default: (empty)
- metric – Which field to use when sorting results. Default: twitter
  - Acceptable values: downloads, twitter
- timeframe – How far back to look for the cumulative results of the chosen metric. ("ytd" and "lastmonth" are only available for the "downloads" metric.) Default: "day" for Twitter metrics, "alltime" for downloads.
  - Acceptable values: alltime, ytd, lastmonth, day, week, month, year
- category_filter – An array of categories to which the results should be limited. Default: []
  - Acceptable values: addiction-medicine, allergy-and-immunology, anesthesia, animal-behavior-and-cognition, biochemistry, bioengineering, bioinformatics, biophysics, cancer-biology, cardiovascular-medicine, cell-biology, clinical-trials, dentistry-and-oral-medicine, dermatology, developmental-biology, ecology, emergency-medicine, endocrinology, epidemiology, evolutionary-biology, forensic-medicine, gastroenterology, genetic-and-genomic-medicine, genetics, genomics, geriatric-medicine, health-economics, health-informatics, health-policy, health-systems-and-quality-improvement, hematology, hiv-aids, immunology, infectious-diseases, intensive-care-and-critical-care-medicine, medical-education, medical-ethics, microbiology, molecular-biology, nephrology, neurology, neuroscience, nursing, nutrition, obstetrics-and-gynecology, occupational-and-environmental-health, oncology, ophthalmology, orthopedics, otolaryngology, pain-medicine, paleontology, palliative-medicine, pathology, pediatrics, pharmacology-and-therapeutics, pharmacology-and-toxicology, physiology, plant-biology, primary-care-research, psychiatry-and-clinical-psychology, public-and-global-health, radiology-and-imaging, rehabilitation-medicine-and-physical-therapy, respiratory-medicine, rheumatology, scientific-communication-and-education, sexual-and-reproductive-health, sports-medicine, surgery, synthetic-biology, systems-biology, toxicology, transplantation, urology, zoology
- page – Number of the page of results to retrieve. Shorthand for an offset based on the specified page_size. Default: 0
- page_size – How many results to return at one time. Default: 20
- repo – Which preprint repositories should be included in the request. Default: biorxiv
  - Acceptable values: biorxiv, medrxiv, all
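As a rough illustration of how these arguments combine into a request, the sketch below assembles a search URL with Python's standard library. The function name build_search_url is hypothetical. category_filter is left out on purpose, because this document does not say how an array should be encoded in the query string.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.rxivist.org/v1/papers"


def build_search_url(q=None, metric=None, page_size=None,
                     timeframe=None, page=None, repo=None):
    """Return a search URL containing only the arguments that were supplied.

    Anything left as None is omitted so the API falls back to its own default.
    """
    candidates = [
        ("q", q),
        ("metric", metric),
        ("page_size", page_size),
        ("timeframe", timeframe),
        ("page", page),
        ("repo", repo),
    ]
    params = [(name, value) for name, value in candidates if value is not None]
    if not params:
        return BASE_URL
    # urlencode escapes spaces and other special characters in the search string.
    return BASE_URL + "?" + urlencode(params)


url = build_search_url(metric="downloads", page_size=3, timeframe="alltime")
print(url)
```

Leaving an argument out of the URL, rather than sending an empty value, keeps requests short and lets the documented defaults apply.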
Example
Top 3 downloaded papers, all time
Using the "downloads" metric, get 3 papers ordered by their overall download count.
https://api.rxivist.org/v1/papers?metric=downloads&page_size=3&timeframe=alltime
Response
{
  "query": {
    "text_search": "",
    "timeframe": "alltime",
    "categories": [],
    "metric": "downloads",
    "page_size": 3,
    "current_page": 0,
    "final_page": 11138,
    "total_results": 33416,
    "repository": "biorxiv"
  },
  "results": [
    {
      "id": 12345,
      "metric": 166288,
      "title": "Example Paper Here: A compelling placeholder",
      "url": "https://api.rxivist.org/v1/papers/12345",
      "biorxiv_url": "https://www.biorxiv.org/content/early/2018/fake_url",
      "doi": "10.1101/00000",
      "category": "cancer-biology",
      "first_posted": "19-09-18",
      "abstract": "This is where the abstract would go.",
      "repo": "biorxiv",
      "authors": [
        { "id": 1, "name": "Richard Abdill" },
        { "id": 24802, "name": "Another Person" }
      ]
    },
    # (More responses go here...)
  ]
}
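The paging fields in the "query" object of a search response can be reproduced with a little arithmetic. The sketch below assumes pages are numbered from 0 (as the default page value and "current_page": 0 suggest) and that the offset is simply page times page_size; both function names are hypothetical.

```python
def final_page_index(total_results, page_size):
    """Index of the last page when pages are numbered from 0."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    if total_results <= 0:
        return 0
    # Ceiling division gives the number of pages; subtract 1 for a 0-based index.
    return (total_results + page_size - 1) // page_size - 1


def page_offset(page, page_size):
    """Number of results skipped before the given page (assumed page * page_size)."""
    return page * page_size


# Values taken from the example response: 33416 results, 3 per page.
print(final_page_index(33416, 3))
```

With the example's numbers this gives 11138, matching the "final_page" field, so a loop over pages 0 through final_page would cover every result. For bulk retrieval, the weekly database dumps are the better option.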
Endpoint: Details
Retrieve data about a single paper and all of its authors. Note: Unlike the author rankings, paper rankings do NOT incorporate the concept of ties.
https://api.rxivist.org/v1/papers/<id>
Arguments
- id – Rxivist paper ID associated with the paper you want. Default: True
Example
Paper detail request
https://api.rxivist.org/v1/papers/25770
Response
{
  "id": 25770,
  "doi": "10.1101/096727",
  "biorxiv_url": "https://www.biorxiv.org/content/early/2016/12/29/096727",
  "repo": "biorxiv",
  "url": "https://api.rxivist.org/v1/papers/25770",
  "title": "Parallel adaptation to higher temperatures in divergent clades of the nematode Pristionchus pacificus",
  "category": "evolutionary-biology",
  "first_posted": "2016-12-29",
  "abstract": "Studying the effect of temperature on fertility is particularly important in the light of ongoing climate change. We need to know if organisms can adapt to higher temperatures and, if so, what are the evolutionary mechanisms behind such adaptation. Such studies have been hampered by the lack different populations of sufficient sizes with which to relate the phenotype of temperature tolerance to the underlying genotypes. Here, we examined temperature adaptation in populations of the nematode Pristionchus pacificus, in which individual strains are able to successfully reproduce at 30°C. Analysis of the frequency of heat tolerant strains in different temperature zones on La Reunion supports that this trait is subject to natural selection. Reconstruction of ancestral states along the phylogeny of highly differentiated P. pacificus clades suggests that heat tolerance evolved multiple times independently. This is further supported by genome wide association studies showing that heat tolerance is a polygenic trait and that different loci are used by individual P. pacificus clades to develop heat tolerance. More precisely, analysis of allele frequencies indicated that most genetic markers that are associated with heat tolerance are only polymorphic in individual clades. While in some P. pacificus clades, parallel evolution of heat tolerance can be explained by ancestral polymorphism or by gene flow across clades, we observe at least one clearly distinct and independent scenario where heat tolerance emerged by de novo mutation. Thus, temperature tolerance evolved at least two times independently in the evolutionary history of this species. Our data suggest that studies of wild populations of P. pacificus will reveal distinct cellular mechanisms driving temperature adaptation.",
  "authors": [
    { "id": 1221, "name": "Mark Leaver", "institution": "Max Planck Institute of Molecular Cell Biology and Genetics;", "orcid": "http://orcid.org/0000-0003-2796-4312" },
    { "id": 1222, "name": "Merve Kayhan", "institution": "Bilkent University;", "orcid": null },
    { "id": 1223, "name": "Angela McGaughran", "institution": "Australian National University;", "orcid": null },
    { "id": 1224, "name": "Christian Roedelsperger", "institution": "Max Planck Institute for Developmental Biology;", "orcid": null },
    { "id": 1225, "name": "Anthony A Hyman", "institution": "Max Planck Institute of Molecular Cell Biology and genetics", "orcid": "http://orcid.org/0000-0003-3664-154X" },
    { "id": 1226, "name": "Ralf Sommer", "institution": "Max Planck Institute for Developmental Biology;", "orcid": "http://orcid.org/0000-0003-1503-7749" }
  ],
  "ranks": {
    "alltime": { "rank": 15658, "tie": false, "downloads": 290 },
    "ytd": { "rank": 22951, "tie": false, "downloads": 68 },
    "lastmonth": { "rank": 28283, "tie": 4, "downloads": 65 },
    "category": { "rank": 1500, "tie": 4, "downloads": 290 }
  },
  "publication": {
    "journal": "Journal Name Here",
    "doi": "10.1038/1234567"
  }
}
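Once a details response has been parsed into a Python dict, the "ranks" object can be flattened for display. This is a small sketch with a hypothetical helper name, summarize_ranks; it relies only on the "rank" and "downloads" keys shown in the example response and ignores "tie".

```python
def summarize_ranks(paper):
    """Return (name, rank, downloads) tuples, best rank first."""
    rows = [
        (name, entry["rank"], entry["downloads"])
        for name, entry in paper.get("ranks", {}).items()
    ]
    return sorted(rows, key=lambda row: row[1])


# A trimmed copy of the example response above.
example = {
    "id": 25770,
    "ranks": {
        "alltime": {"rank": 15658, "tie": False, "downloads": 290},
        "ytd": {"rank": 22951, "tie": False, "downloads": 68},
        "lastmonth": {"rank": 28283, "tie": 4, "downloads": 65},
        "category": {"rank": 1500, "tie": 4, "downloads": 290},
    },
}

for name, rank, downloads in summarize_ranks(example):
    print(f"{name}: rank {rank} ({downloads} downloads)")
```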
Endpoint: Download data
Retrieve monthly download statistics for a single paper.
https://api.rxivist.org/v1/downloads/<id>
Arguments
- id – Rxivist paper ID associated with the download data you want. Default: True
Example
Paper download data request
https://api.rxivist.org/v1/downloads/12345
Response
{
  "query": {
    "id": 12345
  },
  "results": [
    { "month": 6, "year": 2018, "downloads": 205, "views": 259 },
    { "month": 7, "year": 2018, "downloads": 153, "views": 199 },
    { "month": 8, "year": 2018, "downloads": 88, "views": 98 },
    { "month": 9, "year": 2018, "downloads": 118, "views": 159 },
    { "month": 10, "year": 2018, "downloads": 10, "views": 18 }
  ]
}
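The monthly entries lend themselves to simple aggregation. The sketch below (function names are hypothetical) sorts the entries chronologically and sums downloads and views, using the example response as input. It makes no assumption about the order in which the API returns months.

```python
def monthly_series(response):
    """Return (YYYY-MM, downloads, views) tuples in chronological order."""
    ordered = sorted(response["results"], key=lambda r: (r["year"], r["month"]))
    return [
        (f"{r['year']}-{r['month']:02d}", r["downloads"], r["views"])
        for r in ordered
    ]


def totals(response):
    """Return (total downloads, total views) across all listed months."""
    downloads = sum(r["downloads"] for r in response["results"])
    views = sum(r["views"] for r in response["results"])
    return downloads, views


# The example response above, as a Python dict.
example = {
    "query": {"id": 12345},
    "results": [
        {"month": 6, "year": 2018, "downloads": 205, "views": 259},
        {"month": 7, "year": 2018, "downloads": 153, "views": 199},
        {"month": 8, "year": 2018, "downloads": 88, "views": 98},
        {"month": 9, "year": 2018, "downloads": 118, "views": 159},
        {"month": 10, "year": 2018, "downloads": 10, "views": 18},
    ],
}

print(totals(example))  # (574, 733)
```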
Authors
Endpoint: Rankings
The top 200 authors for all-time downloads in a category.
https://api.rxivist.org/v1/authors
Arguments
- category – The category to which results should be limited. Omitting one returns results for the entire site. Default: False
  - Acceptable values: addiction-medicine, allergy-and-immunology, anesthesia, animal-behavior-and-cognition, biochemistry, bioengineering, bioinformatics, biophysics, cancer-biology, cardiovascular-medicine, cell-biology, clinical-trials, dentistry-and-oral-medicine, dermatology, developmental-biology, ecology, emergency-medicine, endocrinology, epidemiology, evolutionary-biology, forensic-medicine, gastroenterology, genetic-and-genomic-medicine, genetics, genomics, geriatric-medicine, health-economics, health-informatics, health-policy, health-systems-and-quality-improvement, hematology, hiv-aids, immunology, infectious-diseases, intensive-care-and-critical-care-medicine, medical-education, medical-ethics, microbiology, molecular-biology, nephrology, neurology, neuroscience, nursing, nutrition, obstetrics-and-gynecology, occupational-and-environmental-health, oncology, ophthalmology, orthopedics, otolaryngology, pain-medicine, paleontology, palliative-medicine, pathology, pediatrics, pharmacology-and-therapeutics, pharmacology-and-toxicology, physiology, plant-biology, primary-care-research, psychiatry-and-clinical-psychology, public-and-global-health, radiology-and-imaging, rehabilitation-medicine-and-physical-therapy, respiratory-medicine, rheumatology, scientific-communication-and-education, sexual-and-reproductive-health, sports-medicine, surgery, synthetic-biology, systems-biology, toxicology, transplantation, urology, zoology
Example
Author rankings request, limited to biophysics
https://api.rxivist.org/v1/authors?category=biophysics
Response
{
  "results": [
    { "id": 80168, "name": "Claudia Cattoglio", "rank": 1, "downloads": 2504, "tie": true },
    { "id": 47439, "name": "Xavier Darzacq", "rank": 1, "downloads": 2504, "tie": true },
    { "id": 47441, "name": "Robert Tjian", "rank": 1, "downloads": 2504, "tie": true },
    { "id": 19704, "name": "Patrick Cramer", "rank": 4, "downloads": 2389, "tie": false },
    { "id": 80823, "name": "Dimitry Tegunov", "rank": 5, "downloads": 2388, "tie": true },
    # ...and so on for 200 entries
  ]
}
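The ranks in the example (1, 1, 1, 4, 5) are consistent with standard competition ranking, where tied authors share a rank and the following rank is skipped ahead by the size of the tie. The sketch below reproduces that numbering from download counts. It is an illustration that matches the example, not a description of how Rxivist computes ranks internally, and the function name is hypothetical.

```python
def competition_ranks(downloads):
    """Pair each download count with its rank, highest count first.

    Equal counts share a rank; the next distinct count gets its
    1-based position in the sorted list.
    """
    ordered = sorted(downloads, reverse=True)
    ranks = []
    for position, value in enumerate(ordered):
        if position > 0 and value == ordered[position - 1]:
            ranks.append(ranks[-1])      # same count as previous: share its rank
        else:
            ranks.append(position + 1)   # new count: rank is its 1-based position
    return list(zip(ordered, ranks))


# Download counts from the example response.
print(competition_ranks([2504, 2504, 2504, 2389, 2388]))
```

In the example the fifth author is flagged "tie": true, which presumably means the next entry (not shown) has the same download count.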
Endpoint: Details
Retrieve data about a single author.
https://api.rxivist.org/v1/authors/<id>
Arguments
- id – Rxivist author ID associated with the author in question. Default: True
Example
Author detail request
https://api.rxivist.org/v1/authors/1222
Response
{
  "id": 1222,
  "name": "Merve Kayhan",
  "institution": "Bilkent University;",
  "orcid": null,
  "articles": [
    {
      "id": 25770,
      "doi": "10.1101/096727",
      "biorxiv_url": "https://www.biorxiv.org/content/early/2016/12/29/096727",
      "url": "https://api.rxivist.org/v1/papers/25770",
      "title": "Parallel adaptation to higher temperatures in divergent clades of the nematode Pristionchus pacificus",
      "category": "evolutionary-biology",
      "ranks": {
        "alltime": { "rank": 15658, "tie": false, "downloads": 290 },
        "ytd": { "rank": 22951, "tie": false, "downloads": 65 },
        "lastmonth": { "rank": 28283, "tie": 4, "downloads": 65 },
        "category": { "rank": 28283, "tie": 4, "downloads": 0 }
      }
    }
  ],
  "ranks": [
    { "downloads": 134075, "rank": 1, "tie": false, "category": "alltime" },
    { "downloads": 3126, "rank": 1120, "tie": true, "category": "bioinformatics" },
    { "downloads": 130949, "rank": 1, "tie": false, "category": "neuroscience" }
  ]
}
API details
Endpoint: Category list
A list of all bioRxiv "collections," or categories, currently available via Rxivist.
https://api.rxivist.org/v1/data/categories
Example
https://api.rxivist.org/v1/data/categories
Response
{
  "results": [
    "animal-behavior-and-cognition",
    "biochemistry",
    "bioengineering",
    "bioinformatics",
    "biophysics",
    "cancer-biology",
    "cell-biology",
    "clinical-trials",
    "developmental-biology",
    "ecology",
    "epidemiology",
    "evolutionary-biology",
    "genetics",
    "genomics",
    "immunology",
    "microbiology",
    "molecular-biology",
    "neuroscience",
    "paleontology",
    "pathology",
    "pharmacology-and-toxicology",
    "physiology",
    "plant-biology",
    "scientific-communication-and-education",
    "synthetic-biology",
    "systems-biology",
    "zoology"
  ]
}
Endpoint: Total entities
Basic information about how many papers and authors are indexed by Rxivist.
https://api.rxivist.org/v1/data/stats
Example
https://api.rxivist.org/v1/data/stats
Response
{
  "papers_indexed": 34409,
  "authors_indexed": 145540,
  "missing_abstract": 0,
  "missing_date": 3539,
  "outdated_count": {
    "biophysics": 66,
    "cell-biology": 134,
    "developmental-biology": 356,
    "ecology": 3,
    "epidemiology": 1,
    "evolutionary-biology": 87,
    "genetics": 1250,
    "genomics": 292,
    "immunology": 93,
    "microbiology": 407,
    "molecular-biology": 212,
    "neuroscience": 391,
    "pharmacology-and-toxicology": 57,
    "physiology": 56,
    "plant-biology": 68,
    "scientific-communication-and-education": 51
  }
}
Endpoint: Site-wide metric distributions
Histogram-style binned data summarizing how many papers or authors have a given metric within the range of each bin. For example, the paper downloads distribution might say that there are 15 papers with between 0 and 9 downloads, 35 papers with between 10 and 19 downloads, 70 papers with between 20 and 29 downloads, and so on. The response also provides averages for the specified metric.
https://api.rxivist.org/v1/data/distributions/<entity>/<metric>
Arguments
- entity – Which object should be used to group the metric totals. Default: papers
  - Acceptable values: paper, author
- metric – Which metric to evaluate. Not currently available for Twitter data. Default: downloads
  - Acceptable values: downloads
Example
Paper download distribution
https://api.rxivist.org/v1/data/distributions/paper/downloads
Response
{
  "averages": {
    "mean": 480,
    "median": 253
  },
  "histogram": [
    { "bucket_min": 0, "count": 4 },
    { "bucket_min": 2, "count": 21 },
    { "bucket_min": 4, "count": 92 },
    { "bucket_min": 8, "count": 128 },
    { "bucket_min": 16, "count": 272 },
    { "bucket_min": 32, "count": 457 }
  ]
}
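Each histogram entry gives only the lower edge of its bin. One way to recover full ranges is to treat the next entry's "bucket_min" as the exclusive upper edge, as sketched below. That interpretation is an assumption based on the example response, and the function name bucket_ranges is hypothetical. Note that the bins in this example double in width rather than being evenly spaced.

```python
def bucket_ranges(histogram):
    """Return (lower, upper_exclusive, count) tuples, sorted by lower edge.

    The upper edge of the last bucket is unknown, so it is reported as None.
    """
    ordered = sorted(histogram, key=lambda b: b["bucket_min"])
    ranges = []
    for index, bucket in enumerate(ordered):
        if index + 1 < len(ordered):
            upper = ordered[index + 1]["bucket_min"]
        else:
            upper = None
        ranges.append((bucket["bucket_min"], upper, bucket["count"]))
    return ranges


# The "histogram" list from the example response.
example = [
    {"bucket_min": 0, "count": 4},
    {"bucket_min": 2, "count": 21},
    {"bucket_min": 4, "count": 92},
    {"bucket_min": 8, "count": 128},
    {"bucket_min": 16, "count": 272},
    {"bucket_min": 32, "count": 457},
]

for lower, upper, count in bucket_ranges(example):
    label = f"{lower}+" if upper is None else f"{lower} to {upper - 1}"
    print(f"{label}: {count}")
```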