Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 63,068 bioRxiv papers from 279,747 authors.

Large-Scale Uniform Analysis of Cancer Whole Genomes in Multiple Computing Environments

By Christina K. Yung, Brian D. O’Connor, Sergei Yakneen, Junjun Zhang, Kyle Ellrott, Kortine Kleinheinz, Naoki Miyoshi, Keiran M. Raine, Romina Royo, Gordon B. Saksena, Matthias Schlesner, Solomon I. Shorser, Miguel Vazquez, Joachim Weischenfeldt, Denis Yuen, Adam P Butler, Brandi N. Davis-Dusenbery, Roland Eils, Vincent Ferretti, Robert L Grossman, Olivier Harismendy, Youngwook Kim, Hidewaki Nakagawa, Steven J. Newhouse, David Torrents, Lincoln D. Stein, on behalf of the PCAWG Technical Working Group, Javier Bartolomé Rodriguez, Keith A Boroevich, Rich Boyce, Angela N Brooks, Alex Buchanan, Ivo Buchhalter, Niall J. Byrne, Andy Cafferkey, Peter J. Campbell, Zhaohong Chen, Sunghoon Cho, Wan Choi, Peter Clapham, Francisco M. De La Vega, Jonas Demeulemeester, Michelle T. Dow, Lewis J. Dursi, Juergen Eils, Claudiu Farcas, Francesco Favero, Nodirjon Fayzullaev, Paul Flicek, Nuno A Fonseca, Josep Ll. Gelpi, Gad Getz, Bob Gibson, Michael C. Heinold, Julian M Hess, Oliver Hofmann, Jongwhi H. Hong, Thomas J. Hudson, Daniel Huebschmann, Barbara Hutter, Carolyn M. Hutter, Seiya Imoto, Sinisa Ivkovic, Seung-Hyup Jeon, Wei Jiao, Jongsun Jung, Rolf Kabbe, Andre Kahles, Jules Kerssemakers, Hyunghwan Kim, Hyung-Lae Kim, Jihoon Kim, Jan O Korbel, Michael Koscher, Antonios Koures, Milena Kovacevic, Chris Lawerenz, Ignaty Leshchiner, Dimitri G. Livitz, George L. Mihaiescu, Sanja Mijalkovic, Ana Mijalkovic Lazic, Satoru Miyano, Hardeep K. Nahal, Mia Nastic, Jonathan Nicholson, David Ocana, Kazuhiro Ohi, Lucila Ohno-Machado, Larsson Omberg, B.F. Francis Ouellette, Nagarajan Paramasivam, Marc D Perry, Todd D. Pihl, Manuel Prinz, Montserrat Puiggròs, Petar Radovic, Esther Rheinbay, Mara W. Rosenberg, Charles Short, Heidi J. Sofia, Jonathan Spring, Adam J Struck, Grace Tiao, Nebojsa Tijanic, Peter Van Loo, David Vicente, Jeremiah A. Wala, Zhining Wang, Johannes Werner, Ashley Williams, Youngchoon Woo, Adam J. Wright, Qian Xiang, the PCAWG Network

Posted 10 Jul 2017
bioRxiv DOI: 10.1101/161638

The Pan-Cancer Analysis of Whole Genomes (PCAWG) project of the International Cancer Genome Consortium (ICGC) aimed to categorize somatic and germline variations in both coding and non-coding regions in over 2,800 cancer patients. To provide this dataset to the research working groups for downstream analysis, the PCAWG Technical Working Group marshalled ~800 TB of sequencing data from geographically distributed locations; developed portable software for uniform alignment, variant calling, artifact filtering and variant merging; performed the analysis in a geographically and technologically disparate collection of compute environments; and disseminated high-quality validated consensus variants to the working groups. The PCAWG dataset has been mirrored to multiple repositories and can be located using the ICGC Data Portal. The PCAWG workflows are also available as Docker images through Dockstore, enabling researchers to replicate our analysis on their own data.

Download data

  • Downloaded 1,874 times
  • Download rankings, all-time:
    • Site-wide: 2,505 out of 63,068
    • In genomics: 518 out of 4,328
  • Year to date:
    • Site-wide: 6,222 out of 63,068
  • Since beginning of last month:
    • Site-wide: 9,870 out of 63,068


