Rxivist logo

Rxivist combines preprints from bioRxiv with data from Twitter to help you find the papers being discussed in your field. Currently indexing 63,081 bioRxiv papers from 279,818 authors.

Pan-cancer classifications of tumor histological images using deep learning

By Javad Noorbakhsh, Saman Farahmand, Mohammad Soltanieh-ha, Sandeep Namburi, Kourosh Zarringhalam, Jeff Chuang

Posted 26 Jul 2019
bioRxiv DOI: 10.1101/715656

Histopathological images are essential for the diagnosis of cancer type and selection of optimal treatment. However, the current clinical process of manual inspection of images is time consuming and prone to intra- and inter-observer variability. Here we show that key aspects of cancer image analysis can be performed by deep convolutional neural networks (CNNs) across a wide spectrum of cancer types. In particular, we implement CNN architectures based on Google Inception v3 transfer learning to analyze 27815 H&E slides from 23 cohorts in The Cancer Genome Atlas in studies of tumor/normal status, cancer subtype, and mutation status. For 19 solid cancer types we are able to classify tumor/normal status of whole slide images with extremely high AUCs (0.995±0.008). We are also able to classify cancer subtypes within 10 tissue types with AUC values well above random expectations (micro-average 0.87±0.1). We then perform a cross-classification analysis of tumor/normal status across tumor types. We find that classifiers trained on one type are often effective in distinguishing tumor from normal in other cancer types, with the relationships among classifiers matching known cancer tissue relationships. For the more challenging problem of mutational status, we are able to classify TP53 mutations in three cancer types with AUCs from 0.65-0.80 using a fully-trained CNN, and with similar cross-classification accuracy across tissues. These studies demonstrate the power of CNNs for not only classifying histopathological images in diverse cancer types, but also for revealing shared biology between tumors. We have made software available at: https://github.com/javadnoorb/HistCNN

Download data

  • Downloaded 1,994 times
  • Download rankings, all-time:
    • Site-wide: 2,265 out of 63,081
    • In bioinformatics: 465 out of 6,269
  • Year to date:
    • Site-wide: 405 out of 63,081
  • Since beginning of last month:
    • Site-wide: 25 out of 63,081

Altmetric data


Downloads over time

Distribution of downloads per paper, site-wide


Sign up for the Rxivist weekly newsletter! (Click here for more details.)


News