Multiplatform Analysis of 12 Cancer Types Reveals Molecular Classification within and across Tissues of Origin

Cell. Volume 158: 929-944, 14 August 2014. doi:10.1016/j.cell.2014.06.049

Recent genomic analyses of pathologically defined tumor types identify “within-a-tissue” disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head and neck, and a subset of bladder cancers coalesced into one subtype typified by TP53 alterations, TP63 amplifications, and high expression of immune and proliferation pathway genes. Of note, bladder cancers split into three pan-cancer subtypes. The multiplatform classification, while correlated with tissue-of-origin, provides independent information for predicting clinical outcomes. All data sets are available for data-mining from a unified resource to support further biological discoveries and insights into novel therapeutic strategies.

Data in the GDC

Supplemental Data

The following are input data and output data (cluster assignments) for 6 platforms:

Platform                           | Input Data               | Subtype Assignments
mRNAseq                            | rnaseq_input.txt         | rnaseq_output.txt
miRNAseq                           | mirna_input.csv          | mirna_output.txt
Somatic Copy Number (SCNA)         | SCNA_input.txt           | SCNA_output.txt
Methylation                        | DNAmethylation_input.csv | DNAmethylation_output.csv
Reverse Phase Protein Array (RPPA) | RPPA_input.csv           | RPPA_output.csv
Mutated Pathways                   | mutatedpath_input.tsv    | mutatedpath_output.tsv

All supplemental data (including tables, figures, and data files) for this publication are available in the Sage Bionetworks Synapse repository (doi:10.7303/syn2468297).

Tools for Exploring Data and Analyses

Associated Data Files

The analyses were derived from a data freeze for 12 TCGA cancer types made on December 22, 2012. The sample lists for the data freeze are available in the Sage Bionetworks Synapse repository (doi:10.7303/syn300013).

Instructions for Data Download

Open Access Data

  1. Download the appropriate manifest file from the publication page
  2. Use the manifest file to download data using the GDC Data Transfer Tool (DTT) or the GDC API
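The two open-access steps can be sketched in Python as follows. This is a minimal illustration, not an official GDC script. It assumes the manifest is a tab-delimited file with a header row containing an `id` column, and that the Data Transfer Tool is installed as `gdc-client` on the PATH. The function names and the sample manifest rows are hypothetical.

```python
import csv
import io


def read_manifest_ids(manifest_text):
    """Return the file IDs listed in a manifest.

    Assumption: the manifest is tab-delimited and its header row
    includes an "id" column.
    """
    reader = csv.DictReader(io.StringIO(manifest_text), delimiter="\t")
    return [row["id"] for row in reader if row.get("id")]


def build_dtt_command(manifest_path, download_dir=None):
    """Build the argument list for a DTT manifest download.

    -m names the manifest file; -d (optional) names the target directory.
    """
    cmd = ["gdc-client", "download", "-m", manifest_path]
    if download_dir:
        cmd += ["-d", download_dir]
    return cmd


# Made-up manifest content, used only to exercise the parser.
example_manifest = (
    "id\tfilename\tmd5\tsize\tstate\n"
    "file-id-1\trnaseq_input.txt\t0000\t10\treleased\n"
    "file-id-2\trnaseq_output.txt\t1111\t20\treleased\n"
)

file_ids = read_manifest_ids(example_manifest)
command = build_dtt_command("manifest.txt", download_dir="downloads")
```

The resulting `command` list can be passed to `subprocess.run(command, check=True)` to start the transfer. Listing the IDs first is a cheap way to confirm the manifest contains the expected files before downloading.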

Controlled Access Data

  1. Download the appropriate manifest file from the publication page
  2. Download a token from the GDC Data Portal
  3. Use the manifest file and token to download data using the GDC DTT or the GDC API
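For the API route, a controlled-access request might be prepared along these lines. This is a sketch under stated assumptions: that files are fetched one at a time from the GDC API `data` endpoint at `https://api.gdc.cancer.gov/data/<file id>`, and that the token is sent in an `X-Auth-Token` request header. The helper names and the placeholder file ID and token are hypothetical, and no network call is made here.

```python
import urllib.request

# Assumed base URL of the GDC API data endpoint.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def load_token(token_text):
    """Strip surrounding whitespace from the contents of a token file."""
    return token_text.strip()


def build_data_request(file_id, token):
    """Prepare (but do not send) an authenticated download request.

    The token travels in the X-Auth-Token header; the file ID comes
    from the "id" column of the manifest.
    """
    url = f"{GDC_DATA_ENDPOINT}/{file_id}"
    return urllib.request.Request(url, headers={"X-Auth-Token": token})


# Placeholder values for illustration only.
request = build_data_request("file-id-1", load_token("  example-token\n"))
```

Passing `request` to `urllib.request.urlopen` and writing the response bytes to disk would complete the download. With the DTT instead of the API, the token file is supplied on the command line, for example `gdc-client download -m manifest.txt -t token.txt`. Tokens grant access to controlled data, so keep the token file out of shared directories and version control.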

For assistance, please contact the GDC Help Desk: support@nci-gdc.datacommons.io.