Cell. Volume 173, Issue 2, p291–304.e6, 5 April 2018. DOI: 10.1016/j.cell.2018.03.022
We conducted comprehensive integrative molecular analyses of the complete set of tumors in The Cancer Genome Atlas (TCGA), consisting of approximately 10,000 specimens representing 33 types of cancer. We performed molecular clustering on chromosome arm-level aneuploidy, DNA hypermethylation, mRNA and miRNA expression levels and reverse phase protein arrays individually, of which all, except for aneuploidy, revealed clustering organized primarily by histology, tissue type, or anatomic origin. The influence of cell type was evident in DNA methylation-based clustering, even after exclusion of sites with known preexisting tissue-type-specific methylation. Integrative clustering further emphasized the dominant role of cell-of-origin patterns. Molecular similarities among histologically or anatomically-related cancer types provide a basis for focused pan-cancer analyses, such as Pan-Gastrointestinal, Pan-Gynecological, Pan-Kidney, Pan-Squamous cancers, and those related by stemness features, which in turn may inform treatment strategies.
Data in the GDC
- GDC Manifests
  - Open-Access Data - Download Manifest (24 Files)
  - Controlled-Access Data - Download Manifest (2 Files)
Supplemental Data
- Sample Annotations
  - Analyte level annotations - merged_sample_quality_annotations.tsv
- Mutation Files
  - Controlled mutation annotation file - mc3.v0.2.8.CONTROLLED.maf.gz
  - Public mutation annotation file - mc3.v0.2.8.PUBLIC.maf.gz
  - ABSOLUTE-annotated MAF file - TCGA_consolidated.abs_mafs_truncated.fixed.txt.gz
  - Molecular Signatures - tcga_pancancer_082115.vep.filter_whitelisted.context.maf.signatures.txt
  - Mutation Load - mutation-load-updated.txt
- DNA copy number Files
  - SNP6 whitelisted copy number segments file - broad.mit.edu_PANCAN_Genome_Wide_SNP_6_whitelisted.seg
  - GISTIC2.0 all_thresholded.by_genes file - all_thresholded.by_genes_whitelisted.tsv
  - GISTIC2.0 all_data_by_genes file - all_data_by_genes_whitelisted.tsv
  - ISAR-corrected SNP6 whitelisted copy number segments file - ISAR_corrected.PANCAN_Genome_Wide_SNP_6_whitelisted.seg
  - gzipped ISAR-corrected GISTIC2.0 all_thresholded.by_genes file - ISAR_GISTIC.all_thresholded.by_genes.txt
  - gzipped ISAR-corrected GISTIC2.0 all_data_by_genes file - ISAR_GISTIC.all_data_by_genes.txt.gz
  - ABSOLUTE-annotated seg file - TCGA_mastercalls.abs_segtabs.fixed.txt
  - ABSOLUTE purity/ploidy file - TCGA_mastercalls.abs_tables_JSedit.fixed.txt
  - Aneuploidy scores and arm calls file - PANCAN_ArmCallsAndAneuploidyScore_092817.txt
- DNA Methylation Files
  - Between-platform normalization for DNA Methylation data - DNA methylation (Merged 27K+450K) Beta Value - jhu-usc.edu_PANCAN_merged_HumanMethylation27_HumanMethylation450.betaValue_whitelisted.tsv
  - DNA methylation 450K only beta value data matrix - jhu-usc.edu_PANCAN_HumanMethylation450.betaValue_whitelisted.tsv
  - Leukocyte score - TCGA_all_leuk_estimate.masked.20170107.tsv
- RNA and Protein Files
  - RNA batch corrected matrix - EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv
  - miRNA batch corrected matrix - pancanMiRs_EBadjOnProtocolPlatformWithoutRepsWithUnCorrectMiRs_08_04_16.csv
  - miRNA sample information - PanCanAtlas_miRNA_sample_information_list.txt
  - RPPA batch corrected matrix - TCGA-RPPA-pancan-clean.txt
- Other Files
  - PARADIGM Pathway Inference Matrix - merge_merged_reals.tar.gz
  - DNA methylation Stemness signatures (lists of probes and genes) - DNAmethylation and RNAexpression Stemness Signatures.xlsx
  - DNA methylation and RNA stemness scores - SupplementalTable_S1.xlsx
  - iCluster input features - pancan33.iCluster.features.csv
Additional Resources
- Broad Institute FireCloud, The Broad Institute
- cBioPortal for Cancer Genomics, Memorial Sloan-Kettering Cancer Center
- PanCanAtlas Additional Files
Instructions for Data Download
Open Access Data
- Download the appropriate manifest file from the publication page
- Use the manifest file to download data using the GDC Data Transfer Tool (DTT) or the GDC API
  - GDC DTT (Download, User’s Guide)
  - GDC API (User’s Guide)
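The open-access steps above can be sketched in Python against the GDC API. This is a minimal sketch under stated assumptions, not the GDC's own tooling: it assumes the manifest is a tab-separated file with `id`, `filename`, and `md5` columns, and that files are served from `https://api.gdc.cancer.gov/data/<id>`. The helper names and the sample manifest row are hypothetical.

```python
import csv
import hashlib
import io
import urllib.request

# Assumed base URL of the GDC API data endpoint.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def read_manifest(text):
    """Parse tab-separated manifest text into a list of row dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row.get("id")]


def download_url(file_id):
    """Build the download URL for one file UUID."""
    return f"{GDC_DATA_ENDPOINT}/{file_id}"


def md5_matches(data, expected_md5):
    """Check downloaded bytes against the md5 column of the manifest."""
    return hashlib.md5(data).hexdigest() == expected_md5


def fetch(row):
    """Download one manifest row and verify its checksum (needs network access)."""
    with urllib.request.urlopen(download_url(row["id"])) as response:
        data = response.read()
    if not md5_matches(data, row["md5"]):
        raise ValueError(f"checksum mismatch for {row['filename']}")
    with open(row["filename"], "wb") as out:
        out.write(data)


# Hypothetical manifest content; real manifests come from the publication page.
EXAMPLE_MANIFEST = (
    "id\tfilename\tmd5\tsize\tstate\n"
    "00000000-0000-0000-0000-000000000001\texample_a.tsv\t"
    "d41d8cd98f00b204e9800998ecf8427e\t0\treleased\n"
)

rows = read_manifest(EXAMPLE_MANIFEST)
print(download_url(rows[0]["id"]))
```

Calling `fetch(row)` for each parsed row would perform the actual downloads. Because this sketch reads each file fully into memory, the GDC DTT is likely the better fit for the larger matrices.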
Controlled Access Data
- Download the appropriate manifest file from the publication page
- Download a token from the GDC Data Portal
  - GDC Data Portal (Launch, User’s Guide)
- Use the manifest file and token to download data using the GDC DTT or the GDC API
  - GDC DTT (Download, User’s Guide)
  - GDC API (User’s Guide)
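For controlled-access files, the same API request additionally needs the token. The sketch below assumes the GDC API accepts the token in an `X-Auth-Token` request header and that the token was saved from the GDC Data Portal as a plain-text file. The helper names, the UUID, and the token value are hypothetical placeholders.

```python
import urllib.request

# Assumed base URL of the GDC API data endpoint.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def read_token(path):
    """Read the token downloaded from the GDC Data Portal, dropping whitespace."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().strip()


def authorized_request(file_id, token):
    """Build a request for one file UUID that carries the token in a header."""
    return urllib.request.Request(
        f"{GDC_DATA_ENDPOINT}/{file_id}",
        headers={"X-Auth-Token": token},
    )


# Hypothetical UUID and token value, used only to show the request shape.
request = authorized_request("00000000-0000-0000-0000-000000000002", "example-token")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would start the download. In practice the token would come from `read_token` rather than a literal string, and the token file should be kept out of shared directories and version control.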
For assistance, please contact the GDC Help Desk: support@nci-gdc.datacommons.io.