Cancer Cell. Volume 37, Issue 5, p. 639-654.e6, 11 May 2020. DOI: 10.1016/j.ccell.2020.04.012
Summary
We evaluated ancestry effects on mutation rates, DNA methylation, and mRNA and miRNA expression among 10,678 patients across 33 cancer types from The Cancer Genome Atlas. We demonstrated that cancer subtypes and ancestry-related technical artifacts are important confounders that have been insufficiently accounted for. Once these confounders were accounted for, ancestry-associated differences spanned all molecular features and hundreds of genes. Biologically significant differences were usually tissue-specific but not specific to cancer. However, admixture and pathway analyses suggested that some of these differences are causally related to cancer. Specific findings included increased FBXW7 mutations in patients of African origin, decreased VHL and PBRM1 mutations in renal cancer patients of African origin, and decreased immune activity in bladder cancer patients of East Asian origin.
Significance
We conducted the most comprehensive analysis to date of the molecular effects of ancestry across cancer and normal tissues. We found that, though many ancestry effects were shared by normal tissues, they were profoundly tissue-specific, suggesting that ancestry effects have to be considered primarily on a per-tissue basis among both cancers and non-cancer tissues. In tissue-specific analyses of normal tissue especially, more samples from diverse ancestries are required for comprehensive ancestry analyses, and we identified important controls for confounders and artifacts that need to be applied in such studies. Differences between African, European, and East Asian groups in renal and bladder cancers in particular suggest that ancestry should be taken into account when considering routes to disease and response to immunotherapies.
Data in the GDC
- GDC Manifests
- Open Access Data - Download Manifest (16 Files)
- Controlled Access Data - Download Manifest (25 Files)
Supplemental Data Files
- TCGA QC HRC Imputed Genotyping Data used by the AIM AWG (from Sayaman et al.)
- Information on composition of genotyping files - READ_ME.txt
- File mapping TCGA Patient ID to corresponding Birdseed genotyping files - Map_TCGAPatientID_BirdseedFileID.txt
- QC Unimputed Genotyping Data
- Read me for quality-controlled unimputed genotyping data - READ_ME_1.txt
- Quality-controlled unimputed genotyping data plink files - QC_Unimputed_plink.zip
- HRC Stranded Genotyping Data
- Read me for HRC Stranded Genotyping data - READ_ME_2.txt
- HRC Stranded Genotyping data vcf files - HRC_Stranded_vcf.zip
- 1000G Stranded Genotyping Data
- Read me for 1000G Stranded Genotyping data - READ_ME_3.txt
- 1000G Stranded Genotyping data vcf files - 1000G_Stranded_vcf.zip
- HRC Imputed Genotyping Data
- HRC imputed genotyping data for chromosome 1 - chr_1.zip
- HRC imputed genotyping data for chromosome 2 - chr_2.zip
- HRC imputed genotyping data for chromosome 3 - chr_3.zip
- HRC imputed genotyping data for chromosome 4 - chr_4.zip
- HRC imputed genotyping data for chromosome 5 - chr_5.zip
- HRC imputed genotyping data for chromosome 6 - chr_6.zip
- HRC imputed genotyping data for chromosome 7 - chr_7.zip
- HRC imputed genotyping data for chromosome 8 - chr_8.zip
- HRC imputed genotyping data for chromosome 9 - chr_9.zip
- HRC imputed genotyping data for chromosome 10 - chr_10.zip
- HRC imputed genotyping data for chromosome 11 - chr_11.zip
- HRC imputed genotyping data for chromosome 12 - chr_12.zip
- HRC imputed genotyping data for chromosome 13 - chr_13.zip
- HRC imputed genotyping data for chromosome 14 - chr_14.zip
- HRC imputed genotyping data for chromosome 15 - chr_15.zip
- HRC imputed genotyping data for chromosome 16 - chr_16.zip
- HRC imputed genotyping data for chromosome 17 - chr_17.zip
- HRC imputed genotyping data for chromosome 18 - chr_18.zip
- HRC imputed genotyping data for chromosome 19 - chr_19.zip
- HRC imputed genotyping data for chromosome 20 - chr_20.zip
- HRC imputed genotyping data for chromosome 21 - chr_21.zip
- HRC imputed genotyping data for chromosome 22 - chr_22.zip
- Other Files
- miRNA information - miRNA_information.xlsx
- PCA analysis from UCSF - GDC_Publication_Page_Figure_S1.pdf
- PCA analysis from UCSF - GDC_Publication_Page_Figure_S2.pdf
- PCA analysis from University of Trento - GDC_Publication_Page_Figure_S3.pdf
- Odd vs even chromosome analysis from University of Trento - GDC_Publication_Page_Figure_S4.pdf
- Local Ancestry Calls from Admixed samples - local_ancestry_calls.tar.gz
- Principal Component Analysis - UCSF - UCSF_Ancestry_Calls.csv
- Whole exome ancestry calls - University of Trento - WES_ethnicity_calls_Trento_Cornell_20170526.txt
- Principal Component Analysis - WashU - WashU_PCA_ethnicity_assigned.tsv
- Principal Component Analysis - Broad - Broad_ancestry_PCA.txt
- Admix percent by sample - Admixture_by_sample.txt
Additional Resources
Instructions for Data Download
Open Access Data
- Download the appropriate manifest file from the publication page
- Use the manifest file to download data using the GDC Data Transfer Tool (DTT) or the GDC API
- GDC DTT (Download, User's Guide)
- GDC API (User's Guide)
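The open-access steps above can be sketched in Python for the GDC API route. This is a minimal illustration, not the GDC's own tooling. It assumes the manifest is a tab-separated file with `id`, `filename`, `md5`, `size`, and `state` columns and that files are served from `https://api.gdc.cancer.gov/data/<id>`. The file IDs and names in `EXAMPLE_MANIFEST` are hypothetical placeholders, not entries from this publication's manifest.

```python
import csv
import io
import os
import shutil
import urllib.request

# Assumed base URL of the GDC API data endpoint.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"

# Hypothetical manifest content, for illustration only.
EXAMPLE_MANIFEST = (
    "id\tfilename\tmd5\tsize\tstate\n"
    "00000000-0000-0000-0000-000000000001\texample_a.txt\t"
    "d41d8cd98f00b204e9800998ecf8427e\t0\treleased\n"
    "00000000-0000-0000-0000-000000000002\texample_b.txt\t"
    "d41d8cd98f00b204e9800998ecf8427e\t0\treleased\n"
)


def read_manifest(text):
    """Parse tab-separated manifest text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def data_url(file_id):
    """Build the download URL for a single file ID."""
    return f"{GDC_DATA_ENDPOINT}/{file_id}"


def download_open_file(row, dest_dir="."):
    """Download one open-access file named in a manifest row.

    This performs a network request, so it is defined here but not called.
    """
    target = os.path.join(dest_dir, row["filename"])
    with urllib.request.urlopen(data_url(row["id"])) as resp, open(target, "wb") as out:
        shutil.copyfileobj(resp, out)
    return target


rows = read_manifest(EXAMPLE_MANIFEST)
for row in rows:
    print(row["filename"], data_url(row["id"]))
```

To use it with a real manifest, read the downloaded manifest file into a string, pass it to `read_manifest`, and call `download_open_file` on each row. For large files, the GDC Data Transfer Tool is generally the more robust choice.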
Controlled Access Data
- Download the appropriate manifest file from the publication page
- Download a token from the GDC Data Portal
- GDC Data Portal (Launch, User's Guide)
- Use the manifest file and token to download data using the GDC DTT or the GDC API
- GDC DTT (Download, User's Guide)
- GDC API (User's Guide)
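For controlled-access files, the only difference on the API route is that the token accompanies each request. The sketch below assumes the token is sent in an `X-Auth-Token` header and that the manifest's `md5` column can be used to check a finished download. The helper names (`build_controlled_request`, `md5_matches`), the file ID, and the token value are hypothetical and chosen for illustration.

```python
import hashlib
import urllib.request

# Assumed base URL of the GDC API data endpoint.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def build_controlled_request(file_id, token):
    """Prepare (but do not send) an authenticated request for one file.

    The token text is stripped because token files often end with a newline.
    """
    return urllib.request.Request(
        f"{GDC_DATA_ENDPOINT}/{file_id}",
        headers={"X-Auth-Token": token.strip()},
    )


def md5_matches(data, expected_md5):
    """Compare the MD5 of downloaded bytes with the manifest's md5 value."""
    return hashlib.md5(data).hexdigest() == expected_md5.strip().lower()


# Hypothetical file ID and token, for illustration only.
request = build_controlled_request(
    "00000000-0000-0000-0000-000000000001", "example-token\n"
)
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the download. Keep the token file private and never commit it to version control, since it grants access to controlled data.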
For assistance, please contact the GDC Help Desk: support@nci-gdc.datacommons.io.