The chromatin accessibility landscape of primary human cancers

Science, Volume 362, Issue 6413, eaav1898, 25 October 2018. DOI: 10.1126/science.aav1898

We present the genome-wide chromatin accessibility profiles of 410 tumor samples spanning 23 cancer types from The Cancer Genome Atlas. We identify 562,709 transposase-accessible DNA elements that substantially extend the compendium of known cis-regulatory elements. Integration of ATAC-seq with TCGA multi-omic data identifies a large number of putative distal enhancers that distinguish molecular subtypes of cancers, uncovers specific driving transcription factors via protein-DNA footprints, and nominates long-range gene-regulatory interactions in cancer. These data reveal genetic risk loci of cancer predisposition as active DNA regulatory elements in cancer, identify gene-regulatory interactions underlying cancer immune evasion, and pinpoint noncoding mutations that drive enhancer activation and may impact patient survival. These results suggest a systematic approach to understand the noncoding genome in cancer to advance diagnosis and therapy.

Data in the GDC

Views of the Data

The ATAC-seq peak accessibility and computed peak-to-gene linkage predictions are publicly available for interactive visualization and exploration at the UCSC Xena Browser (https://atacseq.xenahubs.net).

Supplemental Data Files

    • Data S1. Cancer types studied, donor characteristics, and sequencing statistics. [xlsx]
    • Data S2. Pan-cancer and breast cancer peak calls. [xlsx]
    • Data S3. Overlap of peaks with Roadmap DNase-seq, peak saturation analysis, and t-SNE positions of all samples. [xlsx]
    • Data S4. Distal binarization analysis and enrichment of motifs in cluster-specific peak sets. [xlsx]
    • Data S5. GWAS and eQTL analyses and overlap with peak-to-gene links. [xlsx]
    • Data S6. TF footprinting analyses and correlation to gene expression. [xlsx]
    • Data S7. Pan-cancer and breast cancer-specific peak-to-gene links and enhancer-to-gene links. [xlsx]
    • Data S8. ELMER and Regulon analyses. [xlsx]
    • Data S9. Peak-to-gene links related to immune response in cancer. [xlsx]
    • Data S10. Open Access. Integration of ATAC-seq and WGS to identify noncoding mutations. Contains mutation positions only; base changes are not provided. [xlsx]
    • Data S10. Controlled Access. Integration of ATAC-seq and WGS to identify noncoding mutations. Contains mutation positions and base changes. [xlsx]
    • Protocol S1. Processing of frozen tissue fragments for ATAC-seq. [pdf]

Other Supplemental Data Files

  • ATAC-seq Counts Matrices
    • README to facilitate usage of count matrices. [TXT]
    • Normalized ATAC-seq insertion counts within the pan-cancer peak set. Recommended format. [RDS]
    • Normalized ATAC-seq insertion counts within the pan-cancer peak set. Not recommended due to size. [TXT]
    • Raw ATAC-seq insertion counts within the pan-cancer peak set. [RDS]
    • Raw ATAC-seq insertion counts within the pan-cancer peak set. [TXT]
    • All cancer type-specific count matrices in normalized counts. [ZIP]
    • All cancer type-specific count matrices in raw counts. [ZIP]
  • ATAC-seq Peak Calls
    • README to facilitate usage of peak call bed files. [TXT]
    • All cancer type-specific peak sets. [ZIP]
    • Pan-cancer peak set. [TXT]
    • Lookup table for various TCGA sample identifiers. [TXT]
  • BigWig Files for All Samples
    • README to facilitate usage of bigWig files. [TXT]
    • Normalized bigWig files for the ACC cohort. [TAR-GZ]
    • Normalized bigWig files for the BLCA cohort. [TAR-GZ]
    • Normalized bigWig files for the BRCA cohort. [TAR-GZ]
    • Normalized bigWig files for the CESC cohort. [TAR-GZ]
    • Normalized bigWig files for the CHOL cohort. [TAR-GZ]
    • Normalized bigWig files for the COAD cohort. [TAR-GZ]
    • Normalized bigWig files for the ESCA cohort. [TAR-GZ]
    • Normalized bigWig files for the GBM cohort. [TAR-GZ]
    • Normalized bigWig files for the HNSC cohort. [TAR-GZ]
    • Normalized bigWig files for the KIRC cohort. [TAR-GZ]
    • Normalized bigWig files for the KIRP cohort. [TAR-GZ]
    • Normalized bigWig files for the LGG cohort. [TAR-GZ]
    • Normalized bigWig files for the LIHC cohort. [TAR-GZ]
    • Normalized bigWig files for the LUAD cohort. [TAR-GZ]
    • Normalized bigWig files for the LUSC cohort. [TAR-GZ]
    • Normalized bigWig files for the MESO cohort. [TAR-GZ]
    • Normalized bigWig files for the PCPG cohort. [TAR-GZ]
    • Normalized bigWig files for the PRAD cohort. [TAR-GZ]
    • Normalized bigWig files for the SKCM cohort. [TAR-GZ]
    • Normalized bigWig files for the STAD cohort. [TAR-GZ]
    • Normalized bigWig files for the TGCT cohort. [TAR-GZ]
    • Normalized bigWig files for the THCA cohort. [TAR-GZ]
    • Normalized bigWig files for the UCEC cohort. [TAR-GZ]

Analysis Code

    • Code for ELMER probe-to-gene analysis. [HTML]

Additional Resources

Instructions for Data Download

Open Access Data

  1. Download the appropriate manifest file from the publication page
  2. Use the manifest file to download data using the GDC Data Transfer Tool (DTT) or the GDC API
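The two steps above can be sketched in Python for the API route. This is a minimal illustration, not the GDC's own tooling: it assumes the manifest is the usual tab-separated GDC manifest with `id` and `filename` columns, and that each file is retrieved from the public `https://api.gdc.cancer.gov/data/<id>` endpoint. The helper names (`read_manifest`, `download_urls`) and the example manifest row are hypothetical placeholders.

```python
import csv
import io

# Public GDC API data endpoint; a file is addressed by its UUID.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def read_manifest(text):
    """Parse a tab-separated manifest with a header row into a list of dicts.

    Assumes the manifest has at least 'id' and 'filename' columns.
    Rows without an id are skipped.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [row for row in reader if row.get("id")]


def download_urls(rows):
    """Return (filename, download URL) pairs, one per manifest row."""
    return [(row["filename"], f"{GDC_DATA_ENDPOINT}/{row['id']}") for row in rows]


# Hypothetical manifest content: the UUID and filename are placeholders,
# not real files from this publication.
EXAMPLE_MANIFEST = (
    "id\tfilename\tmd5\tsize\tstate\n"
    "00000000-0000-0000-0000-000000000000\texample_file.txt\t"
    "d41d8cd98f00b204e9800998ecf8427e\t0\treleased\n"
)

if __name__ == "__main__":
    for name, url in download_urls(read_manifest(EXAMPLE_MANIFEST)):
        print(name, url)
```

With the DTT instead of the API, the same manifest is passed on the command line, for example `gdc-client download -m <manifest file>`.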

Controlled Access Data

  1. Download the appropriate manifest file from the publication page
  2. Download a token from the GDC Data Portal
  3. Use the manifest file and token to download data using the GDC DTT or the GDC API
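For controlled access files, the API route differs from the open access case mainly in that the token is sent with each request. The sketch below assumes the token downloaded from the GDC Data Portal is passed in the `X-Auth-Token` request header, and that files are served from `https://api.gdc.cancer.gov/data/<id>`. The function names and the file id are hypothetical, and `fetch` needs network access and a valid token to do anything useful.

```python
import shutil
import urllib.request

# Public GDC API data endpoint; a file is addressed by its UUID.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def build_request(file_id, token=None):
    """Build a download request, attaching the token when one is given."""
    request = urllib.request.Request(f"{GDC_DATA_ENDPOINT}/{file_id}")
    if token:
        # Token files often end with a newline; strip it before use.
        request.add_header("X-Auth-Token", token.strip())
    return request


def fetch(file_id, dest_path, token=None):
    """Stream one file to disk. Requires network access."""
    with urllib.request.urlopen(build_request(file_id, token)) as response:
        with open(dest_path, "wb") as out:
            shutil.copyfileobj(response, out)


if __name__ == "__main__":
    # Placeholder id and token; only the request is built, nothing is downloaded.
    req = build_request("00000000-0000-0000-0000-000000000000", "PLACEHOLDER_TOKEN\n")
    print(req.full_url)
```

With the DTT, the token file is supplied alongside the manifest, for example `gdc-client download -m <manifest file> -t <token file>`. Tokens are personal credentials and should not be committed to scripts or shared.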

For assistance, please contact the GDC Help Desk: support@nci-gdc.datacommons.io.