Single-cell chromatin accessibility landscapes reveal nucleotide-resolved malignant regulatory programs in primary human cancers

Citation TBD.

To identify gene regulatory changes associated with malignancy, we generated a single-cell atlas of chromatin accessibility landscapes of cancer from 74 samples comprising 227,063 nuclei across eight tumor types as part of The Cancer Genome Atlas (TCGA). Chromatin accessibility landscapes in cancer are strongly influenced by copy number alterations that can also be used to identify subclones, yet underlying cis-regulatory landscapes retain strong cancer type-specific features. Using data from organ-matched healthy tissues, we identify the nearest-healthy cell types in diverse cancers, demonstrating that the epigenetic signature of basal-like subtype breast cancer is most similar to secretory-type luminal epithelial cells. Neural network models trained to learn regulatory programs in cancer revealed enrichment of model-prioritized somatic non-coding mutations near cancer-associated genes, suggesting that dispersed, non-recurrent non-coding mutations in cancer are functional. Overall, these data and interpretable gene regulatory models for cancer and healthy tissue provide a framework for understanding cancer-specific gene regulation.

Data in the GDC

Supplemental Data

Instructions for Data Download

Open Access Data

  1. Download the appropriate manifest file from the publication page
  2. Use the manifest file to download data using the GDC Data Transfer Tool (DTT) or the GDC API
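As a rough illustration of step 2 via the GDC API, the sketch below reads a manifest and fetches each open-access file by its identifier. It assumes the manifest is a tab-separated file with `id`, `filename`, and `md5` columns, and that files are served from `https://api.gdc.cancer.gov/data/<id>`; check both against the current GDC documentation. The function names are hypothetical. Each file is held in memory, so for large files the DTT is likely the better choice.

```python
import csv
import hashlib
import io
import urllib.request

# Assumed public GDC data endpoint; verify against the GDC API documentation.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def read_manifest(text):
    """Parse tab-separated manifest text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def md5_matches(data, expected_md5):
    """Return True if the MD5 checksum of `data` equals the manifest value."""
    return hashlib.md5(data).hexdigest() == expected_md5.strip().lower()


def download_open_file(file_id):
    """Fetch one open-access file by identifier (no token required)."""
    with urllib.request.urlopen(f"{GDC_DATA_ENDPOINT}/{file_id}") as response:
        return response.read()


def download_manifest(manifest_path):
    """Download every file listed in the manifest and verify its checksum."""
    with open(manifest_path, newline="") as handle:
        rows = read_manifest(handle.read())
    for row in rows:
        data = download_open_file(row["id"])
        if not md5_matches(data, row["md5"]):
            raise ValueError(f"Checksum mismatch for {row['filename']}")
        with open(row["filename"], "wb") as out:
            out.write(data)
```

Calling `download_manifest("manifest.txt")` (a placeholder file name) would write each listed file to the working directory under the name given in the manifest.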

Controlled Access Data

  1. Download the appropriate manifest file from the publication page
  2. Download a token from the GDC Data Portal
  3. Use the manifest file and token to download data using the GDC DTT or the GDC API
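For controlled-access files, the request additionally carries the token from step 2. The sketch below only builds such a request; it assumes the GDC API accepts a JSON body of the form `{"ids": [...]}` posted to `https://api.gdc.cancer.gov/data` and reads the token from an `X-Auth-Token` header, which should be confirmed in the GDC API documentation. The function name is hypothetical. With the DTT, the equivalent is believed to be `gdc-client download -m <manifest> -t <token file>`.

```python
import json
import urllib.request

# Assumed GDC data endpoint; verify against the GDC API documentation.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def build_controlled_request(file_ids, token):
    """Build a POST request for the given file identifiers with a token.

    `token` is the text of the token file downloaded from the GDC Data
    Portal; surrounding whitespace is stripped before use.
    """
    payload = json.dumps({"ids": list(file_ids)}).encode("utf-8")
    return urllib.request.Request(
        GDC_DATA_ENDPOINT,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token.strip(),  # assumed header name
        },
        method="POST",
    )
```

The resulting request can be passed to `urllib.request.urlopen` and the response body written to disk. Keep the token out of source code and version control, since it grants access to controlled data.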

For assistance, please contact the GDC Help Desk: support@nci-gdc.datacommons.io.