Science. Volume 385, Number 6713, eadk9217, September 6, 2024. DOI: 10.1126/science.adk9217
To identify cancer-associated gene regulatory changes, we generated single-cell chromatin accessibility landscapes across eight tumor types as part of The Cancer Genome Atlas. Tumor chromatin accessibility is strongly influenced by copy number alterations that can be used to identify subclones, yet underlying cis-regulatory landscapes retain cancer type–specific features. Using organ-matched healthy tissues, we identified the “nearest-healthy” cell types in diverse cancers, demonstrating that the chromatin signature of basal-like–subtype breast cancer is most similar to secretory-type luminal epithelial cells. Neural network models trained to learn regulatory programs in cancer revealed enrichment of model-prioritized somatic noncoding mutations near cancer-associated genes, suggesting that dispersed, nonrecurrent, noncoding mutations in cancer are functional. Overall, these data and interpretable gene regulatory models for cancer and healthy tissue provide a framework for understanding cancer-specific gene regulation.
Data in the GDC
- GDC Supplemental Manifests
- Open-Access Data - Download Manifest (82 Files)
- Controlled-Access Data - Download Manifest (1 File)
- scATAC-Seq Data (fastqs; controlled) - Download Manifest (1,062 Files)
- scATAC-Seq Data (bams; controlled) - Download Manifest (148 Files)
- GDC Harmonized Manifests
- Associated WGS - Controlled - Download Manifest (326 Files)
- Associated WGS - Open - Download Manifest (82 Files)
Supplemental Data
- Cancer_scATACseq_data.tar.gz - Cancer scATACseq data
- Healthy_scATACseq_data.tar.gz - Healthy scATACseq data
- ASCAT_CNV_Calls.tar.gz - ASCAT CNV Calls
- Access_controlled_Supplemental_Table.tar.gz - Access controlled Supplemental Table
- Denoising_Autoencoder_Weights.tar.gz - Denoising Autoencoder Weights
- Breast_cleaned_combined_peakset.tar.gz - Breast cleaned motif instances on combined cancer and healthy peakset
- Breast_cleaned_Cancer_peakset.tar.gz - Breast cleaned motif instances on individual cancer peakset
- Cancer_cleaned_Cancer_peakset.tar.gz - Cancer cleaned motif instances on cancer-specific peakset
- GBM_cleaned_gbm_subclone_peakset.tar.gz - GBM subclone cleaned motif instances on GBM subclone peakset
- BLCA_peaksets.tar.gz - BLCA peak sets
- BRCA_peaksets.tar.gz - BRCA peak sets
- Individual_BRCA_sample_cleaned_motifs.tar.gz - Individual BRCA sample cleaned motifs
- BRCA_HEALTHY_cleaned_motifs_Figure_4.tar.gz - BRCA healthy cleaned motifs
- GBM_subclone_motifs.tar.gz - GBM subclone motifs
- Healthy_nearest_neighbor_peaks_for_LDSC.tar.gz - Healthy nearest neighbor peaks for LDSC
- gencodev39_cage_ratio_to_sum_refined_tss_positions
- negative_sampling_model_training_250_1364_fig2.tsv - Negative sampling model training
- tcga_canceronly_top50klogtfidf_221011.h5ad
- Cancer_Type_Peaks.tar.gz - Cancer Type Peaks
- GBM_Sub_Clone_Peaks.tar.gz - GBM Sub Clone Peaks
- Immune_Cancer_Peaks.tar.gz - Immune Cancer Peaks
- KIRC_KIRP_peaksets.tar.gz - KIRC KIRP peaksets
- Kidney_cancer_control_peaksets.tar.gz - Kidney cancer control peaksets
- LUAD_peaksets.tar.gz - LUAD peaksets
- Lung_cancer_control_peaksets.tar.gz - Lung cancer control peaksets
- Seq2ATAC NN model weights
- Bladder_Healthy_tissue_Seq2ATAC_NN_model_weights.tar.gz
- BLCA1_Seq2ATAC_NN_model_weights.tar.gz
- BLCA2_Seq2ATAC_NN_model_weights.tar.gz
- BLCA3_Seq2ATAC_NN_model_weights.tar.gz
- BLCA4_Seq2ATAC_NN_model_weights.tar.gz
- BLCA5_Seq2ATAC_NN_model_weights.tar.gz
- BLCA6_Seq2ATAC_NN_model_weights.tar.gz
- BLCA7_Seq2ATAC_NN_model_weights.tar.gz
- BLCA8_Seq2ATAC_NN_model_weights.tar.gz
- BLCA9_Seq2ATAC_NN_model_weights.tar.gz
- Breast_Healthy_Seq2ATAC_NN_model_weights.tar.gz
- BRCA10_Seq2ATAC_NN_model_weights.tar.gz
- BRCA11_Seq2ATAC_NN_model_weights.tar.gz
- BRCA12_Seq2ATAC_NN_model_weights.tar.gz
- BRCA13_Seq2ATAC_NN_model_weights.tar.gz
- BRCA14_Seq2ATAC_NN_model_weights.tar.gz
- BRCA15_Seq2ATAC_NN_model_weights.tar.gz
- BRCA16_Seq2ATAC_NN_model_weights.tar.gz
- BRCA17_Seq2ATAC_NN_model_weights.tar.gz
- BRCA18_Seq2ATAC_NN_model_weights.tar.gz
- BRCA19_Seq2ATAC_NN_model_weights.tar.gz
- BRCA20_Seq2ATAC_NN_model_weights.tar.gz
- BRCA21_Seq2ATAC_NN_model_weights.tar.gz
- BRCA22_Seq2ATAC_NN_model_weights.tar.gz
- BRCA23_Seq2ATAC_NN_model_weights.tar.gz
- BRCA24_Seq2ATAC_NN_model_weights.tar.gz
- BRCA25_Seq2ATAC_NN_model_weights.tar.gz
- BLCA_Seq2ATAC_NN_model_weights.tar.gz
- BRCA_Seq2ATAC_NN_model_weights.tar.gz
- COAD_Seq2ATAC_NN_model_weights.tar.gz
- GBM_Seq2ATAC_NN_model_weights.tar.gz
- LUAD_Seq2ATAC_NN_model_weights.tar.gz
- KIRC_Seq2ATAC_NN_model_weights.tar.gz
- KIRP_Seq2ATAC_NN_model_weights.tar.gz
- SKCM_Seq2ATAC_NN_model_weights.tar.gz
- GBM39_SubCloneA_Seq2ATAC_NN_model_weights.tar.gz
- GBM39_SubCloneB_Seq2ATAC_NN_model_weights.tar.gz
- GBM45_SubCloneA_Seq2ATAC_NN_model_weights.tar.gz
- GBM45_SubCloneB_Seq2ATAC_NN_model_weights.tar.gz
- KIRC47_Seq2ATAC_NN_model_weights.tar.gz
- KIRC48_Seq2ATAC_NN_model_weights.tar.gz
- KIRC49_Seq2ATAC_NN_model_weights.tar.gz
- KIRC50_Seq2ATAC_NN_model_weights.tar.gz
- KIRP51_Seq2ATAC_NN_model_weights.tar.gz
- KIRP52_Seq2ATAC_NN_model_weights.tar.gz
- KIRP53_Seq2ATAC_NN_model_weights.tar.gz
- KIRP54_Seq2ATAC_NN_model_weights.tar.gz
- LUAD55_Seq2ATAC_NN_model_weights.tar.gz
- LUAD56_Seq2ATAC_NN_model_weights.tar.gz
- LUAD57_Seq2ATAC_NN_model_weights.tar.gz
- LUAD59_Seq2ATAC_NN_model_weights.tar.gz
- LUAD60_Seq2ATAC_NN_model_weights.tar.gz
- LUAD61_Seq2ATAC_NN_model_weights.tar.gz
- LUAD62_Seq2ATAC_NN_model_weights.tar.gz
- LUAD63_Seq2ATAC_NN_model_weights.tar.gz
- LUAD64_Seq2ATAC_NN_model_weights.tar.gz
- LUAD65_Seq2ATAC_NN_model_weights.tar.gz
Instructions for Data Download
Open Access Data
- Download the appropriate manifest file from the publication page
- Use the manifest file to download data using the GDC Data Transfer Tool (DTT) or the GDC API
- GDC DTT (Download, User’s Guide)
- GDC API (User’s Guide)
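As a rough illustration of the open-access steps above, the sketch below parses a manifest and fetches each listed file through the GDC API using only the Python standard library. It assumes the manifest is a tab-separated file with `id`, `filename`, `md5`, `size`, and `state` columns, and that files are served from `https://api.gdc.cancer.gov/data/<id>`; both should be checked against the GDC API User's Guide before use. The function names are illustrative and are not part of any GDC tooling.

```python
import csv
import hashlib
import io
import urllib.request

# Assumed base URL of the GDC API data endpoint (verify in the User's Guide).
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def parse_manifest(text):
    """Parse tab-separated manifest text into a list of row dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    # Skip blank or malformed rows that carry no file identifier.
    return [row for row in reader if row.get("id")]


def data_url(file_id):
    """Build the download URL for one file identifier."""
    return f"{GDC_DATA_ENDPOINT}/{file_id}"


def md5_matches(payload, expected_md5):
    """Check downloaded bytes against the checksum listed in the manifest."""
    return hashlib.md5(payload).hexdigest() == expected_md5.lower()


def download_open_file(entry, timeout=60):
    """Fetch one open-access file into memory and verify its checksum."""
    with urllib.request.urlopen(data_url(entry["id"]), timeout=timeout) as resp:
        payload = resp.read()
    if not md5_matches(payload, entry["md5"]):
        raise ValueError(f"checksum mismatch for {entry['filename']}")
    return payload
```

A caller would read the manifest from disk, pass its text to `parse_manifest`, and call `download_open_file` on each entry. Because this sketch holds each file in memory, the DTT is likely the better fit for the larger archives.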
Controlled Access Data
- Download the appropriate manifest file from the publication page
- Download a token from the GDC Data Portal
- GDC Data Portal (Launch, User’s Guide)
- Use the manifest file and token to download data using the GDC DTT or the GDC API
- GDC DTT (Download, User’s Guide)
- GDC API (User’s Guide)
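The controlled-access steps above differ from the open-access case mainly in that a token accompanies each request. The sketch below shows two hedged ways to supply it: an API request carrying the token in an `X-Auth-Token` header, and a DTT command line using `gdc-client download` with `-m` (manifest), `-t` (token file), and `-d` (output directory). The endpoint URL, the header name, and the flags reflect the GDC documentation as best understood here and should be confirmed in the User's Guides; the helper names are hypothetical.

```python
import urllib.request
from pathlib import Path

# Assumed base URL of the GDC API data endpoint (verify in the User's Guide).
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def load_token(token_path):
    """Read the token downloaded from the GDC Data Portal, without whitespace."""
    return Path(token_path).read_text().strip()


def authenticated_request(file_id, token):
    """Build an API request for one file that carries the access token."""
    request = urllib.request.Request(f"{GDC_DATA_ENDPOINT}/{file_id}")
    # Header name assumed from the GDC API documentation.
    request.add_header("X-Auth-Token", token)
    return request


def dtt_command(manifest_path, token_path=None, out_dir=None):
    """Assemble a gdc-client argument list; the token is optional for open data."""
    command = ["gdc-client", "download", "-m", str(manifest_path)]
    if token_path is not None:
        command += ["-t", str(token_path)]
    if out_dir is not None:
        command += ["-d", str(out_dir)]
    return command
```

The request object can be passed to `urllib.request.urlopen`, and the argument list to `subprocess.run(command, check=True)`. Passing the token as a file path rather than embedding it in scripts keeps the credential out of shell history and version control.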
For assistance, please contact the GDC Help Desk: support@nci-gdc.datacommons.io.