Accessing Genomic Data
The GDC Data Portal is a groundbreaking tool that enables a better understanding of cancer biology by allowing researchers to:
Documentation
Data Access Tools
The GDC provides web-based tools and API endpoints for searching, viewing, and downloading data, as well as client tools for downloading large volumes of data.
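As a rough illustration of what searching through the API endpoints can look like, the sketch below builds a query against the public `files` endpoint at `https://api.gdc.cancer.gov`. The helper names (`build_files_params`, `build_files_url`, `fetch_json`), the chosen fields, and the example project and strategy values are assumptions made for illustration, not part of the text above; check the GDC API documentation before relying on any of them.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GDC_API = "https://api.gdc.cancer.gov"  # public GDC API base URL


def build_files_params(project_id, experimental_strategy, size=10):
    """Build query parameters for the /files endpoint (hypothetical helper)."""
    # GDC filters are nested JSON objects; "and" combines several conditions.
    filters = {
        "op": "and",
        "content": [
            {"op": "in",
             "content": {"field": "cases.project.project_id",
                         "value": [project_id]}},
            {"op": "in",
             "content": {"field": "experimental_strategy",
                         "value": [experimental_strategy]}},
        ],
    }
    return {
        "filters": json.dumps(filters),
        # Field selection is an illustrative choice, not a required set.
        "fields": "file_id,file_name,data_type,access",
        "format": "JSON",
        "size": str(size),
    }


def build_files_url(project_id, experimental_strategy, size=10):
    """Return the full search URL for the given project and strategy."""
    params = build_files_params(project_id, experimental_strategy, size)
    return f"{GDC_API}/files?{urlencode(params)}"


def fetch_json(url):
    """Perform the HTTP GET and decode the JSON body (needs network access)."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; call fetch_json(url) to actually run the search.
    print(build_files_url("TCGA-BRCA", "RNA-Seq", size=5))
```

Keeping URL construction separate from the network call makes the query easy to inspect or log before any request is sent.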
Controlled Data Access Policy
Any user requesting access to GDC controlled data must apply for that access through the database of Genotypes and Phenotypes (dbGaP).
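Once dbGaP access has been granted, programmatic downloads of controlled files are typically authenticated with a token sent in an `X-Auth-Token` header. The sketch below only constructs such a request. The helper name `build_download_request`, the placeholder UUID, and the placeholder token are hypothetical, and the token is assumed to have been obtained separately by an authorized user.

```python
from urllib.request import Request

GDC_API = "https://api.gdc.cancer.gov"  # public GDC API base URL


def build_download_request(file_id, token=None):
    """Build a GET request for one file from the /data endpoint.

    `token` is assumed to be a GDC authentication token held by a user
    with dbGaP authorization; open-access files need no token.
    """
    headers = {}
    if token:
        headers["X-Auth-Token"] = token
    return Request(f"{GDC_API}/data/{file_id}", headers=headers)


if __name__ == "__main__":
    # Placeholder values for illustration only; neither is a real identifier.
    req = build_download_request(
        "00000000-0000-0000-0000-000000000000", token="YOUR_TOKEN_HERE"
    )
    print(req.full_url)
    # Passing req to urllib.request.urlopen would start the actual download.
```

Tokens should be read from a protected file or environment variable rather than hard-coded, since they grant access to controlled data.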
High Quality Datasets
The GDC obtains datasets from NCI programs that maintain tissue collection strategies coupling quantity with quality. All data submitted to the GDC undergoes validation.
What’s New with the GDC and Cancer Research
Cancer Research Highlights and Publications:
From the GDC FAQ
What data types were updated in DR 32 (GENCODE Update Release)?
- RNA-Seq
  - Replaced all RNA-Seq data including: Alignments, Gene Expression (STAR) + New Normalization, Transcript Fusion
  - Removed HTSeq Files
  - Re-harmonized TCGA data to use the newer pipeline
- WXS/Targeted Sequencing
  - Generated and versioned new annotated somatic mutations and Ensemble MAFs
  - Re-harmonized TCGA data to use the newer pipeline (alignments + mutation calls)
- WGS
  - Generated and versioned structural variant and gene level copy number data
- Methylation
  - Re-harmonized TCGA methylation data to use the new SeSAMe pipeline
- scRNA-Seq
  - Generated and versioned CPTAC-3 scRNA-Seq data
- Other
  - Replaced gene level copy number files for TCGA with those harmonized using ASCAT
  - Replaced somatic mutation files for FM-AD and transitioned to aliquot-level MAFs
  - Replaced all GENIE files
Need Assistance?
Need help with data retrieval, download, or submission?