Accessing Genomic Data
The GDC Data Portal gives researchers access to genomic data in support of a better understanding of cancer biology.
Documentation
Data Access Tools
The GDC provides web-based tools and API endpoints for searching, viewing and downloading data as well as client tools for downloading large volumes of data.
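As a rough illustration of what searching through an API endpoint can look like, the sketch below builds a query for file metadata. The base URL, the `files` endpoint, the filter syntax, and the field names (`cases.project.project_id`, `data_format`) are assumptions not stated on this page, and the helper names `build_files_query` and `to_url` are hypothetical. Check them against the GDC API documentation before relying on them. The code only constructs the request and does not contact any server.

```python
import json
from urllib.parse import urlencode

# Assumed base URL of the public GDC API; not stated in the text above.
GDC_API = "https://api.gdc.cancer.gov"


def build_files_query(project_id, data_format, size=10):
    """Build query parameters for a file search (hypothetical helper).

    The filter structure and field names are assumptions about the API.
    """
    filters = {
        "op": "and",
        "content": [
            {"op": "in",
             "content": {"field": "cases.project.project_id",
                         "value": [project_id]}},
            {"op": "in",
             "content": {"field": "data_format",
                         "value": [data_format]}},
        ],
    }
    return {
        "filters": json.dumps(filters),        # filters travel as a JSON string
        "fields": "file_id,file_name,access",  # metadata fields to return
        "format": "JSON",
        "size": str(size),                     # maximum number of hits
    }


def to_url(params, endpoint="files"):
    """Join the assumed base URL, endpoint, and encoded parameters."""
    return f"{GDC_API}/{endpoint}?{urlencode(params)}"


params = build_files_query("TCGA-BRCA", "MAF")
url = to_url(params)
print(url)
```

The resulting URL could then be fetched with any HTTP client. For large volumes of data, a dedicated download client is likely the better fit, as noted above.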
Controlled Data Access Policy
Any user requesting access to GDC controlled data must apply for access through the database of Genotypes and Phenotypes (dbGaP).
High Quality Datasets
The GDC obtains datasets from NCI programs that maintain tissue collection strategies coupling quantity with quality. All data submitted to the GDC are validated.
What’s New with the GDC and Cancer Research
Cancer Research Highlights and Publications:
Why are there fewer open access TCGA mutations in DR 32 (GENCODE Update Release)?
The decrease in open-access mutations stems mainly from two changes that improve quality: 1) TCGA now uses a 2-caller ensemble instead of a single caller; 2) variants outside the target capture region are removed, where previously the filter used a combined “target capture + GAF exonic region”. Additionally, TCGA was the original project for which GDC open-access variants were produced, and its pipeline used variant rescue steps that applied only to TCGA. To keep the variant-calling pipeline consistent across projects, the GDC no longer rescues MC3 and TCGA validation variants.
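The effect of the two filters can be sketched with a toy example. This is not the GDC pipeline. It assumes that "2-caller ensemble" means a variant must be reported by at least two callers, and that the target capture region is a list of inclusive intervals. The `Variant` class, the helper functions, the caller names, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int            # 1-based position
    callers: frozenset  # names of the callers that reported this variant


def in_target(variant, target_regions):
    """True if the variant lies inside any (chrom, start, end) interval, inclusive."""
    return any(
        chrom == variant.chrom and start <= variant.pos <= end
        for chrom, start, end in target_regions
    )


def keep_open_access(variants, target_regions, min_callers=2):
    """Keep variants with enough supporting callers that fall in the target region.

    min_callers=2 encodes the assumed meaning of "2-caller ensemble".
    """
    return [
        v for v in variants
        if len(v.callers) >= min_callers and in_target(v, target_regions)
    ]


# Hypothetical toy data, for illustration only.
targets = [("chr1", 100, 200)]
variants = [
    Variant("chr1", 150, frozenset({"caller_a", "caller_b"})),  # kept
    Variant("chr1", 160, frozenset({"caller_a"})),              # dropped: one caller
    Variant("chr1", 500, frozenset({"caller_a", "caller_b"})),  # dropped: off target
]

kept = keep_open_access(variants, targets)
print([v.pos for v in kept])
```

Under these assumptions, either filter alone removes variants that a single-caller workflow with a wider region would have kept. That is consistent with the lower open-access mutation count described above.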
Need help with data retrieval, download, or submission?