
Access Data

Can GDC data be used in commercial and academic tools?

Submitted by gaheens on

Open-access GDC data can be used with proper attribution. Controlled-access data requires dbGaP access, and users must abide by the data use agreement (DUA) associated with the study. See the GDC Data Access and Sharing Policies, or contact the NCI Office of Data Sharing directly for further clarification: NCIOfficeofDataSharing@mail.nih.gov.

Why does TCGAbiolinks no longer work when retrieving diagnosis information?

Submitted by gaheens on

TCGA clinical data was expanded in GDC Data Releases 42 and 43. Previously, each TCGA case had exactly one diagnosis. After the expansion, a TCGA case can have multiple diagnoses, for example because of pre-enrollment diagnoses or for other reasons, so code that assumes a single diagnosis per case may break. To retrieve the diagnosis associated with the molecular data, select the records where the primary disease flag is set to true (i.e., diagnosis_is_primary_disease = true).
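
As a rough illustration of the filtering step described above, the following Python sketch keeps only the diagnosis records flagged as the primary disease. The record layout (a case dictionary with a "diagnoses" list), the helper name primary_diagnoses, and the example identifiers are assumptions made for illustration. Only the field name diagnosis_is_primary_disease comes from the answer above. The flag is compared case-insensitively as text because it may arrive as a boolean or as a string, depending on how the data was exported.

```python
def primary_diagnoses(diagnoses):
    """Return only the diagnosis records flagged as the primary disease.

    `diagnoses` is assumed to be a list of dicts, one per diagnosis.
    """
    def is_primary(record):
        flag = record.get("diagnosis_is_primary_disease")
        # Accept True as well as strings such as "true" or "True" (assumption).
        return str(flag).strip().lower() == "true"

    return [record for record in diagnoses if is_primary(record)]


# Hypothetical case with a pre-enrollment diagnosis plus the primary one.
example_case = {
    "submitter_id": "CASE-0001",
    "diagnoses": [
        {"diagnosis_id": "dx-a", "diagnosis_is_primary_disease": False},
        {"diagnosis_id": "dx-b", "diagnosis_is_primary_disease": True},
    ],
}

selected = primary_diagnoses(example_case["diagnoses"])
print([record["diagnosis_id"] for record in selected])
```

Filtering after retrieval, as here, keeps the other diagnoses available if they are needed later. Applying the same condition in the query itself would be an alternative.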

Why do some genes show no expression in STAR results across all samples, even though I can see mapped reads in the raw RNA-Seq data?

Submitted by gaheens on

STAR gene expression quantification excludes reads that map to more than one gene. As a result, a gene can show zero expression in the final counts even though mapped reads for it are present in the raw data.
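
The effect can be seen in a toy Python sketch. This is not STAR's implementation. It only mimics the rule stated above, that a read assigned to more than one gene is left out of the counts. The function name count_unambiguous, the gene names, and the read assignments are all hypothetical.

```python
def count_unambiguous(read_gene_hits, genes):
    """Count reads per gene, skipping reads assigned to more than one gene.

    `read_gene_hits` holds one set of gene names per read (toy input).
    """
    counts = {gene: 0 for gene in genes}
    for hits in read_gene_hits:
        if len(hits) == 1:
            # The read is assigned to exactly one gene, so it is counted.
            (gene,) = hits
            counts[gene] += 1
        # Reads with zero hits or several hits contribute to no gene.
    return counts


# Hypothetical data: every read touching GENE_B also touches GENE_C.
reads = [
    {"GENE_A"},
    {"GENE_A"},
    {"GENE_B", "GENE_C"},
    {"GENE_B", "GENE_C"},
    {"GENE_C"},
]

counts = count_unambiguous(reads, ["GENE_A", "GENE_B", "GENE_C"])
print(counts)
```

In this toy input GENE_B ends up with a count of zero even though two reads map to it, because both of those reads are also assigned to GENE_C.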
