Why does TCGAbiolinks no longer work when retrieving diagnosis information?

Submitted by gaheens on

TCGA clinical data was expanded in GDC Data Releases 42 and 43. Before the expansion, each TCGA case had exactly one diagnosis. A case can now have multiple diagnoses, for example because of pre-enrollment diagnoses or for other reasons, so tools that assume one diagnosis per case may no longer behave as expected. To retrieve the diagnosis information associated with the molecular data, select the diagnosis whose primary disease flag is set to true (i.e., diagnosis_is_primary_disease = true).
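
As a rough illustration of the filter described above, the sketch below keeps only the diagnoses flagged as the primary disease. It assumes the diagnoses for one case have already been loaded into a list of dictionaries; that layout, the `diagnosis_id` values, and the helper name `primary_diagnoses` are hypothetical. Only the field name `diagnosis_is_primary_disease` comes from the answer above. Because the flag may arrive as a boolean or as text depending on how the data was exported, the sketch accepts both.

```python
def primary_diagnoses(diagnoses):
    """Return only the diagnoses flagged as the primary disease.

    `diagnoses` is assumed to be a list of dicts, one per diagnosis,
    each optionally carrying a `diagnosis_is_primary_disease` field.
    """
    def is_primary(record):
        flag = record.get("diagnosis_is_primary_disease")
        # Accept a real boolean or a string such as "true" / "True".
        return flag is True or str(flag).strip().lower() == "true"

    return [record for record in diagnoses if is_primary(record)]


# Hypothetical case with two diagnoses (illustrative values only).
case_diagnoses = [
    {"diagnosis_id": "dx-1", "diagnosis_is_primary_disease": False},
    {"diagnosis_id": "dx-2", "diagnosis_is_primary_disease": True},
]

primary = primary_diagnoses(case_diagnoses)
print([record["diagnosis_id"] for record in primary])
```

Records that lack the flag are treated as non-primary here. That is a design choice of this sketch, not documented GDC behavior.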

Why do CNVs of different genes in the GDC Data Portal differ from other genomic portals?

Submitted by gaheens on

This discrepancy is due to differences between the data processing pipelines used by the GDC and those used by other genomic portals. At the GDC, gene-level CNVs are derived from several standardized pipelines. For TCGA projects, the CNV values are prioritized in the following order: SNP6 ABSOLUTE (LiftOver) > SNP6 ASCAT3 > WGS AscatNGS > SNP6 ASCAT2. All of these workflows produce absolute integer copy number values.
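
The priority order above can be read as "use the first workflow in the list that has results for a given sample". The sketch below shows that selection rule only. The workflow labels are copied from the answer above and may not match the exact strings used in GDC metadata; the function name `pick_workflow` and the input format are hypothetical.

```python
# Priority order for TCGA gene-level CNVs, highest priority first,
# as listed in the answer above (labels are not verified GDC strings).
WORKFLOW_PRIORITY = [
    "SNP6 ABSOLUTE (LiftOver)",
    "SNP6 ASCAT3",
    "WGS AscatNGS",
    "SNP6 ASCAT2",
]


def pick_workflow(available):
    """Return the highest-priority workflow present in `available`.

    `available` is any collection of workflow labels that have results
    for a sample. Returns None if none of the listed workflows is present.
    """
    for name in WORKFLOW_PRIORITY:
        if name in available:
            return name
    return None


# Hypothetical sample that has results from two workflows.
chosen = pick_workflow({"SNP6 ASCAT2", "WGS AscatNGS"})
print(chosen)
```

Two portals that start from the same sample can therefore report different gene-level values simply because they draw on different workflows.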