Analyze Data

Why do CNVs of different genes in the GDC Data Portal differ from other genomic portals?

Submitted by gaheens on Fri, 07/18/2025 - 15:33

This discrepancy is due to differences in the data processing pipelines used by the GDC and other genomic portals. At the GDC, gene-level CNVs are derived from a mix of standardized pipelines. For TCGA projects, the CNV values are prioritized in the following order: SNP6 ABSOLUTE (LiftOver) > SNP6 ASCAT3 > WGS AscatNGS > SNP6 ASCAT2. All of these workflows produce absolute integer copy number values.
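
The priority order above can be read as a simple "first available wins" rule. The following is a minimal sketch of that reading, not the GDC's actual implementation; the function name `pick_cnv_workflow` and the use of the workflow labels as plain strings are assumptions made for illustration.

```python
# Priority order for TCGA projects as stated above, highest priority first.
TCGA_CNV_PRIORITY = [
    "SNP6 ABSOLUTE (LiftOver)",
    "SNP6 ASCAT3",
    "WGS AscatNGS",
    "SNP6 ASCAT2",
]


def pick_cnv_workflow(available):
    """Return the highest-priority workflow present in `available`, or None.

    `available` is any iterable of workflow labels (hypothetical input shape).
    """
    available = set(available)
    for name in TCGA_CNV_PRIORITY:
        if name in available:
            return name
    return None


# A sample with only ASCAT2 and AscatNGS results would use AscatNGS.
print(pick_cnv_workflow(["SNP6 ASCAT2", "WGS AscatNGS"]))
```

Because a different workflow may be selected for different samples, gene-level values can differ from portals that rely on a single pipeline.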

How are the five categories of copy number changes determined?

Submitted by gaheens on Fri, 06/06/2025 - 16:32

The GDC begins with integer-level estimates of absolute copy number generated by either the ASCAT or ABSOLUTE pipeline. To establish a baseline, an integer-valued sample ploidy is computed as follows:

How does the GDC choose the default transcript for each variant?

Submitted by gaheens on Tue, 06/03/2025 - 14:15

When a mutation overlaps multiple transcripts or genes, the GDC annotates all consequences in the all_effects column of the MAF file and in the CONSEQUENCE table on the Mutation Summary Page. One transcript is then selected as the default for detailed annotation and visualization where a single consequence is shown.
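
To inspect every annotated consequence for a variant, the all_effects value can be split into one entry per transcript. The sketch below assumes the field is a semicolon-delimited list whose entries are comma-separated; check the MAF format documentation for the exact field order. The function name and the example values are hypothetical.

```python
def split_all_effects(all_effects):
    """Split an all_effects string into one list of fields per entry.

    Assumes ';' separates entries and ',' separates fields within an entry.
    Empty entries (for example from a trailing ';') are skipped.
    """
    return [entry.split(",") for entry in all_effects.split(";") if entry]


# Made-up value for illustration only; real entries carry more fields.
example = "GENE1,missense_variant,p.A1B,TX1;GENE2,intron_variant,,TX2"
for fields in split_all_effects(example):
    print(fields)
```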

New Single Cell RNA-Seq Tool and CNV Categories Available in the GDC Data Portal

The GDC 2.4 Röntgen Release brings a range of new features aimed at enhancing data analysis and user experience.

Why do some genes show no expression in STAR results across all samples, even though I can see mapped reads in the raw RNA-Seq data?

Submitted by gaheens on Tue, 02/04/2025 - 14:03

STAR gene expression quantification excludes reads that map to more than one gene. As a result, a gene can show zero expression in the final counts even when mapped reads for it are present in the raw data.
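
The effect can be illustrated with a toy counter. This is a simplified sketch of the counting rule described above, not STAR's implementation; the input structure (a mapping from read ID to the genes it is assigned to) and the gene names are assumptions.

```python
from collections import Counter


def count_unique_gene_reads(read_to_genes):
    """Count reads per gene, skipping reads assigned to more than one gene."""
    counts = Counter()
    for genes in read_to_genes.values():
        if len(set(genes)) == 1:
            counts[genes[0]] += 1
    return counts


# GENE_B has mapped reads, but every one of them is shared with GENE_A,
# so GENE_B ends up with a count of zero.
toy_reads = {
    "read1": ["GENE_A"],
    "read2": ["GENE_A", "GENE_B"],
    "read3": ["GENE_B", "GENE_A"],
}
counts = count_unique_gene_reads(toy_reads)
print(counts["GENE_A"], counts["GENE_B"])
```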

8000+ New WGS Variant Calls Now Available in the GDC!

The GDC is excited to announce the release of 8000+ new whole genome sequencing (WGS) variant calls as part of Data Release 42. Highlights of this release include:

New GDC Data Portal Features in the Pauling Release (GDC 2.3)

The latest GDC 2.3 Pauling Release introduces several new features designed to enhance user experience and improve data analysis.

Join the GDC Analysis Tool Challenge!

The NCI GDC Analysis Tool Challenge is a collaborative competition aimed at enhancing cancer research by integrating innovative analysis tools with the GDC.

In the GDC Data Portal, where is the histogram of top frequently mutated genes for a cohort?

Submitted by gaheens on Tue, 08/13/2024 - 14:14

To view the most frequently mutated genes for a cohort, first build the cohort using the Cohort Builder, then select the Mutation Frequency tool in the Analysis Center.

New Gene-Level Tool Enhancements and Gene Expression API Now in the GDC 2.2 McClintock Release!

In the GDC 2.2 McClintock Release, the following features have been released:

