Analyze Data

Why is the mutation frequency value higher (more mutations) in the OncoGrid than the # of Mutations listed on other pages for the same gene?

Submitted by Anonymous on

In the OncoGrid, mutation frequency is the total number of mutation occurrences in the gene (a total count), whereas the # of Mutations listed elsewhere in the GDC Portal is the number of unique mutations in a gene or within a particular cohort. A mutation observed in several cases is therefore counted once per case in the OncoGrid but only once in the # of Mutations.
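The difference between the two counts can be illustrated with a minimal Python sketch. The record layout, the case IDs, and the mutation IDs below are hypothetical and are not taken from the GDC; the function `mutation_counts` is likewise an illustrative name, not a GDC API.

```python
# Hypothetical records: one row per (case, gene, mutation) observation.
# The IDs are placeholders, not real GDC identifiers.
observations = [
    ("case-1", "TP53", "mutation-A"),
    ("case-2", "TP53", "mutation-A"),  # same mutation seen in a second case
    ("case-3", "TP53", "mutation-B"),
    ("case-3", "KRAS", "mutation-C"),
]


def mutation_counts(rows, gene):
    """Return (total occurrences, unique mutations) for one gene."""
    mutations = [mutation for _case, g, mutation in rows if g == gene]
    return len(mutations), len(set(mutations))


total, unique = mutation_counts(observations, "TP53")
print(total, unique)  # 3 2
```

Here the total count (3) exceeds the unique count (2) because `mutation-A` occurs in two cases.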

Why are there fewer cases in the 'Top Mutated Cancer Genes in Selected Projects' bar graph on the Project List Page than there are affected cases listed on each project page?

Fewer cases are displayed in the 'Top Mutated Cancer Genes in Selected Projects' bar graph on the Project List Page because the cases are filtered to those with mutations that 1) occur in genes in the Cancer Gene Census and 2) have a consequence type of {missense_variant, frameshift_variant, start_lost, stop_lost, initiator_codon_variant, stop_gained}.

In the OncoGrid, why are there fewer cases than there are cases listed as having mutations?

The cases in the OncoGrid are filtered by consequence type. Only cases with at least one mutation whose consequence type is one of {missense_variant, frameshift_variant, start_lost, stop_lost, initiator_codon_variant, stop_gained} are displayed in the OncoGrid.
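A rough Python sketch of this kind of filtering, assuming a hypothetical row layout of (case ID, gene, consequence type). The function name `displayed_cases` and the sample rows are illustrative only; the optional `genes` argument stands in for a gene list such as the Cancer Gene Census.

```python
# Consequence types named in the answers above.
DISPLAYED_CONSEQUENCES = frozenset({
    "missense_variant", "frameshift_variant", "start_lost",
    "stop_lost", "initiator_codon_variant", "stop_gained",
})


def displayed_cases(rows, genes=None):
    """Return the set of case IDs that survive the filters.

    rows:  iterable of (case_id, gene, consequence_type) tuples
           (a hypothetical layout, not a GDC file format).
    genes: optional set of gene symbols to restrict to.
    """
    return {
        case_id
        for case_id, gene, consequence in rows
        if consequence in DISPLAYED_CONSEQUENCES
        and (genes is None or gene in genes)
    }


rows = [
    ("case-1", "TP53", "missense_variant"),
    ("case-2", "TP53", "synonymous_variant"),  # filtered out by consequence type
    ("case-3", "TTN", "stop_gained"),
]
all_mutated = {case_id for case_id, _gene, _consequence in rows}

print(len(all_mutated))                          # 3
print(sorted(displayed_cases(rows)))             # ['case-1', 'case-3']
print(sorted(displayed_cases(rows, {"TP53"})))   # ['case-1']
```

All three cases have a mutation, but only two pass the consequence-type filter, and only one remains once the gene list is also applied.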

Why does the GDC display common genes such as TTN that are associated with every cancer in the most frequently mutated genes table?

The GDC does not currently normalize mutation frequency by gene length; this is under discussion. As a result, very long genes such as TTN, which tend to accumulate more mutations simply because of their size, appear in the most frequently mutated genes table. Users can filter by the COSMIC Cancer Gene Census to display only genes for which mutations have been causally implicated in cancer.
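To see why normalization by gene length matters, consider the following sketch. It is not something the GDC does, and it is only one simple way such a normalization could be defined; the gene names, counts, and lengths are invented purely for illustration.

```python
def mutated_cases_per_kb(mutated_cases, coding_length_bp):
    """Mutated-case count per kilobase of coding sequence (illustrative metric)."""
    return mutated_cases / (coding_length_bp / 1000)


# Invented numbers: a very long gene and a short gene.
genes = {
    "GENE_LONG": {"mutated_cases": 300, "coding_length_bp": 100_000},
    "GENE_SHORT": {"mutated_cases": 120, "coding_length_bp": 1_200},
}

# Rank by raw count, then by the length-normalized rate.
by_raw = sorted(genes, key=lambda g: genes[g]["mutated_cases"], reverse=True)
by_rate = sorted(genes, key=lambda g: mutated_cases_per_kb(**genes[g]), reverse=True)

print(by_raw)   # ['GENE_LONG', 'GENE_SHORT']
print(by_rate)  # ['GENE_SHORT', 'GENE_LONG']
```

Without normalization the long gene ranks first; with it, the short gene does.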

Why is the number of analyzed cases in the MAF header not equal to the number of cases displayed in the GDC Data Portal?

Within the GDC data analysis workflow, public (somatic) MAFs and protected MAFs are generated by the same pipeline and link back to the same cases. For example, for the TCGA-GBM project, the somatic MAF has the following header:

# in TCGA.GBM.muse.7e85de23-3855-4279-a3ac-a81827e4ccb6.DR6.0.somatic.maf.gz
#version gdc-1.0.0
#filedate 20170307
#n.analyzed.samples 393
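The header above can be read programmatically to retrieve values such as `n.analyzed.samples`. The following is a minimal Python sketch, assuming header lines have the form `#key value` and that a line starting with `# ` (hash followed by a space) is a free-form comment. The function name `parse_maf_header` is hypothetical, and the column line in the sample is abbreviated.

```python
def parse_maf_header(lines):
    """Collect '#key value' pairs from the top of a MAF file into a dict."""
    header = {}
    for line in lines:
        if not line.startswith("#"):
            break  # the header ends at the first non-comment line
        body = line[1:]
        if body[:1].isspace():
            continue  # free-form comment such as '# in <file name>'
        key, _, value = body.strip().partition(" ")
        if value:
            header[key] = value.strip()
    return header


sample = [
    "# in TCGA.GBM.muse.7e85de23-3855-4279-a3ac-a81827e4ccb6.DR6.0.somatic.maf.gz",
    "#version gdc-1.0.0",
    "#filedate 20170307",
    "#n.analyzed.samples 393",
    "Hugo_Symbol\tEntrez_Gene_Id",  # abbreviated column line, ends the header
]

header = parse_maf_header(sample)
print(int(header["n.analyzed.samples"]))  # 393
```

For a gzipped MAF on disk, the same function could be fed the lines from `gzip.open(path, "rt")`.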