Analyze Data

Why does the GDC display common genes such as TTN that are associated with every cancer in the most frequently mutated genes table?

Submitted by Anonymous on

The GDC does not currently normalize mutation frequency by gene length; whether to do so is under discussion. Because very long genes such as TTN accumulate more mutations simply by offering more sequence to mutate, they appear in the most frequently mutated genes table. Users can filter by the COSMIC Cancer Gene Census to display only genes for which mutations have been causally implicated in cancer.
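To illustrate why gene length matters, the following is a minimal sketch that compares a raw mutation count with a count normalized per kilobase of coding sequence. The mutation counts are invented toy values and the gene lengths are rough approximations, not GDC data. The helper `mutations_per_kb` is a hypothetical name, and this is not how the GDC computes its table.

```python
# Toy illustration only: counts are invented, lengths are rough approximations.
genes = {
    # symbol: (mutation_count, approx_coding_length_bp)
    "TTN": (300, 100_000),
    "TP53": (150, 1_200),
}


def mutations_per_kb(count, length_bp):
    """Return the number of mutations per kilobase of coding sequence."""
    return count / (length_bp / 1000)


# Rank by raw count, then by length-normalized count.
raw_ranking = sorted(genes, key=lambda g: genes[g][0], reverse=True)
normalized_ranking = sorted(
    genes, key=lambda g: mutations_per_kb(*genes[g]), reverse=True
)

print(raw_ranking)         # ['TTN', 'TP53']
print(normalized_ranking)  # ['TP53', 'TTN']
```

With these toy numbers the long gene leads the raw ranking but drops below the short gene once counts are divided by length.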

On the GDC Project summary page or the Mutation Frequency > Gene tab, why is the # Mutations sometimes lower than the # Affected Cases?

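The “# Mutations” column on the Project summary page and in the Mutation Frequency > Gene tab displays the number of distinct (unique) mutations found within the affected cases, not the total number of mutation occurrences within the project or query filter. A mutation shared by several cases is counted once, so the number of distinct mutations can be lower than the number of affected cases.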
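The difference between the two counts can be sketched with a few made-up observations. The case and mutation identifiers below are hypothetical placeholders, not GDC records.

```python
# Hypothetical (case_id, mutation_id) observations; not real GDC data.
observations = [
    ("case-1", "mutation-A"),
    ("case-2", "mutation-A"),
    ("case-3", "mutation-A"),
    ("case-4", "mutation-B"),
]

# Each case and each mutation is counted once, however often it recurs.
affected_cases = {case for case, _ in observations}
distinct_mutations = {mutation for _, mutation in observations}

print(len(affected_cases))      # 4
print(len(distinct_mutations))  # 2
```

Here three cases share the same mutation, so four affected cases yield only two distinct mutations.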

Can I use the GDC Application Programming Interface (API) to retrieve data sets associated with visualizations?

Yes. The GDC provides additional analysis endpoints to retrieve data sets associated with visualizations. Analysis endpoints include: survival, top_cases_counts_by_genes, top_mutated_genes_by_project, top_mutated_cases_by_gene, top_mutated_cases_by_ssm, and mutated_cases_count_by_project.

Please refer to the GDC API User's Guide Analysis Section for additional information.
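As a rough sketch of how one of these endpoints might be called from Python, the example below builds a request URL with the standard library. The base URL `https://api.gdc.cancer.gov`, the `/analysis/` path prefix, the `gene_ids` parameter, and the Ensembl ID used for TTN are assumptions recalled from the User's Guide and should be checked against it. The helper names `analysis_url` and `fetch_analysis` are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GDC_API = "https://api.gdc.cancer.gov"  # assumed public GDC API base URL


def analysis_url(endpoint, **params):
    """Build the URL for a GDC analysis endpoint.

    Dict and list values (for example a `filters` object) are JSON-encoded
    before being placed in the query string.
    """
    encoded = {
        key: json.dumps(value) if isinstance(value, (dict, list)) else value
        for key, value in params.items()
    }
    return f"{GDC_API}/analysis/{endpoint}?{urlencode(encoded)}"


def fetch_analysis(endpoint, **params):
    """Send the GET request and return the decoded JSON response."""
    with urlopen(analysis_url(endpoint, **params)) as response:
        return json.load(response)


# Build (but do not send) a request for the cases most affected by TTN.
url = analysis_url("top_cases_counts_by_genes", gene_ids="ENSG00000155657")
print(url)
```

Calling `fetch_analysis("top_cases_counts_by_genes", gene_ids="ENSG00000155657")` would send the request. The structure of the response is described in the User's Guide and is not assumed here.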

In the Most Frequent Mutations table, which VEP algorithm does the GDC use to assign a VEP impact of “H” or “M”?

The VEP IMPACT rating is determined by the Sequence Ontology (SO) consequence type of each variant. It is a separate rating provided for compatibility with other variant annotation tools (e.g., snpEff). Each IMPACT category (HIGH, MODERATE, LOW, or MODIFIER) is associated with a fixed set of SO terms.
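The lookup from SO term to IMPACT category can be sketched as a simple table. The mapping below is deliberately partial and is recalled from the Ensembl VEP documentation, which remains the authoritative and version-specific source. The names `IMPACT_BY_SO_TERM` and `vep_impact` are hypothetical.

```python
# Partial, illustrative mapping of SO consequence terms to VEP IMPACT
# categories; consult the Ensembl VEP documentation for the complete list.
IMPACT_BY_SO_TERM = {
    "stop_gained": "HIGH",
    "frameshift_variant": "HIGH",
    "splice_acceptor_variant": "HIGH",
    "splice_donor_variant": "HIGH",
    "missense_variant": "MODERATE",
    "inframe_insertion": "MODERATE",
    "inframe_deletion": "MODERATE",
    "synonymous_variant": "LOW",
    "intron_variant": "MODIFIER",
    "upstream_gene_variant": "MODIFIER",
}


def vep_impact(so_term):
    """Return the IMPACT category, or None for terms outside this partial table."""
    return IMPACT_BY_SO_TERM.get(so_term)


print(vep_impact("stop_gained"))       # HIGH
print(vep_impact("missense_variant"))  # MODERATE
```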

Why are there some projects without data analysis and visualization features?

Data analysis and visualization features are only available for projects which maintain open-access MAF files. Programs such as the Molecular Analysis for Therapy Choice (MATCH) trial maintain controlled-access MAF files only. As such, data analysis and visualization cannot be applied to MATCH projects.
