Analyze Data

GDC Webinar: Analyzing Data using GDC Data Analysis, Visualization, and Exploration (DAVE) Tools

The "Analyzing Data using GDC Data Analysis, Visualization, and Exploration (DAVE) Tools" webinar introduces users to the GDC tools for analyzing data from cancer genomic studies.

How do I search for a particular mutation?

Submitted by Anonymous on Mon, 06/19/2017 - 09:42

To search for a mutation, use the Quick Search bar at the top right of the GDC Portal and enter either a dbSNP reference cluster ID (rs#) or the coordinates of the chromosomal change. For example, entering 'rs121912651' or 'chr17:g.7674221G>A' takes you to the mutation entity page for that mutation.
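
A script that prepares such search terms may want to check which of the two forms it has before using it. The following is a minimal sketch, not the portal's own parser: the function name and both regular expressions are assumptions, and they cover only the two example forms shown above (an rs ID and a single-base substitution).

```python
import re

# Hypothetical patterns: they only cover the two example forms above.
RS_ID = re.compile(r"^rs\d+$")
GENOMIC_SNV = re.compile(r"^(chr[0-9XYM]+):g\.(\d+)([ACGT])>([ACGT])$")


def classify_mutation_query(query):
    """Classify a search term as a dbSNP rs ID or a genomic substitution."""
    query = query.strip()
    if RS_ID.match(query):
        return {"kind": "dbsnp_rs", "rs_id": query}
    match = GENOMIC_SNV.match(query)
    if match:
        chromosome, position, ref, alt = match.groups()
        return {
            "kind": "genomic_change",
            "chromosome": chromosome,
            "position": int(position),
            "ref": ref,
            "alt": alt,
        }
    # Anything else (indels, protein-level changes, ...) is not handled here.
    return {"kind": "unknown", "query": query}
```

Terms classified as "unknown" are not necessarily invalid; they are simply outside what this sketch recognizes.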

Why does the GDC display common genes such as TTN that are associated with every cancer in the most frequently mutated genes table?

Submitted by Anonymous on Wed, 06/14/2017 - 15:36

The GDC does not currently normalize mutation frequency by gene length; whether to do so is under discussion. As a result, very long genes such as TTN, which accumulate more mutations simply because of their size, appear in the most frequently mutated genes table. Users can filter by the COSMIC Cancer Gene Census to display only genes for which mutations have been causally implicated in cancer.
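
To see why gene length matters, mutation counts can be compared with counts per kilobase. This is only an illustrative sketch of one possible normalization, not something the GDC applies; the function name, gene labels, and numbers are all invented for the example.

```python
def mutations_per_kb(mutation_count, coding_length_bp):
    """Return mutations per kilobase of coding sequence."""
    if coding_length_bp <= 0:
        raise ValueError("coding_length_bp must be positive")
    return mutation_count / (coding_length_bp / 1000)


# Toy values, invented for illustration only (not GDC data).
genes = {
    "LONG_GENE": {"mutations": 300, "coding_length_bp": 100_000},
    "SHORT_GENE": {"mutations": 60, "coding_length_bp": 1_500},
}

# Ranking by raw count favors the long gene ...
ranked_raw = sorted(genes, key=lambda g: genes[g]["mutations"], reverse=True)

# ... while ranking by mutations per kb favors the short, densely mutated one.
ranked_normalized = sorted(
    genes,
    key=lambda g: mutations_per_kb(
        genes[g]["mutations"], genes[g]["coding_length_bp"]
    ),
    reverse=True,
)
```

With these toy numbers the long gene leads the raw ranking, but the short gene leads once counts are divided by length.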

On the GDC Project summary page or Mutation Frequency > Gene tab, why is the # of Mutations sometimes less than the # Affected Cases?

Submitted by Anonymous on Wed, 06/14/2017 - 15:28

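The “# Mutations” column in the Project or Mutation Frequency > Gene tab displays the number of distinct (unique) mutations observed in the affected cases, not the total number of mutation occurrences within the project or query filter. Because the same mutation can occur in many cases, the number of distinct mutations can be smaller than the number of affected cases.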
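
The difference between the two counts can be shown with a small sketch. The case and mutation labels below are invented placeholders, and the function is a hypothetical illustration rather than the GDC's implementation.

```python
# Toy (case, mutation) pairs, invented for illustration only.
observations = [
    ("case_1", "mutation_A"),
    ("case_2", "mutation_A"),
    ("case_3", "mutation_A"),
    ("case_3", "mutation_B"),
]


def summarize(pairs):
    """Count affected cases, distinct mutations, and total occurrences."""
    cases = {case for case, _ in pairs}
    mutations = {mutation for _, mutation in pairs}
    return {
        "affected_cases": len(cases),
        "distinct_mutations": len(mutations),
        "total_occurrences": len(pairs),
    }


summary = summarize(observations)
```

Here three cases are affected but only two distinct mutations exist, because mutation_A recurs in all three cases.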

The NCI GDC Officially Launched DAVE: Data Analysis, Visualization, and Exploration

The NCI's Genomic Data Commons (GDC) officially launched Data Analysis, Visualization, and Exploration (DAVE) tools, transforming the GDC from a cancer genomics data repository into an interactive k

Where can I find information on the format of GDC MAF Files?

Submitted by Anonymous on Wed, 06/07/2017 - 09:35

Please refer to the GDC MAF File Specification to obtain detailed information on the format of GDC MAF files.

Can I use the GDC Application Programming Interface (API) to retrieve data sets associated with visualizations?

Submitted by Anonymous on Wed, 06/07/2017 - 09:33

Yes. The GDC provides additional analysis endpoints to retrieve data sets associated with visualizations. Analysis endpoints include: survival, top_cases_counts_by_genes, top_mutated_genes_by_project, top_mutated_cases_by_gene, top_mutated_cases_by_ssm, and mutated_cases_count_by_project.

Please refer to the GDC API User's Guide Analysis Section for additional information.
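
As a rough sketch of how a request to one of these endpoints might be assembled, the helper below builds a URL without sending it. The base URL, the `/analysis/` path prefix, the `filters` and `gene_ids` parameter names, and the helper itself are assumptions to be checked against the GDC API User's Guide; only the endpoint names come from the answer above.

```python
import json
from urllib.parse import urlencode

# Assumed base URL; confirm against the GDC API User's Guide.
GDC_API_BASE = "https://api.gdc.cancer.gov"

# Endpoint names as listed in the answer above.
ANALYSIS_ENDPOINTS = {
    "survival",
    "top_cases_counts_by_genes",
    "top_mutated_genes_by_project",
    "top_mutated_cases_by_gene",
    "top_mutated_cases_by_ssm",
    "mutated_cases_count_by_project",
}


def build_analysis_url(endpoint, filters=None, **params):
    """Build (but do not send) a URL for an analysis endpoint."""
    if endpoint not in ANALYSIS_ENDPOINTS:
        raise ValueError(f"unknown analysis endpoint: {endpoint}")
    query = dict(params)
    if filters is not None:
        # Filters are passed as a JSON-encoded query parameter (assumed).
        query["filters"] = json.dumps(filters)
    url = f"{GDC_API_BASE}/analysis/{endpoint}"
    return f"{url}?{urlencode(query)}" if query else url


# Example: the gene ID is used purely as a sample Ensembl identifier.
example_url = build_analysis_url(
    "top_cases_counts_by_genes", gene_ids="ENSG00000155657"
)
```

The resulting URL can then be fetched with any HTTP client; the response format for each endpoint is described in the User's Guide rather than assumed here.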

How is survival analysis calculated?

Submitted by Anonymous on Tue, 06/06/2017 - 12:29

In the Most Frequent Mutations table for the VEP impact score, which algorithm in the VEP is the GDC using to determine “H” or “M”?

Submitted by Anonymous on Tue, 06/06/2017 - 12:21

The IMPACT is assigned according to the Sequence Ontology (SO) type of the variant's consequence. The VEP IMPACT rating is a separate rating provided for compatibility with other variant annotation tools (e.g. snpEff). Each IMPACT category is associated with a set of SO terms:
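
The idea of mapping SO terms to IMPACT categories can be sketched as a lookup table. The mapping below is a partial subset based on Ensembl's published VEP consequence table, and the function name and the "most severe wins" rule are assumptions for illustration; the exact list used by the GDC may differ and should be checked against the VEP documentation.

```python
# Partial, illustrative subset of SO term -> VEP IMPACT category.
IMPACT_BY_SO_TERM = {
    "stop_gained": "HIGH",
    "frameshift_variant": "HIGH",
    "splice_donor_variant": "HIGH",
    "splice_acceptor_variant": "HIGH",
    "missense_variant": "MODERATE",
    "inframe_insertion": "MODERATE",
    "inframe_deletion": "MODERATE",
    "synonymous_variant": "LOW",
    "splice_region_variant": "LOW",
    "intron_variant": "MODIFIER",
    "intergenic_variant": "MODIFIER",
}

# Categories ordered from least to most severe.
SEVERITY_ORDER = ["MODIFIER", "LOW", "MODERATE", "HIGH"]


def most_severe_impact(so_terms):
    """Return the most severe IMPACT among the given SO terms, or None."""
    impacts = [IMPACT_BY_SO_TERM[t] for t in so_terms if t in IMPACT_BY_SO_TERM]
    if not impacts:
        return None  # none of the terms are in this partial table
    return max(impacts, key=SEVERITY_ORDER.index)
```

For a variant annotated with both an intron and a missense consequence, this sketch reports MODERATE, since that is the more severe of the two categories.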

Why are there some projects without data analysis and visualization features?

Submitted by Anonymous on Tue, 06/06/2017 - 12:18

Data analysis and visualization features are available only for projects that have open-access MAF files. Programs such as the Molecular Analysis for Therapy Choice (MATCH) trial provide controlled-access MAF files only, so data analysis and visualization cannot be applied to MATCH projects.
