Main Content

Access Data

Why did the GDC remove SomaticSniper?

Submitted by Anonymous on

The SomaticSniper whole exome variant caller was one of the first generation somatic mutation callers developed by the scientific community. It works the best with blood cancer that has high level of tumor-in-normal contaminations, but is often overly permissive for solid tumors. Since our first data release in 2016, the GDC has gradually adopted newer tools or new tool versions, and has transited the focus of somatic variant calling from any single caller to multi-caller ensemble.

Does the GDC provide access to germline variants?

Submitted by Anonymous on

Any germline SNP calls are not available for exploration in the GDC Data Portal. Instead, alignments for germline data are available under controlled access. Users with appropriate access may use the alignments to generate germline variants.

Some somatic variants callers, such as MuTect2, also output somatic calls with some level of germline possibilities, such as those labelled as "germline_risk". Please note that these calls are, by no means, germline variants. They are somatic calls with boundary probability of germline risks.

How often does the GDC update the workflow/reference genome? If the GDC updates the workflow/reference genome, does the GDC re-process all data sets?

Submitted by Anonymous on

For the reference genome, the GDC has been using an augmented version of GRCh38.p2 (with additional decoy sequences and virus sequences) since inception. The GDC does not use alternative contigs, and only derives high-level data from the major chromosomes, so the same reference genome is used for both gene model GENCODE v22 (from Data Release 1 to 31) and GENCODE v36 (from Data Release 32). As future versions of the reference genome are released, e.g., GRCh39, the GDC will evaluate the benefits of updating data to utilize the new version.

Subscribe to Access Data