The GDC provides an array of interactive, web-based Analysis Tools for performing in-depth gene- and variant-level analyses. The workflow is cohort-centric, meaning analyses are specific to a researcher's cohort of interest.
Documentation
Data Analysis Tools
The GDC provides interactive, cohort-centric tools for analyzing genomic and clinical data.
Data Analysis Policy
The GDC provides policies and guidelines for the appropriate use of data, whether open- or controlled-access.
Data Harmonization and Generation
The GDC develops best-in-practice pipelines for processing data from the most common molecular platforms. Variant calling, gene expression analysis, and other pipelines are implemented using software and algorithms selected in consultation with experts in the genomics community.
In the Most Frequent Mutations table for the VEP impact score, which algorithm in the VEP is the GDC using to determine "H" or "M"?
The IMPACT rating is determined by the Sequence Ontology (SO) consequence type of the variant. It is a separate rating that VEP provides for compatibility with other variant annotation tools (e.g. snpEff). Each category is associated with a set of SO terms:
- HIGH: The variant is assumed to have high (disruptive) impact on the protein, probably causing protein truncation, loss of function, or triggering nonsense-mediated decay: transcript_ablation, splice_acceptor_variant, splice_donor_variant, stop_gained, frameshift_variant, stop_lost, start_lost, transcript_amplification
- MODERATE: A non-disruptive variant that might change protein effectiveness: inframe_insertion, inframe_deletion, missense_variant, protein_altering_variant, regulatory_region_ablation
- LOW: Assumed to be mostly harmless or unlikely to change protein behavior: splice_region_variant, incomplete_terminal_codon_variant, stop_retained_variant, synonymous_variant
- MODIFIER: Usually non-coding variants or variants affecting non-coding genes, where predictions are difficult or there is no evidence of impact: coding_sequence_variant, mature_miRNA_variant, 5_prime_UTR_variant, 3_prime_UTR_variant, non_coding_transcript_exon_variant, intron_variant, NMD_transcript_variant, non_coding_transcript_variant, upstream_gene_variant, downstream_gene_variant, TFBS_ablation, TFBS_amplification, TF_binding_site_variant, regulatory_region_amplification, feature_elongation, regulatory_region_variant, feature_truncation, intergenic_variant
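The category-to-term mapping above can be expressed as a simple lookup. The following is a minimal sketch in Python, not GDC or VEP code: the names `IMPACT_TERMS`, `SEVERITY`, and `impact_for` are hypothetical, the term lists are copied from the categories above, and it assumes that a variant annotated with several consequence terms joined by `&` (as in VEP's VCF output) should receive the most severe matching category.

```python
from typing import Optional

# SO consequence terms per IMPACT category, copied from the list above.
IMPACT_TERMS = {
    "HIGH": [
        "transcript_ablation", "splice_acceptor_variant", "splice_donor_variant",
        "stop_gained", "frameshift_variant", "stop_lost", "start_lost",
        "transcript_amplification",
    ],
    "MODERATE": [
        "inframe_insertion", "inframe_deletion", "missense_variant",
        "protein_altering_variant", "regulatory_region_ablation",
    ],
    "LOW": [
        "splice_region_variant", "incomplete_terminal_codon_variant",
        "stop_retained_variant", "synonymous_variant",
    ],
    "MODIFIER": [
        "coding_sequence_variant", "mature_miRNA_variant", "5_prime_UTR_variant",
        "3_prime_UTR_variant", "non_coding_transcript_exon_variant",
        "intron_variant", "NMD_transcript_variant", "non_coding_transcript_variant",
        "upstream_gene_variant", "downstream_gene_variant", "TFBS_ablation",
        "TFBS_amplification", "TF_binding_site_variant",
        "regulatory_region_amplification", "feature_elongation",
        "regulatory_region_variant", "feature_truncation", "intergenic_variant",
    ],
}

# Assumed ranking for this sketch, most severe first.
SEVERITY = ["HIGH", "MODERATE", "LOW", "MODIFIER"]

# Invert the table so each SO term points at its category.
TERM_TO_IMPACT = {
    term: impact for impact, terms in IMPACT_TERMS.items() for term in terms
}


def impact_for(consequence: str) -> Optional[str]:
    """Return the most severe IMPACT among '&'-joined SO terms.

    Returns None when no term in the string is in the table.
    """
    found = {
        TERM_TO_IMPACT[term]
        for term in consequence.split("&")
        if term in TERM_TO_IMPACT
    }
    for level in SEVERITY:
        if level in found:
            return level
    return None
```

For example, `impact_for("missense_variant")` gives `"MODERATE"`, and `impact_for("splice_region_variant&stop_gained")` gives `"HIGH"` because the most severe of the two matching categories wins under the assumption stated above.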
Details about predicted variant data are available from Ensembl.
Need help with data retrieval, download, or submission?