Generated Data Types and File Formats

The following table lists analyses and data files generated by the GDC. Only data files are made available through GDC data access tools. Not all programs, projects, or cases will have data available for all entities.

Entity Category | Entity Name | Access (Open, Controlled) | File Format | File Metadata Template
--------------- | ----------- | ------------------------- | ----------- | ----------------------
Analysis | Read Group QC | -- | -- | TSV, JSON
Analysis | Alignment + Co-cleaning | -- | -- | TSV, JSON
Analysis | Alignment | -- | -- | TSV, JSON
Analysis | Genomic Profile Harmonization | -- | -- | TSV, JSON
Analysis | RNA Expression | -- | -- | TSV, JSON
Analysis | miRNA Expression | -- | -- | TSV, JSON
Analysis | Germline Mutation Calling | -- | -- | TSV, JSON
Analysis | Somatic Mutation Calling | -- | -- | TSV, JSON
Analysis | Structural Variation Calling | -- | -- | TSV, JSON
Data File | Aggregated Somatic Mutation | Controlled | MAF | TSV, JSON
Data File | Aligned Reads | Controlled | BAM | TSV, JSON
Data File | Gene Expression | Open | TSV | TSV, JSON
Data File | Masked Somatic Mutation | Open | MAF | TSV, JSON
Data File | miRNA Expression | Open | TSV | TSV, JSON
Data File | Simple Germline Variation | Controlled | VCF | TSV, JSON
Data File | Structural Variation | Controlled | TSV | TSV, JSON
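
For scripts that need to know which generated data files are open or controlled, the Data File rows of the table can be held as a small lookup structure. The following is a minimal sketch, not part of any GDC tool: the names DataFileType, DATA_FILE_TYPES, and names_by_access are hypothetical and exist only for this illustration, and the rows are copied from the table above.

```python
from typing import List, NamedTuple


class DataFileType(NamedTuple):
    """One Data File row from the table (hypothetical helper type)."""
    name: str         # Entity Name column
    access: str       # Access column: "Open" or "Controlled"
    file_format: str  # File Format column


# Rows copied from the Data File entries of the table above.
DATA_FILE_TYPES = [
    DataFileType("Aggregated Somatic Mutation", "Controlled", "MAF"),
    DataFileType("Aligned Reads", "Controlled", "BAM"),
    DataFileType("Gene Expression", "Open", "TSV"),
    DataFileType("Masked Somatic Mutation", "Open", "MAF"),
    DataFileType("miRNA Expression", "Open", "TSV"),
    DataFileType("Simple Germline Variation", "Controlled", "VCF"),
    DataFileType("Structural Variation", "Controlled", "TSV"),
]


def names_by_access(access: str) -> List[str]:
    """Return the entity names whose Access column equals `access`."""
    return [t.name for t in DATA_FILE_TYPES if t.access == access]


if __name__ == "__main__":
    print("Open:", names_by_access("Open"))
    print("Controlled:", names_by_access("Controlled"))
```

Analysis entities are left out of the sketch because the table lists no access level or file format for them.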

GDC Data Model for Generated Files

The following figure displays the relationship between the different data model entities generated by the GDC. Arrows point to the parent entity. The complete GDC Data Model can be viewed here.

Figure: GDC Data Model - v1.12 - Generated