Imported Data Types and File Formats

The following table lists the data type and data subtype categories used to classify imported data files available to users through the GDC. Not all programs, projects, or cases will have data available for all types.

Data Type                    Data Subtype                        Format
---------------------------  ----------------------------------  ------
Raw Sequencing Data          Aligned Reads                       BAM
                             Unaligned Reads                     FASTQ
Simple Nucleotide Variation  Genotypes                           TSV
                             Simple Germline Variation           MAF, VCF
                             Simple Somatic Mutation
                             Simple Nucleotide Variation
Raw Microarray Data          Raw Intensities                     TSV
                             CGH Array QC
                             Intensities Log2Ratio
                             Expression Control
                             Normalized Intensities
                             Probeset Summary
                             Methylation Array QC Metrics
Gene Expression              Gene Expression Quantification      TSV
                             miRNA Quantification
                             Isoform Expression Quantification
                             Exon Junction Quantification
                             Exon Quantification
                             Gene Expression Summary
Structural Rearrangement     Structural Germline Variation       VCF, FASTA
                             Structural Variation                VCF, FASTA
DNA Methylation              Bisulfite Sequence Alignment        BAM
                             Methylation Beta Value              TSV
                             Methylation Percentage
Clinical                     Clinical Data                       XML
                             Biospecimen Data
                             Tissue Slide Image                  SVS
                             Diagnostic Image
                             Pathology Report                    PDF
Copy Number Variation        Copy Number Segmentation            TSV
                             Copy Number Estimate
                             Copy Number Germline Variation
                             Copy Number QC Metrics
                             Copy Number Variation
                             Normalized Copy Numbers
                             Copy Number Summary
                             Probeset Call
Protein Expression           Protein Expression Quantification   TSV
                             Protein Expression Control
Other                        Microsatellite Instability          FSA
                             ABI Sequence Trace                  TR
                             Auxiliary Test
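
For scripts that need to check which file format to expect for a given category, the table can be encoded as a small lookup. The following is a minimal sketch, not part of any GDC tool or API: the names `LISTED_FORMATS` and `formats_for` are hypothetical, only a few rows are included, and only rows whose format is stated explicitly in the table are used. It also assumes that entries listing two formats (MAF and VCF, or VCF and FASTA) mean that either format may occur.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup: (data type, data subtype) -> formats listed in the table.
# Only rows with an explicitly stated format are included here.
LISTED_FORMATS: Dict[Tuple[str, str], Tuple[str, ...]] = {
    ("Raw Sequencing Data", "Aligned Reads"): ("BAM",),
    ("Raw Sequencing Data", "Unaligned Reads"): ("FASTQ",),
    ("Simple Nucleotide Variation", "Genotypes"): ("TSV",),
    ("Simple Nucleotide Variation", "Simple Germline Variation"): ("MAF", "VCF"),
    ("Structural Rearrangement", "Structural Variation"): ("VCF", "FASTA"),
    ("DNA Methylation", "Methylation Beta Value"): ("TSV",),
    ("Clinical", "Clinical Data"): ("XML",),
    ("Clinical", "Pathology Report"): ("PDF",),
}

# Case-insensitive index so that capitalization differences do not matter.
_INDEX = {
    (data_type.casefold(), subtype.casefold()): formats
    for (data_type, subtype), formats in LISTED_FORMATS.items()
}


def formats_for(data_type: str, data_subtype: str) -> Optional[Tuple[str, ...]]:
    """Return the listed formats, or None if this sketch has no explicit entry."""
    return _INDEX.get((data_type.casefold(), data_subtype.casefold()))


if __name__ == "__main__":
    print(formats_for("Raw Sequencing Data", "Aligned Reads"))
    print(formats_for("Clinical", "Biospecimen Data"))
```

Returning `None` for categories without an explicit format keeps the sketch from guessing at rows where the table leaves the format cell empty.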