The following table lists data type and data subtype categories used to classify imported data files available to users through GDC. Not all programs, projects, or cases will have data available for all types.
Imported Data Types and File Formats
| Data Type | Data Subtype | Format |
|---|---|---|
| Raw Sequencing Data | Aligned Reads | BAM |
| | Unaligned Reads | FASTQ |
| | Coverage WIG | WIGGLE |
| Simple Nucleotide Variation | Genotypes | TSV |
| | Simple Germline Variation | MAF, VCF |
| | Simple Somatic Mutation | |
| | Simple Nucleotide Variation | |
| Raw Microarray Data | Raw Intensities | TSV |
| | CGH Array QC | |
| | Intensities Log2Ratio | |
| | Expression Control | |
| | Intensities | |
| | Normalized Intensities | |
| | Probeset Summary | |
| | Methylation Array QC Metrics | |
| Gene Expression | Gene Expression Quantification | TSV |
| | miRNA Quantification | |
| | Isoform Expression Quantification | |
| | Exon Junction Quantification | |
| | Exon Quantification | |
| | Gene Expression Summary | |
| Structural Rearrangement | Structural Germline Variation | VCF, FASTA |
| | Structural Variation | VCF, FASTA |
| DNA Methylation | Bisulfite Sequence Alignment | BAM |
| | Methylation Beta Value | TSV |
| | Methylation Percentage | |
| Clinical | Clinical Data | XML |
| | Biospecimen Data | |
| | Tissue Slide Image | SVS |
| | Diagnostic Image | |
| | Pathology Report | |
| Copy Number Variation | Copy Number Segmentation | TSV |
| | Copy Number Estimate | |
| | Copy Number Germline Variation | |
| | LOH | |
| | Copy Number QC Metrics | |
| | Copy Number Variation | |
| | Normalized Copy Numbers | |
| | Copy Number Summary | |
| | Probeset Call | |
| Protein Expression | Protein Expression Quantification | TSV |
| | Protein Expression Control | |
| Other | Microsatellite Instability | FSA |
| | ABI Sequence Trace | TR |
| | Auxiliary Test | |
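
For scripted use, part of this classification can be held in a small lookup structure. The following is a minimal sketch, not a GDC API: the names `FORMATS_BY_SUBTYPE`, `formats_for`, and `subtypes_with_format` are hypothetical, and only subtypes whose format is stated explicitly in the table are included.

```python
# Hypothetical lookup built from a subset of the table above.
# Only subtypes with an explicitly listed format are included;
# this mapping is illustrative and is not part of any GDC tool.
FORMATS_BY_SUBTYPE = {
    "Aligned Reads": ["BAM"],
    "Unaligned Reads": ["FASTQ"],
    "Coverage WIG": ["WIGGLE"],
    "Genotypes": ["TSV"],
    "Simple Germline Variation": ["MAF", "VCF"],
    "Structural Germline Variation": ["VCF", "FASTA"],
    "Structural Variation": ["VCF", "FASTA"],
    "Bisulfite Sequence Alignment": ["BAM"],
    "Methylation Beta Value": ["TSV"],
    "Clinical Data": ["XML"],
    "Tissue Slide Image": ["SVS"],
    "Microsatellite Instability": ["FSA"],
    "ABI Sequence Trace": ["TR"],
}


def formats_for(data_subtype):
    """Return the formats listed for a subtype, or [] if it is not in this sketch."""
    return FORMATS_BY_SUBTYPE.get(data_subtype, [])


def subtypes_with_format(fmt):
    """Return the subtypes in this sketch that list the given format, sorted by name."""
    return sorted(
        subtype
        for subtype, formats in FORMATS_BY_SUBTYPE.items()
        if fmt in formats
    )


if __name__ == "__main__":
    print(formats_for("Simple Germline Variation"))
    print(subtypes_with_format("BAM"))
```

Keying on the subtype alone works here because each subtype name appears only once in the table; a key that also includes the data type would be needed if that stopped being true.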