The following table and figure list entities and data submittable to the GDC. Links to the data dictionary and templates are also provided. Not all programs, projects, or cases will have data available for all types.
The following figure displays the relationship between the different submittable data model entities. Arrows point to the parent entity, which must be specified before the child entity. The complete GDC Data Model can be viewed here.
Biospecimen and clinical data submitted in XML format must be valid with respect to the latest Biospecimen Core Resource (BCR) XML Schema. XML submission of biospecimen and clinical data is supported only through the GDC API. Molecular sequence metadata submitted in XML format must be valid with respect to NCBI SRA XML Schema version 1.5.
Tab-separated value (TSV) files are typically submitted via the GDC Data Submission Portal user interface. These may be created in any text editor, or exported from MS Excel by using "Save As" from the File menu and selecting the format "Tab-delimited Text".
A TSV file contains data that correspond to a given entity defined in the GDC Data Dictionary. The file must contain a column for each required property for that entity; for example, see the Demographic entry. Each record in the TSV represents a submittable item described by the entity; for example, each line in a demographic TSV file contains metadata for a single case.
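As an alternative to exporting from a spreadsheet, a TSV file with one column per property and one record per line can be generated programmatically. The following is a minimal sketch using only the Python standard library. The column names in `COLUMNS` and the helper `rows_to_tsv` are hypothetical and chosen for illustration; the actual required properties for an entity should be taken from the GDC Data Dictionary.

```python
import csv
import io

# Hypothetical column list for illustration only; look up the real
# required properties for the entity in the GDC Data Dictionary.
COLUMNS = ["type", "submitter_id", "cases.submitter_id", "gender"]


def rows_to_tsv(rows, columns=COLUMNS):
    """Serialize a list of dicts to TSV text, one record per line.

    Raises ValueError if any record lacks a value for one of the columns.
    """
    for index, row in enumerate(rows):
        missing = [name for name in columns if not row.get(name)]
        if missing:
            raise ValueError(f"record {index} is missing: {', '.join(missing)}")

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        delimiter="\t",          # tab-separated values
        lineterminator="\n",
        extrasaction="raise",    # reject keys that are not declared columns
    )
    writer.writeheader()         # first line holds the property names
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    example = [
        {
            "type": "demographic",
            "submitter_id": "demo-001",
            "cases.submitter_id": "case-001",
            "gender": "female",
        }
    ]
    print(rows_to_tsv(example), end="")
```

Checking for missing values before writing catches incomplete records locally, before the file is uploaded.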
JSON data submitted as files to the GDC must have a structure that is valid with respect to the GDC Submission API specification. JSON files can be submitted via the GDC Data Submission Portal user interface.
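A JSON submission file can be assembled in a similar way. The sketch below assumes a record is an object carrying the entity type, a submitter-assigned identifier, and a reference to its parent case; this layout and the helper names `build_record` and `to_json_text` are assumptions for illustration, and the authoritative structure is defined by the GDC Submission API specification and the GDC Data Dictionary.

```python
import json


def build_record(entity_type, submitter_id, parent_case_id, **properties):
    """Build one record as a plain dict (assumed layout, for illustration)."""
    record = {
        "type": entity_type,
        "submitter_id": submitter_id,
        # Assumed form of the link to the parent entity.
        "cases": {"submitter_id": parent_case_id},
    }
    record.update(properties)
    return record


def to_json_text(records):
    """Serialize records to indented JSON text suitable for saving to a file."""
    return json.dumps(records, indent=2, sort_keys=True)


if __name__ == "__main__":
    records = [build_record("demographic", "demo-001", "case-001", gender="female")]
    print(to_json_text(records))
```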