GDC Data Model

The GDC Data Model is the central method for organizing all data artifacts ingested by the GDC.

The data model is designed to maintain data and metadata consistency, integrity, and availability while accommodating:

Biospecimen, clinical, and cancer genomic data and metadata
Multiple, disparate ongoing NCI projects
Completely new projects that have not yet been conceived
Ongoing changes and technological progress
Frequent and complex queries from both external users and internal administrators

To meet these requirements, the design and implementation of the data model leverage the GDC Data Model components, which are further described in subsequent chapters.

(Created on: February 24, 2017 • Last updated on: February 24, 2017)