Data Harmonization and Generation


The GDC draws on the expertise of collaborators to develop pipelines that support data harmonization, which includes the standardization of associated biospecimen and clinical data, the re-alignment of DNA and RNA sequence data against a common reference genome build, and the generation of derived data. GDC pipelines are implemented using data processing software and algorithms selected in consultation with the expert genomics community.

This section of the website describes the strategies the GDC employs to harmonize data, along with the software and algorithms used in the harmonization process.