Data Submission Processes and Tools

GDC data submission is supported by a user-friendly web-based tool and a programmatic interface for submitting clinical and biospecimen data, as well as experiment metadata. Large, high-volume experiment files can be uploaded using a high-performance client-based tool.

Data Submission Process

Organizations interested in submitting data to the GDC should first review the conditions for requesting data submission and then submit a request for GDC data submission via the GDC Data Submission Request Form. The GDC reviews all data submission requests and notifies the data submitter as to whether the study is approved for data submission into the GDC.

Once approved for data submission into the GDC, the data submitter works with the NCI Genomic Program Administrator (GPA) to register the study and subjects in dbGaP. Study registration includes working with the NCI GPA to have the GDC listed as a Trusted Partner, providing the institutional certification, and adding approved data submitters to the dbGaP study registration. Approved data submitters should be limited to users who upload data and metadata into the GDC. Upon completion of study registration, the data submitter submits the Subject IDs associated with the study to dbGaP.

Once the study has been registered in dbGaP, the GDC will set up the project in the GDC. Provided the Subject IDs are registered in dbGaP and data submitters have dbGaP submitter privileges, data can then be uploaded and validated within the GDC. After validation, data is submitted to the GDC for processing. Once the data is submitted, the GDC will process applicable data sets (see Data Harmonization for additional details). After data processing has been completed, the user can release their data to the GDC; release must occur within six (6) months after GDC data processing, per GDC Data Submission Policies. Data is then made available through GDC Data Access Tools as open or controlled access, per the dbGaP authorization policies associated with the data set.
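The ordering of these steps, and the two preconditions for upload, can be sketched as a small state model. This is only an illustration of the process described above, not GDC software: the names Stage, Project, can_upload, and advance are hypothetical and do not correspond to any GDC tool or API.

```python
from dataclasses import dataclass
from enum import IntEnum


class Stage(IntEnum):
    """Submission stages in the order the text describes them (illustrative names)."""
    REQUEST_APPROVED = 1
    STUDY_REGISTERED_IN_DBGAP = 2
    PROJECT_SET_UP = 3
    UPLOADED_AND_VALIDATED = 4
    SUBMITTED = 5
    PROCESSED = 6
    RELEASED = 7


@dataclass
class Project:
    stage: Stage
    subject_ids_registered: bool = False
    submitter_has_dbgap_privileges: bool = False


def can_upload(project: Project) -> bool:
    """Upload needs a set-up project, registered Subject IDs, and dbGaP submitter privileges."""
    return (
        project.stage >= Stage.PROJECT_SET_UP
        and project.subject_ids_registered
        and project.submitter_has_dbgap_privileges
    )


def advance(project: Project) -> Stage:
    """Move the project to the next stage; stages cannot be skipped."""
    if project.stage is Stage.RELEASED:
        raise ValueError("project is already released")
    if project.stage is Stage.PROJECT_SET_UP and not can_upload(project):
        raise ValueError(
            "Subject IDs and dbGaP submitter privileges are required before upload"
        )
    project.stage = Stage(project.stage + 1)
    return project.stage
```

Modeling the stages as an ordered enumeration keeps the sequence explicit: a project that has been set up but lacks registered Subject IDs or submitter privileges is refused at the upload step instead of failing later.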

Data Submission Processes and Tools  

The GDC provides web-based, client-based, and programmatic tools to guide users through the data submission process. Data submitters can use the web-based GDC Data Submission Portal for submitting small volumes of data and metadata, and the client-based GDC Data Transfer Tool for submitting large, high-volume experiment data. A GDC Application Programming Interface (API) is available to large organizations that submit data programmatically through GDC submission pipelines.
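As a rough illustration of programmatic submission, the sketch below builds (but does not send) an HTTP request carrying metadata as JSON. Everything specific here is an assumption that this page does not state and that should be checked against the GDC API documentation: the base URL, the program/project path layout, the _dry_run suffix, the X-Auth-Token header, and the shape of the example entity. The function name build_submission_request and the program, project, and submitter values are hypothetical.

```python
import json
import urllib.request

# Assumption: base URL and path layout are not given in this document.
GDC_SUBMISSION_BASE = "https://api.gdc.cancer.gov/v0/submission"


def build_submission_request(program, project, entities, token, dry_run=True):
    """Build a PUT request carrying metadata entities as JSON; nothing is sent."""
    url = f"{GDC_SUBMISSION_BASE}/{program}/{project}"
    if dry_run:
        # Assumption: a dry-run path validates the payload without committing it.
        url += "/_dry_run"
    body = json.dumps(entities).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumption: the authentication token is passed in this header.
            "X-Auth-Token": token,
        },
    )


# Hypothetical program, project, and entity values, for illustration only.
request = build_submission_request(
    "EXAMPLE_PROGRAM",
    "EXAMPLE_PROJECT",
    [{"type": "case", "submitter_id": "EXAMPLE-CASE-1"}],
    token="not-a-real-token",
)
```

Building the request separately from sending it makes the payload and target easy to inspect first; a pipeline would pass the object to urllib.request.urlopen only once the details have been confirmed.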

Data Submission Tool Comparison

Data Submission Tools          GDC Data Submission Portal   GDC Data Transfer Tool   GDC API
                               (Web-Based)                  (Client-Based)           (Programmatic)
Requires dbGaP Authorization   Yes                          Yes                      Yes
Submit Clinical Data           Yes                          -                        Yes
Submit Biospecimen Data        Yes                          -                        Yes
Submit Experiment Metadata     Yes                          -                        Yes
Submit Experiment Files        -                            Yes                      -
Upload Small Volumes of Data   Yes                          Yes                      Yes
Upload Large Volumes of Data   -                            Yes                      -
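
The comparison above can also be expressed as a small lookup for choosing a tool by task. This is a minimal sketch that encodes only the checkmarks shown in the comparison; the names TOOLS, CAPABILITIES, and tools_for are hypothetical, and a missing checkmark is treated here as "not supported", which is an assumption about how to read the comparison.

```python
PORTAL = "GDC Data Submission Portal"
TRANSFER_TOOL = "GDC Data Transfer Tool"
API = "GDC API"

# Tools in the column order used by the comparison.
TOOLS = (PORTAL, TRANSFER_TOOL, API)

# Task -> tools marked with a checkmark in the comparison.
CAPABILITIES = {
    "submit clinical data": {PORTAL, API},
    "submit biospecimen data": {PORTAL, API},
    "submit experiment metadata": {PORTAL, API},
    "submit experiment files": {TRANSFER_TOOL},
    "upload small volumes of data": {PORTAL, TRANSFER_TOOL, API},
    "upload large volumes of data": {TRANSFER_TOOL},
}


def tools_for(*tasks):
    """Return the tools that support every requested task, in column order."""
    supported = set(TOOLS)
    for task in tasks:
        # Raises KeyError for a task that is not a row of the comparison.
        supported &= CAPABILITIES[task.lower()]
    return [tool for tool in TOOLS if tool in supported]
```

For example, tools_for("Submit Clinical Data", "Submit Experiment Files") returns an empty list, which reflects that, by these checkmarks, a submission with both metadata and experiment files uses more than one tool.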