Data Submission Processes
Organizations interested in submitting data to the GDC should first review the conditions for requesting data submission and then contact the GDC Help Desk: support@nci-gdc.datacommons.io. The GDC reviews all data submission requests and notifies the data submitter as to whether the study is approved for data submission into the GDC.
Once approved for data submission into the GDC, the data submitter works with the NCI Genomic Program Administrator (GPA) to register the study and subjects in dbGaP. Study registration includes working with the NCI GPA to have the GDC listed as a Trusted Partner, providing the institutional certification, and adding approved data submitters to the dbGaP study registration. Approved data submitters should be limited to users who upload data and metadata into the GDC. Upon completion of study registration, the data submitter will submit the Subject IDs associated with the study in dbGaP.
Once the study has been registered in dbGaP, the GDC will set up the project in the GDC. Provided the Subject IDs are registered in dbGaP and data submitters have dbGaP submitter privileges, data can then be uploaded and validated within the GDC. After validation, data is submitted to the GDC for processing. The GDC will process applicable data sets (see Data Harmonization for additional details). After data processing has been completed, the user can release their data to the GDC; release must occur within six (6) months of GDC data processing, per GDC Data Submission Policies. Data is then made available through GDC Data Access Tools as open or controlled access per the dbGaP authorization policies associated with the data set.
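The upload prerequisites and the release window described above can be sketched in Python. This is a minimal illustration, not GDC tooling: the function names are hypothetical, and counting six calendar months from the date processing completes is an assumption about how the window is measured. The GDC Data Submission Policies remain authoritative.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month
    (for example, 31 August + 6 months lands on the last day of February).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def can_upload(subject_ids, registered_ids, has_submitter_privileges: bool) -> bool:
    """Hypothetical pre-upload check: every Subject ID must be registered in
    dbGaP and the submitter must hold dbGaP submitter privileges."""
    return has_submitter_privileges and set(subject_ids) <= set(registered_ids)


def release_deadline(processing_completed: date) -> date:
    """Assumed release deadline: six calendar months after processing completes."""
    return add_months(processing_completed, 6)
```

A submitter might call `can_upload` before starting a transfer and `release_deadline` once the GDC reports that processing is complete, to plan the review of harmonized data.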
This process can be described in discrete steps as identified in the table below.
# | Data Submitter Step | Response |
---|---|---|
1 | Contact the GDC Help Desk: support@nci-gdc.datacommons.io | GDC pre-approves study according to GDC Guidelines and sends confirmation email. |
2 | Create eRA Commons account for any data submitters that do not already have one | eRA Commons accounts are created. |
3 | Contact NCI Office of Data Sharing to identify appropriate GPA and register study. Provide GPA with: | GPA inputs information into dbGaP system. |
4 | Study PI and PI assistant/submitter receive email invitations to the dbGaP Submission Portal that must be accepted within 7 days. (Save email for future reference.) Submit to dbGaP Submission Portal to begin processing study: Note: Other files required in dbGaP submission package may be blank. | dbGaP processes information and produces PHS Accession Number. (This may take 4-6 weeks.) |
5 | Contact the GDC Help Desk to create a project for submission | GDC creates project within GDC submission tools |
6 | Upload, validate, and submit data to GDC for harmonization | GDC harmonizes data |
7 | Review and release harmonized data | GDC releases data |
Within dbGaP system (NLM/NCBI)
Within GDC (NCI)
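For submitters who track progress programmatically, the seven steps in the table can be held as an ordered checklist. The sketch below is illustrative only: `Step`, `STEPS`, and `next_step` are hypothetical names, the step text is abbreviated from the table, and the code assumes the steps are completed strictly in numbered order.

```python
from typing import NamedTuple, Optional, Set


class Step(NamedTuple):
    number: int
    action: str    # what the data submitter does
    response: str  # what happens in response


# Abbreviated from the table above.
STEPS = [
    Step(1, "Contact the GDC Help Desk", "GDC pre-approves study"),
    Step(2, "Create eRA Commons accounts", "eRA Commons accounts are created"),
    Step(3, "Register study with the GPA", "GPA inputs information into dbGaP"),
    Step(4, "Submit to dbGaP Submission Portal", "dbGaP produces PHS Accession Number"),
    Step(5, "Ask the GDC Help Desk to create a project", "GDC creates project"),
    Step(6, "Upload, validate, and submit data", "GDC harmonizes data"),
    Step(7, "Review and release harmonized data", "GDC releases data"),
]


def next_step(completed: Set[int]) -> Optional[Step]:
    """Return the first step whose number is not in `completed`, or None."""
    for step in STEPS:
        if step.number not in completed:
            return step
    return None
```

Calling `next_step({1, 2})` returns the study registration step, and an empty result (`None`) indicates that all seven steps are done.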
This process can further be illustrated in the diagram below.
Data Submission Tools
The GDC provides web-based, client-based, and programmatic tools to guide users through the data submission process. Data submitters can use the web-based GDC Data Submission Portal to submit small volumes of data and metadata, and the client-based GDC Data Transfer Tool to submit large, high-volume experiment data. A GDC Application Programming Interface (API) is also available so that large organizations can submit data programmatically through GDC submission pipelines.
Data Submission Tools | | | |
---|---|---|---|
Requires dbGaP Authorization | | | |
Submit Clinical Data | | | |
Submit Biospecimen Data | | | |
Submit Experiment Metadata | | | |
Submit Experiment Files | | | |
Upload Small Volumes of Data | | | |
Upload Large Volumes of Data | | | |
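As an illustration of programmatic submission, the sketch below builds (but does not send) a metadata submission request using only the Python standard library. It rests on several assumptions that should be checked against the current GDC API documentation: the base URL `https://api.gdc.cancer.gov`, a `/submission/<program>/<project>` endpoint with a `_dry_run` suffix for validation-only requests, and an `X-Auth-Token` header carrying the submitter's token. The helper name, the program and project codes, and the record contents are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; confirm against the GDC API documentation.
GDC_API = "https://api.gdc.cancer.gov"


def build_submission_request(program: str, project: str, records: list,
                             token: str, dry_run: bool = True) -> urllib.request.Request:
    """Build a PUT request for a metadata submission without sending it.

    With dry_run=True the URL targets the assumed validation-only endpoint,
    so nothing would be committed to the project.
    """
    url = f"{GDC_API}/submission/{program}/{project}"
    if dry_run:
        url += "/_dry_run"
    body = json.dumps(records).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


# Hypothetical program, project, and record, for illustration only.
request = build_submission_request(
    "PROGRAM", "PROJECT",
    [{"type": "case", "submitter_id": "example-case-1"}],
    token="PLACEHOLDER-TOKEN",
)
```

Passing the resulting object to `urllib.request.urlopen` would send it. Building the request separately keeps validation (`dry_run=True`) and the actual submission (`dry_run=False`) on the same code path, mirroring the upload, validate, and submit sequence described earlier.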