Data in the GDC can be accessed through the user‑friendly web‑based GDC Data Portal, which enables browsing, querying and downloading of data and metadata. In addition, the GDC provides a command-line tool for downloading large volumes of data, and an application programming interface (API) for programmatic access to GDC functionality.
Open and Controlled Access Data
The NIH promotes broad and responsible sharing of genomic research data and respects the privacy and intentions of research participants.
Some data in the GDC is open access, which means that no authentication or authorization is necessary to access it. Other data is controlled access, which means that dbGaP authorization and eRA Commons authentication are necessary for access. Whether a dataset is open or controlled is determined according to Data Access Policies in a process that is driven by informed consent of research participants.
Open access data generally includes high-level genomic data that is not individually identifiable, as well as most clinical and all biospecimen data elements.
Controlled data generally includes individually identifiable data such as low-level genomic sequencing data, germline variants, SNP6 genotype data, and certain clinical data elements. Access to controlled data is granted by program-specific Data Access Committees. See Obtaining Access to Controlled Data for details.
Open Access Data: No login required for access
Controlled Access Data: Authorization required for access
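The open/controlled distinction is also visible when querying programmatically. The following is a minimal sketch, assuming the public GDC API `files` endpoint at `https://api.gdc.cancer.gov/files` and an `access` field on file records with the values `open` and `controlled`; the helper names `build_access_params` and `build_access_url` are hypothetical and only illustrate how such a filter could be assembled. Check the field names against the current GDC API documentation before relying on them.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint of the public GDC API; verify against the GDC API docs.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_access_params(access_level, size=10):
    """Build query parameters that filter file records by access level.

    access_level is assumed to be either "open" or "controlled".
    """
    if access_level not in ("open", "controlled"):
        raise ValueError("access_level must be 'open' or 'controlled'")
    # Filter expression: keep files whose "access" field matches the level.
    filters = {
        "op": "in",
        "content": {"field": "access", "value": [access_level]},
    }
    return {
        "filters": json.dumps(filters),
        "fields": "file_id,file_name,access",
        "format": "JSON",
        "size": str(size),
    }


def build_access_url(access_level, size=10):
    """Return a full query URL for the assumed files endpoint."""
    return GDC_FILES_ENDPOINT + "?" + urlencode(build_access_params(access_level, size))
```

The resulting URL can be fetched with any HTTP client. Listing file metadata in this way should not require a login; only downloading the controlled files themselves requires authorization.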
Data Access Process
The GDC Data Portal provides a web-based facility for users to browse, query, and download data. To download controlled access data, users must log in through eRA Commons and have access to the data through dbGaP. No login is required to access open access data. From the GDC Data Portal, users can query for data and add files to the cart for download. For low volumes of metadata and data, users can download directly from the GDC Data Portal.
For large volumes of data, users can download using the GDC Data Transfer Tool, a client-based tool designed for efficient data transfer. To download multiple files at once with the Data Transfer Tool, the user can create and download a manifest within the GDC Data Portal. To download controlled access data, the user can download an authentication token from the GDC Data Portal. A GDC Application Programming Interface (API) is also available to download data programmatically.
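The two download routes described above can be sketched as follows. This is an illustration under stated assumptions, not an official client: it assumes the GDC API serves files from `https://api.gdc.cancer.gov/data/<file UUID>`, that a token is passed in an `X-Auth-Token` header, and that the Data Transfer Tool is invoked as `gdc-client download` with `-m` for the manifest and `-t` for the token file. The helper names and the file paths are hypothetical.

```python
import urllib.request

# Assumed data endpoint of the public GDC API; verify against the GDC API docs.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def build_download_request(file_uuid, token=None):
    """Prepare (but do not send) an API download request for one file.

    For open access files no token is needed. For controlled access files,
    pass the contents of the token downloaded from the GDC Data Portal.
    """
    request = urllib.request.Request(f"{GDC_DATA_ENDPOINT}/{file_uuid}")
    if token:
        # Assumption: the API reads the token from the X-Auth-Token header.
        request.add_header("X-Auth-Token", token.strip())
    return request


def build_transfer_tool_command(manifest_path, token_path=None):
    """Build the argument list for a Data Transfer Tool manifest download.

    Assumption: `gdc-client` is installed and on PATH, and accepts
    -m <manifest> and -t <token file>.
    """
    command = ["gdc-client", "download", "-m", manifest_path]
    if token_path:
        command += ["-t", token_path]
    return command
```

A prepared request can be sent with `urllib.request.urlopen(request)`, and the command list can be passed to `subprocess.run(command, check=True)`. Building the request and the command separately from executing them keeps the token handling easy to inspect before any data is transferred.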
| Data Access Tools | | | |
|---|---|---|---|
| Search data using predetermined filters called “facets” | | | |
| Query data using smart search advanced query language | | | |
| Analyze advanced data visualizations | | | |
| Requires dbGaP account to browse & download Controlled Access Data | | | |
| Download SMALL volumes of data | | | |
| Download LARGE volumes of data | | | |