The GDC Data Portal is a groundbreaking tool that enables a better understanding of cancer biology by allowing researchers to:
The GDC provides web-based tools and API endpoints for searching, viewing and downloading data as well as client tools for downloading large volumes of data.
Any user requesting access to GDC controlled data must apply for access to the data through the database of Genotypes and Phenotypes (dbGaP).
The GDC obtains datasets from NCI programs which maintain tissue collection strategies that couple quantity with quality. Data validation is performed on all data submitted to the GDC.
BAI files are included with the download when using the GDC Data Transfer Tool to download BAM files.
When downloading BAM files through the API, the BAI file is included only if the related_files=true
parameter is added to the request for the BAM UUID, for example:
https://api.gdc.cancer.gov/data/53f4ad60-0777-409c-a34d-ca4442dc9c44?related_files=true
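The request above can also be built in a script. The following is a minimal Python sketch that uses only the standard library. The helper names `bam_download_url` and `download` are hypothetical and not part of any GDC client. The `X-Auth-Token` header for controlled-access files is an assumption here and should be checked against the GDC API documentation.

```python
import shutil
from urllib.parse import urlencode
from urllib.request import Request, urlopen

GDC_API = "https://api.gdc.cancer.gov"


def bam_download_url(bam_uuid, include_index=True):
    """Build the data endpoint URL for a BAM file (hypothetical helper).

    With include_index=True the related_files=true parameter is appended,
    which asks the API to include the BAI file with the BAM.
    """
    url = f"{GDC_API}/data/{bam_uuid}"
    if include_index:
        url += "?" + urlencode({"related_files": "true"})
    return url


def download(url, dest_path, token=None):
    """Stream the response body to dest_path (hypothetical helper).

    Assumption: controlled-access files need a GDC token, sent here in an
    X-Auth-Token header. Open-access files need no token.
    """
    headers = {"X-Auth-Token": token} if token else {}
    with urlopen(Request(url, headers=headers)) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)


# Building the URL needs no network access.
print(bam_download_url("53f4ad60-0777-409c-a34d-ca4442dc9c44"))
```

Calling `download(bam_download_url(uuid), "bam_with_index.download", token)` would then fetch the files. The destination file name is the caller's choice.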
Alternatively, users can look up the BAI file UUID by querying the API files
endpoint with the BAM UUID and expand=index_files. The BAI file UUID returned can then be used to download
the BAI file from the data endpoint:
https://api.gdc.cancer.gov/files/53f4ad60-0777-409c-a34d-ca4442dc9c44?pretty=true&expand=index_files
https://api.gdc.cancer.gov/data/60cefd89-b428-46b7-b5b0-3b6e2743ab20
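The two-step lookup can be sketched in Python as follows. The response layout is an assumption: the index entries are taken to sit under `data.index_files`, each with a `file_id` field. The `sample` string is a hand-written illustration of that layout, not real API output, and all function names are hypothetical.

```python
import json
from urllib.parse import urlencode

GDC_API = "https://api.gdc.cancer.gov"


def index_lookup_url(bam_uuid):
    """URL for the files endpoint, expanded to include index file records."""
    query = urlencode({"pretty": "true", "expand": "index_files"})
    return f"{GDC_API}/files/{bam_uuid}?{query}"


def index_file_uuids(response_text):
    """Extract BAI UUIDs from a files endpoint response.

    Assumption: index entries are listed under data.index_files and each
    entry has a file_id field. Returns an empty list if none are present.
    """
    payload = json.loads(response_text)
    index_files = payload.get("data", {}).get("index_files", [])
    return [entry["file_id"] for entry in index_files if "file_id" in entry]


def data_url(file_uuid):
    """URL for downloading a single file from the data endpoint."""
    return f"{GDC_API}/data/{file_uuid}"


# Illustrative response body only; a real response carries many more fields.
sample = '{"data": {"index_files": [{"file_id": "60cefd89-b428-46b7-b5b0-3b6e2743ab20"}]}}'

print(index_lookup_url("53f4ad60-0777-409c-a34d-ca4442dc9c44"))
for bai_uuid in index_file_uuids(sample):
    print(data_url(bai_uuid))
```

In practice the text passed to `index_file_uuids` would be the body returned by a request to `index_lookup_url(...)`.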
Note: BAI files are not available for sliced BAM files.