The following are some helpful resources for general information about cancer:
Helpful Cancer Genomics resource:
The GDC provides several resources for querying and downloading data: the GDC Data Portal for querying and downloading data files, the GDC Data Transfer Tool for downloading large volumes of files, and the GDC Application Programming Interface (API) for programmatic queries and downloads.
Generally, browsing indexed GDC metadata (such as information about the cases and files contained in the GDC Data Portal) does not require a login.
eRA Commons authentication and dbGaP authorization are required before accessing controlled data, which generally includes individually identifiable information such as low-level genomic sequencing data and germline variants.
Controlled-access data users log in to the GDC using their eRA Commons accounts. The GDC then verifies that the user has authorization in dbGaP to access specific controlled datasets.
See Obtaining Access to GDC Data and Resources for more information on data download, and Obtaining Access to Submit Data for information on data submission.
The GDC provides helpdesk support for data submission and other issues. For information on the GDC helpdesk, please visit GDC Support.
The GDC provides code examples in the GDC Application Programming Interface (API) User's Guide.
GDC authentication tokens remain valid for 30 days.
The GDC employs a hierarchical data model which requires metadata and files to be attached only at particular nodes or points in the hierarchy. If you have questions, please review the GDC Data Model or contact GDC Support.
BAI files are included with the download when using the GDC Data Transfer Tool to download BAM files.
When using the API to download BAM files, BAI files are included only if the related_files=true parameter is specified together with the BAM UUID, for example:
https://api.gdc.cancer.gov/data/53f4ad60-0777-409c-a34d-ca4442dc9c44?related_files=true
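The request above can also be assembled programmatically. The following is a minimal Python sketch, not an official GDC client: the helper name bam_download_url and the include_index flag are hypothetical, and the code only builds the URL without contacting the API. Controlled-access files would additionally require an authentication token, which is not shown here.

```python
from urllib.parse import urlencode

# Base URL taken from the example above.
GDC_API = "https://api.gdc.cancer.gov"


def bam_download_url(bam_uuid, include_index=True):
    """Build a data-endpoint URL for a BAM file (hypothetical helper).

    When include_index is True, related_files=true is appended so that
    the BAI file is delivered together with the BAM file.
    """
    url = f"{GDC_API}/data/{bam_uuid}"
    if include_index:
        url += "?" + urlencode({"related_files": "true"})
    return url


print(bam_download_url("53f4ad60-0777-409c-a34d-ca4442dc9c44"))
```

Passing include_index=False yields the plain data-endpoint URL, which downloads the BAM file without its index.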
Alternatively, users can determine the BAI file UUID by querying the API files endpoint with the BAM UUID and expanding index_files. The BAI file UUID can then be used to download the BAI file from the data endpoint, for example:
https://api.gdc.cancer.gov/files/53f4ad60-0777-409c-a34d-ca4442dc9c44?pretty=true&expand=index_files
https://api.gdc.cancer.gov/data/60cefd89-b428-46b7-b5b0-3b6e2743ab20
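The two-step lookup above can be sketched in Python as follows. This is an illustration under stated assumptions rather than an official client: the helper names are hypothetical, the response is assumed to list index files under data.index_files with a file_id field, and sample_response is a hand-written stand-in for the JSON returned by the files endpoint, not real API output.

```python
from urllib.parse import urlencode

# Base URL taken from the examples above.
GDC_API = "https://api.gdc.cancer.gov"


def index_lookup_url(bam_uuid):
    """Build the files-endpoint URL that expands index_files for a BAM file."""
    query = urlencode({"pretty": "true", "expand": "index_files"})
    return f"{GDC_API}/files/{bam_uuid}?{query}"


def bai_download_urls(files_response):
    """Return data-endpoint URLs for every index file in a parsed response.

    Assumes the response shape {"data": {"index_files": [{"file_id": ...}]}};
    returns an empty list when no index files are present.
    """
    index_files = files_response.get("data", {}).get("index_files") or []
    return [f"{GDC_API}/data/{entry['file_id']}" for entry in index_files]


# Hand-written stand-in for a parsed files-endpoint response (assumption).
sample_response = {
    "data": {
        "file_id": "53f4ad60-0777-409c-a34d-ca4442dc9c44",
        "index_files": [{"file_id": "60cefd89-b428-46b7-b5b0-3b6e2743ab20"}],
    }
}

print(index_lookup_url("53f4ad60-0777-409c-a34d-ca4442dc9c44"))
print(bai_download_urls(sample_response))
```

In practice, the JSON body would be fetched from the URL returned by index_lookup_url with any HTTP client and parsed before being passed to bai_download_urls.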
Note: BAI files are not available for sliced BAM files.