Access Data

How can I download BAM index files (BAI files) using the API?

Submitted by Anonymous on

BAI files are included with the download when using the GDC Data Transfer Tool to download BAM files. 

When using the API to download BAM files, BAI files are included only if the related_files=true query parameter is appended to the download request for the BAM file's UUID, for example:

https://api.gdc.cancer.gov/data/53f4ad60-0777-409c-a34d-ca4442dc9c44?related_files=true 
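The same request can be scripted. The following is a minimal Python sketch, not an official client: the function names and the output path are illustrative, and the X-Auth-Token header is assumed to be needed only for controlled-access files. The response may arrive as an archive bundling the BAM and BAI files, so the output file name should be chosen accordingly.

```python
import shutil
import urllib.request

# Data endpoint taken from the example URL above.
GDC_DATA_ENDPOINT = "https://api.gdc.cancer.gov/data"


def build_download_url(file_uuid, related_files=True):
    """Return the download URL for a file UUID.

    With related_files=True the query string asks the API to include
    related files (such as the BAI index) alongside the BAM file.
    """
    url = f"{GDC_DATA_ENDPOINT}/{file_uuid}"
    if related_files:
        url += "?related_files=true"
    return url


def download_with_index(file_uuid, out_path, token=None):
    """Stream the download to out_path (hypothetical helper).

    token is an optional GDC authentication token; it is assumed here
    that controlled-access files require it in the X-Auth-Token header.
    """
    request = urllib.request.Request(build_download_url(file_uuid))
    if token:
        request.add_header("X-Auth-Token", token)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, build_download_url("53f4ad60-0777-409c-a34d-ca4442dc9c44") reproduces the URL shown above.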

How can I access GDC sequencing data in FASTQ format?

Raw sequencing files submitted to the GDC are processed using GDC Genomic Data Alignment pipelines. The processed data are made available in the GDC Data Portal as BAM files containing aligned reads and unmapped reads (if available). No reads are hard-clipped, but reads that were flagged as "failed" during an Illumina sequencing run are discarded.

Where can I find the target and bait/probe files (BED files) that describe the capture kit used in an exome sequencing experiment?

Capture kit information is provided by the GDC API at the read group level, where available. In some cases, additional information may be available in SRA XML files.
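As a rough illustration of a read-group-level query, the Python sketch below builds a request URL for the files endpoint using its fields parameter. The function name, the field prefix analysis.metadata.read_groups, and the property name in the usage note are assumptions made for illustration; the actual property names should be taken from the GDC data dictionary.

```python
from urllib.parse import urlencode

GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_read_group_query(file_uuid, properties,
                           prefix="analysis.metadata.read_groups"):
    """Return a files-endpoint URL requesting read group properties.

    properties is a list of read_group property names supplied by the
    caller. prefix is the assumed path from a file to its read groups;
    adjust it if the API reports a different field path.
    """
    fields = ",".join(f"{prefix}.{name}" for name in properties)
    # Keep commas unescaped so the field list stays readable in the URL.
    query = urlencode({"fields": fields}, safe=",")
    return f"{GDC_FILES_ENDPOINT}/{file_uuid}?{query}"
```

Calling build_read_group_query(bam_uuid, ["some_property"]) with a placeholder property name yields a URL that can be fetched with any HTTP client; the JSON response would then carry the requested properties for each read group, where available.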

The relevant read_group properties returned by the GDC API are:
