General GDC

I only see patients aged 90 years or younger in the GDC. Why is this?

HIPAA guidelines require that patients older than 89 years be aggregated into a single age category, which limits the ability to positively identify these individuals. In practice, this affects the values reported in several fields. We have chosen to display the age at diagnosis accurately, but fields that give dates or time periods after this benchmark may be compressed. These fields may include "Days to last follow up", "Days to last known disease status", "Days to recurrence", "Days to death", and "Year of death".
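As a rough illustration of what aggregating ages above 89 into a single category can look like, here is a minimal Python sketch. The function name, the constant, and the choice to represent the category as the value 90 are assumptions made for this example; this is not the GDC's actual procedure, and it does not attempt to model how the follow-up fields are compressed.

```python
# Hypothetical illustration only; not the GDC's implementation.
# Assumption: the aggregated category is represented by the value 90.
AGE_CAP_YEARS = 90


def reported_age(age_years):
    """Return the age to display, folding every age above 89 into one category."""
    if age_years < 0:
        raise ValueError("age_years must be non-negative")
    # Ages of 89 or less pass through unchanged; anything higher becomes 90.
    return min(age_years, AGE_CAP_YEARS)


if __name__ == "__main__":
    for age in (45, 89, 90, 97):
        print(age, "->", reported_age(age))
```

Under this assumption, a 97-year-old and a 90-year-old are indistinguishable in the displayed data, which is the point of the aggregation.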

How is validation performed on genomic data (BAM files) submitted to the GDC?

Submitted BAM files are validated at the GDC for file integrity and format using md5sum checks, automated QC checks, and the Picard ValidateSamFile tool. Sequencing quality is assessed using FastQC, and additional quality metrics are gathered using tools such as Picard and Samtools. Severe issues, such as high cross-sample contamination, may prevent the data from being released, but minor issues typically do not result in rejection. Instead, the GDC exposes many of the quality metrics so that users can review them and apply further filtering.
