The Genomic Data Commons (GDC) is a research program of the National Cancer Institute (NCI). The mission of the GDC is to provide the cancer research community with a unified repository and cancer knowledge base that enables data sharing across cancer genomic studies in support of precision medicine.
The National Cancer Institute, part of the National Institutes of Health (NIH), is the federal government's principal agency for cancer research and training. NCI’s mission is to lead, conduct, and support cancer research across the nation to advance scientific knowledge and help all people to live longer, healthier lives. NCI’s scope of work spans a broad spectrum of cancer research across a variety of disciplines and supports research training opportunities at career stages across the academic continuum.
GDC Promoting Precision Medicine in Oncology
The National Cancer Institute’s (NCI’s) Genomic Data Commons (GDC) is a data sharing platform that promotes precision medicine in oncology. It is not just a database or a tool; it is an expandable knowledge network supporting the import and standardization of genomic and clinical data from cancer research programs.
The GDC contains NCI-generated data from some of the largest and most comprehensive cancer genomic datasets, including The Cancer Genome Atlas (TCGA) and Therapeutically Applicable Research to Generate Effective Therapies (TARGET). For the first time, these datasets have been processed using a common set of bioinformatics pipelines, so that the data can be directly compared.
As a growing knowledge system for cancer, the GDC also enables researchers to submit data. The GDC processes submitted data using bioinformatics pipelines that align the data to a common reference genome and generate higher-level data such as variant calls and expression quantifications. As more researchers add clinical and genomic data to the GDC, it will become an even more powerful tool for making discoveries about the molecular basis of cancer that may lead to better care for patients.
The GDC Provides the Research Community with the Following Benefits
- Access to high-quality standardized biospecimen, clinical, and molecular data
- Web-based tools supporting fine-grained queries, advanced visualization, smart search technologies, and personalized download facilities
- Bioinformatics pipelines supporting DNA and RNA sequence alignment against a common reference genome
- Programmatic interfaces supporting data retrieval, download, and submission by third-party applications
- Resources supporting the high-performance retrieval, download, and submission of GDC data
- Data submission tools for validating and submitting data into GDC
- Data generation pipelines that produce higher-level data, including DNA sequence variants, mutation analyses, SNP chip genotypes, and expression analyses
- Interfaces to eRA Commons and dbGaP for secure access to controlled datasets
Does the GDC Data Transfer Tool use random or sequential read/write? Does the choice of protocol make a difference?
The GDC Data Transfer Tool uses sequential read/write for each file segment that is being transferred. By default, the tool executes multipart transfers, so several file segments are read or written in parallel, each one sequentially. To turn off multipart transfers, users can set the number of processes to 1.
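To make the access pattern concrete, here is a minimal Python sketch of a multipart copy between two local files. It is an illustration under stated assumptions, not the GDC Data Transfer Tool's actual implementation: the function names (`plan_segments`, `copy_segment`, `multipart_copy`) and the `n_workers` parameter are hypothetical, and threads stand in for the tool's processes to keep the example short. Each worker seeks once to the start of its segment and then reads and writes that segment sequentially, so the file as a whole sees several sequential streams running in parallel.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def plan_segments(file_size, n_segments):
    """Split [0, file_size) into contiguous (offset, length) ranges."""
    if file_size == 0:
        return []
    n_segments = max(1, min(n_segments, file_size))
    base, extra = divmod(file_size, n_segments)
    segments = []
    offset = 0
    for i in range(n_segments):
        # Spread the remainder over the first `extra` segments.
        length = base + (1 if i < extra else 0)
        segments.append((offset, length))
        offset += length
    return segments


def copy_segment(src_path, dst_path, offset, length, chunk_size=64 * 1024):
    """Copy one segment: seek once, then read and write sequentially."""
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        dst.seek(offset)
        remaining = length
        while remaining > 0:
            chunk = src.read(min(chunk_size, remaining))
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)


def multipart_copy(src_path, dst_path, n_workers=4):
    """Copy src to dst using up to n_workers parallel segment copies."""
    size = os.path.getsize(src_path)
    # Pre-size the destination so every worker can seek to its own offset.
    with open(dst_path, "wb") as dst:
        dst.truncate(size)
    segments = plan_segments(size, n_workers)
    with ThreadPoolExecutor(max_workers=max(1, n_workers)) as pool:
        futures = [
            pool.submit(copy_segment, src_path, dst_path, off, length)
            for off, length in segments
        ]
        for future in futures:
            future.result()  # re-raise any error from a worker
```

Calling `multipart_copy(src, dst, n_workers=1)` in this sketch yields a single segment that covers the whole file, which mirrors the effect of setting the number of processes to 1: one purely sequential pass with no parallel streams.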