
Data Analysis

TCGAbiolinks

TCGAbiolinks was developed as an R/Bioconductor package to address challenges with data mining and analysis of cancer genomics data stored at the GDC. It offers a guided workflow that allows users to query, download, and perform integrative analyses of GDC data. The pipeline combines methods from computer science and statistics and incorporates methodologies developed in previous TCGA marker studies. A graphical user interface (GUI) version of TCGAbiolinks is also provided and can run on a user's local machine.

GenomicDataCommons R-Package

The National Cancer Institute (NCI) Genomic Data Commons provides the cancer research community with an open and unified repository for sharing and accessing data across numerous cancer studies and projects via a high-performance data transfer and query infrastructure. The Bioconductor project is an open source and open development software project built on the R statistical programming environment. A major goal of the Bioconductor project is to facilitate the use, analysis, and comprehension of genomic data.

GDC RNASeq Tool

The GDC RNASeq Tool downloads individual RNASeq files from the GDC Data Portal and merges them into matrices identified by TCGA barcode.

The GDC RNASeq Tool:

  • Downloads RNA-Seq / miRNA-Seq data files using a GDC manifest file
  • Unzips the files into separate folders identified by experimental strategy and bioinformatics workflow
  • Merges the files into separate matrix files
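
The merge step can be illustrated with a minimal Python sketch. This is not the GDC RNASeq Tool's own code. It assumes each per-sample file is tab-separated with a gene identifier in the first column and a value in the second, and that lines starting with "#" are comments. The function names merge_counts and write_matrix are hypothetical, chosen only for this illustration.

```python
import csv


def merge_counts(samples):
    """Merge per-sample files into a gene-by-barcode mapping.

    `samples` maps a barcode string to an open text file (or any
    iterable of lines) holding tab-separated `gene_id<TAB>value` rows.
    This two-column layout is an assumption made for the sketch.
    """
    barcodes = sorted(samples)
    matrix = {}  # gene_id -> {barcode: value}
    for barcode in barcodes:
        for row in csv.reader(samples[barcode], delimiter="\t"):
            # Skip blank lines, comment lines, and rows without a value.
            if len(row) < 2 or row[0].startswith("#"):
                continue
            matrix.setdefault(row[0], {})[barcode] = row[1]
    return barcodes, matrix


def write_matrix(barcodes, matrix, out, missing="NA"):
    """Write one row per gene and one column per barcode to `out`."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["gene_id"] + barcodes)
    for gene_id in sorted(matrix):
        values = [matrix[gene_id].get(b, missing) for b in barcodes]
        writer.writerow([gene_id] + values)
```

Genes absent from a sample are written as "NA" rather than zero, so that a missing measurement is not mistaken for a measured count of zero.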