Clinical Proteomic Tumor Analysis Consortium (CPTAC)

The National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium (CPTAC) is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics.

Program Description

CPTAC is a comprehensive and coordinated effort to accelerate the understanding of the molecular basis of cancer through the application of robust, quantitative proteomic technologies and workflows. The overarching goal of CPTAC is to improve our ability to diagnose, treat, and prevent cancer. To achieve this goal in a scientifically rigorous manner, the NCI launched CPTAC to systematically identify proteins that derive from alterations in cancer genomes and related biological processes, and to make these data, with accompanying assays and protocols, available to the public.

CPTAC has provided the Genomic Data Commons (GDC) with genomic data from more than 1,500 cancer patients with diverse disease types, including Endometrial, Renal, Lung Adenocarcinoma and Squamous Cell Carcinoma, Breast, Colon, Ovarian, Brain, Head and Neck, and Pancreatic cancers. The GDC harmonized DNA sequences from CPTAC whole genome sequencing (WGS) and whole exome sequencing (WXS), as well as RNA sequences, against the GRCh38 reference genome using the GDC DNA-Seq Analysis Pipelines and mRNA Analysis Pipelines, respectively. The harmonized CPTAC genomic data are available in the GDC Data Portal. CPTAC proteomic data processed through the CPTAC Common Data Analysis Pipeline (CDAP) are available in the CPTAC Data Portal and in the Proteomic Data Commons (PDC).

Data Access

The CPTAC genomic data can be found on the GDC Data Portal. To request access to protected CPTAC data, please apply to dbGaP for access to the CPTAC 3 Study (study accession phs001287 – endometrial, lung, kidney, brain, head and neck, and pancreatic cancers) or the CPTAC 2 Study (study accession phs000892 – ovarian, breast and colon cancers).

Browse the CPTAC Genomic Data in the GDC »

Browse the CPTAC Proteomic Data in CPTAC »

Browse the CPTAC Proteomic Data in the PDC »

Apply to dbGaP »
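
The pairing of cancer types with dbGaP studies described above can be captured in a small lookup, which may help when deciding which study to apply to. This is a minimal sketch: the dictionary and function names are hypothetical, and the cancer-type keys simply mirror the wording used in the access paragraph.

```python
# Hypothetical lookup built from the study accessions listed in the text.
DBGAP_STUDIES = {
    "phs001287": {  # CPTAC 3 Study
        "endometrial", "lung", "kidney", "brain", "head and neck", "pancreatic",
    },
    "phs000892": {  # CPTAC 2 Study
        "ovarian", "breast", "colon",
    },
}


def dbgap_accession_for(cancer_type: str) -> str:
    """Return the dbGaP study accession that covers the given cancer type."""
    key = cancer_type.strip().lower()
    for accession, cancer_types in DBGAP_STUDIES.items():
        if key in cancer_types:
            return accession
    raise KeyError(f"No CPTAC dbGaP study listed for {cancer_type!r}")
```

For example, `dbgap_accession_for("Breast")` resolves to the CPTAC 2 Study accession, while a cancer type not named in the text raises `KeyError` rather than guessing.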

Cancer Types

Disease Type	Primary Site
Adenomas and Adenocarcinomas	Breast, Bronchus and lung, Colon, Kidney, Rectum, Uterus, NOS
Blood Derived Normal	Breast
Cystic, Mucinous and Serous Neoplasms	Breast, Other and unspecified female genital organs, Ovary, Retroperitoneum and peritoneum
Ductal and Lobular Neoplasms	Breast, Pancreas
Gliomas	Brain
Solid Tissue Normal	Brain, Pancreas, Breast
Squamous Cell Neoplasms	Breast, Bronchus and lung, Other and ill-defined sites

Associated Proteomic Data

Endometrial Cancer (Proteomic Data)
Renal Cancer (Proteomic Data)
Lung Adenocarcinoma (Proteomic Data)
Lung Squamous Cell Carcinoma (Proteomic Data)
Breast Cancer (Proteomic Data)
Colon Cancer (Proteomic Data)
Ovarian Cancer (Proteomic Data)
Brain Cancer (Proteomic Data)
Head and Neck Cancer (Proteomic Data)
Pancreatic Cancer (Proteomic Data)

Data Types

Data Type	Data Format	Data Access Level
Clinical and Biospecimen	TSV, JSON	Open
WGS Aligned Reads	BAM	Controlled
WXS Aligned Reads	BAM	Controlled
WXS Raw Simple Somatic Mutations	VCF	Controlled
WXS Annotated Somatic Mutations	VCF, MAF	Controlled
WXS Aggregated Somatic Mutations	MAF	Controlled
WXS Masked Somatic Mutations	MAF	Open
Targeted Sequencing Aligned Reads	BAM	Controlled
Targeted Sequencing Raw Simple Somatic Mutation	VCF	Controlled
RNA-Seq Aligned Reads	BAM	Controlled
Gene Expression Quantification	TXT	Open
Splice Junction Quantification	TSV	Controlled
Transcript Fusion	TSV, BEDPE	Controlled
miRNA-Seq Aligned Reads	BAM	Controlled
miRNA Expression Quantification	TSV	Open
Isoform Expression Quantification	TSV	Open
Single Cell Analysis	TSV, HDF5	Open
Methylation Arrays	IDAT, TXT	Open
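
The access levels in the table above determine which files can be retrieved without dbGaP authorization. The following is a minimal sketch of how a request for open-access CPTAC files might be built against the GDC API. It assumes the public `https://api.gdc.cancer.gov/files` endpoint, a project identifier such as `CPTAC-3`, and the field names `cases.project.project_id`, `data_type`, and `access`; none of these appear in the text above, so they should be checked against the GDC API documentation. The function names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; not stated in the text above.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def build_open_files_query(project_id, data_type, size=10):
    """Build query parameters selecting open-access files of one data type."""
    filters = {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": "cases.project.project_id",
                                     "value": [project_id]}},
            {"op": "in", "content": {"field": "data_type",
                                     "value": [data_type]}},
            {"op": "in", "content": {"field": "access",
                                     "value": ["open"]}},
        ],
    }
    return {
        "filters": json.dumps(filters),
        "fields": "file_id,file_name,data_type,access",
        "format": "JSON",
        "size": str(size),
    }


def fetch_open_files(project_id, data_type, size=10):
    """Send the query and return matching file records (requires network)."""
    params = build_open_files_query(project_id, data_type, size)
    url = GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    return payload["data"]["hits"]
```

Under these assumptions, a call such as `fetch_open_files("CPTAC-3", "Gene Expression Quantification")` would request open-access expression files. Controlled data types such as aligned reads would additionally require an authorization token obtained after dbGaP approval, which this sketch does not handle.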
