Count Me In (CMI)

Count Me In (CMI) is a nonprofit research initiative that empowers patients to accelerate cancer research by sharing their samples, their clinical information, and their voices. A key goal of CMI is to enable scientific discoveries and the development of new cancer treatment strategies by widely sharing clinical, genomic, molecular, and patient-reported data.

Program Description

Working together with patients, patient advocates, and caregivers, CMI has launched several patient-partnered cancer research projects across different cancer types. Because registration and consent take place online, these projects are open to cancer patients living anywhere in the United States or Canada. Count Me In has been stewarded by four leading organizations: Emerson Collective, a California-based social change organization; the Biden Cancer Initiative, an independent nonprofit organization building on the federal government’s Cancer Moonshot; the Broad Institute of MIT and Harvard; and the Dana-Farber Cancer Institute.

Count Me In projects generate datasets from patient-reported information, abstracted medical records, and sequencing and molecular analyses of biological samples that were acquired remotely. Patients enrolled in CMI projects are geographically dispersed and receive clinical care from different institutions across the United States and Canada. Data from CMI projects are shared regularly as they are generated, before publication. Researchers anywhere can obtain Count Me In data from repositories such as the Genomic Data Commons (GDC) to advance cancer research and expand the understanding of cancer.

All CMI projects are currently ongoing. As more patients enroll, more data will be generated and released at the GDC and other scientific data repositories. Please feel free to contact Count Me In for more information about the projects or datasets.

CMI has provided the GDC with genomic data from multiple projects, including projects in metastatic breast cancer, angiosarcoma, and metastatic prostate cancer. The GDC harmonized the DNA sequences from whole exome sequencing (WES) and the RNA sequences using its own pipelines. Both the harmonized and the unprocessed genomic data from CMI projects are available in the GDC Data Portal. Additional data for Count Me In projects, including expanded clinical information for patients and samples, can also be found on the cBioPortal for Cancer Genomics.

Data Overview

Genomic data from CMI projects can be found on the GDC Data Portal. To request access to protected data, please apply to dbGaP for access to the appropriate CMI study (study accessions as follows: phs001931, The Angiosarcoma Project; phs001709, The Metastatic Breast Cancer Project; phs001939, The Metastatic Prostate Cancer Project).
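As an illustration of how the file listings for one CMI study might be retrieved programmatically, the sketch below builds a query URL for the public GDC API files endpoint. The dbGaP accessions are the ones listed above. The GDC project identifiers (CMI-ASC, CMI-MBC, CMI-MPC) and the helper names are assumptions made for this example and should be checked against the GDC Data Portal. The code only constructs the URL and does not contact the server.

```python
import json
from urllib.parse import urlencode

# Public GDC API endpoint for file metadata.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"

# dbGaP accessions are taken from the text above. The GDC project IDs are
# ASSUMPTIONS for this sketch; verify them on the GDC Data Portal.
CMI_PROJECTS = {
    "phs001931": "CMI-ASC",  # The Angiosarcoma Project
    "phs001709": "CMI-MBC",  # The Metastatic Breast Cancer Project
    "phs001939": "CMI-MPC",  # The Metastatic Prostate Cancer Project
}


def build_filters(project_id, data_type, access):
    """Return a GDC filter matching one project, data type, and access level."""

    def is_in(field, value):
        return {"op": "in", "content": {"field": field, "value": [value]}}

    return {
        "op": "and",
        "content": [
            is_in("cases.project.project_id", project_id),
            is_in("data_type", data_type),
            is_in("access", access),
        ],
    }


def build_files_query_url(project_id, data_type, access="open", size=10):
    """Return a files-endpoint URL listing up to `size` matching files."""
    params = {
        "filters": json.dumps(build_filters(project_id, data_type, access)),
        "fields": "file_id,file_name,file_size,access",
        "format": "JSON",
        "size": str(size),
    }
    return GDC_FILES_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Open-access expression files for the Metastatic Breast Cancer Project.
    print(build_files_query_url(CMI_PROJECTS["phs001709"],
                                "Gene Expression Quantification"))
```

Metadata queries of this kind list both open and controlled files. Downloading controlled files still requires approved dbGaP access to the corresponding study.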

Cancer Types

Disease Type | Primary Site
Adenomas and Adenocarcinomas | Lymph nodes; Prostate gland
Ductal and Lobular Neoplasms | Breast
Soft Tissue Tumors and Sarcomas, NOS | Bladder; Breast; Bronchus and lung; Heart, mediastinum, and pleura; Lymph nodes; Other and ill-defined digestive organs; Other and ill-defined sites; Other and ill-defined sites within respiratory system and intrathoracic organs; Skin

Data Types and Access Levels

Data Type | Data Format | # of Cases and Files | Estimated File Size | Data Access Level
WXS Aligned Reads | BAM | 246 cases, 552 files | 15.52 TB | Controlled
WXS Simple Somatic Mutations | VCF | 240 cases, 1529 files | 1.28 GB | Controlled
WXS Annotated Somatic Mutations | VCF, MAF | 240 cases, 3058 files | 6.04 GB | Controlled
WXS Aggregated Somatic Mutations | MAF | 240 cases, 306 files | 171.67 MB | Controlled
WXS Masked Somatic Mutations | MAF | 240 cases, 306 files | 7.82 MB | Controlled
RNA-Seq Aligned Reads | BAM | 148 cases, 636 files | 2.27 TB | Controlled
Gene Expression Quantification | TXT | 148 cases, 845 files | 263.33 MB | Open
Splice Junction Quantification | TSV | 148 cases, 212 files | 474.32 MB | Controlled
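
For planning storage before a download, the estimated sizes in the table can be totaled by access level. The sketch below is only a rough tally under one assumption: the page does not say whether its units are decimal or binary, so decimal (SI) units are used here. The names ROWS and total_bytes are introduced for this example.

```python
# Rows transcribed from the table above:
# (data type, estimated size, unit, access level).
ROWS = [
    ("WXS Aligned Reads", 15.52, "TB", "Controlled"),
    ("WXS Simple Somatic Mutations", 1.28, "GB", "Controlled"),
    ("WXS Annotated Somatic Mutations", 6.04, "GB", "Controlled"),
    ("WXS Aggregated Somatic Mutations", 171.67, "MB", "Controlled"),
    ("WXS Masked Somatic Mutations", 7.82, "MB", "Controlled"),
    ("RNA-Seq Aligned Reads", 2.27, "TB", "Controlled"),
    ("Gene Expression Quantification", 263.33, "MB", "Open"),
    ("Splice Junction Quantification", 474.32, "MB", "Controlled"),
]

# ASSUMPTION: decimal (SI) units. The source does not state the convention.
BYTES_PER_UNIT = {"MB": 10**6, "GB": 10**9, "TB": 10**12}


def total_bytes(rows, access=None):
    """Sum estimated sizes in bytes, optionally for one access level only."""
    return sum(
        size * BYTES_PER_UNIT[unit]
        for _, size, unit, level in rows
        if access is None or level == access
    )


if __name__ == "__main__":
    for level in ("Controlled", "Open"):
        print(f"{level}: {total_bytes(ROWS, level) / 10**12:.4f} TB")
    print(f"All: {total_bytes(ROWS) / 10**12:.4f} TB")
```

Under this assumption, the two sets of aligned-read BAM files account for nearly all of the total, so a download limited to mutation calls and quantification files needs only a few gigabytes.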