The All of Us Research Program

The All of Us Research Program is an initiative of the National Institutes of Health that aims to build one of the largest biomedical data resources by enrolling one million participants. By gathering information on a large, diverse cohort and a wide range of conditions, the program aims to advance precision medicine. The database currently has 372,000+ participants.

Data Now Available in the Researcher Workbench

  • Electronic Health Records
  • Biosamples And Bioassays (Genomics)
  • Surveys (All Participants in the All of Us Research Program have data for, at least, the Basics section of the survey.)
    • The Basics: basic demographic questions, including questions about a participant’s work and home.
    • Lifestyle: participant’s use of tobacco, alcohol, and recreational drugs.
    • Overall Health: participant’s overall health including general health, daily activities, and women’s health topics.
    • Personal/Family Health History: past medical history, including medical conditions and approximate age of diagnosis.
    • Health Care Access & Utilization: participant’s access to and use of health care.
    • Social Determinants of Health: a participant’s neighborhood, social life, stress, and feelings about everyday life.
    • COVID-19 Participant Experience (COPE) Survey: impact of COVID-19 on a participant’s mental health, well-being, and everyday life.
    • Minute Survey on COVID-19 Vaccines: participant’s COVID-19 vaccination experience.
  • Physical Measurements
  • Wearable Devices (Digital Health)

The All of Us data is available in three tiers.

  1. Public tier: High-level summaries of the data available for research. Anyone can access the aggregated participant data and summary statistics.
  2. Registered tier: Includes individual-level data from surveys, physical measurements taken at the time of participant enrollment, longitudinal EHRs, and wearables like Fitbit. It can only be accessed by registered users.
  3. Controlled tier: In addition to all the data available under the registered tier, it includes genomic data and more detailed demographic, EHR, and survey data.

The registered and controlled tier data are accessed through the Researcher Workbench, a cloud-based platform available to registered users.

Follow the steps to register for the Researcher Workbench.

After completing registration (opening a Researcher Workbench account, completing ethics training, and submitting a brief description of your study), you will get access to either the registered or controlled tier data, based on your selection.

The data management and analysis team at Indiana University supports researchers at institutions in research partnerships with the Indiana CTSI (Indiana University, Purdue University, and the University of Notre Dame) with the following:

  1. General help with questions related to the All of Us research program
  2. Help with setting up a Researcher Workbench account
  3. Help with selecting variables and filtering data
  4. Help with data analysis

To request assistance from the team, please fill out this form.

  1. Data Browser: An interactive tool for exploring the public tier data. Aggregate measures such as the number of All of Us participants with certain conditions, survey answers, and demographics can be obtained using the Data Browser.
  2. Data standardization: The All of Us EHR data is standardized using the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which is designed to standardize the structure and content of observational data.
  3. Data dictionaries: The Registered tier and Controlled tier data dictionaries provide detailed information about the data tables and variables for the respective data tiers.
  4. Computing platform: The All of Us datasets are analyzed using cloud-based Jupyter Notebooks. Researchers can use R or Python to query and analyze the datasets.
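
To give a feel for what querying OMOP-structured data looks like from a Python notebook, here is a minimal sketch. It does not connect to the Researcher Workbench. As an assumption for illustration, it builds a tiny in-memory SQLite database with two OMOP CDM tables (person and condition_occurrence) and made-up rows. The helper names, the sample rows, and the concept IDs used here are illustrative only and are not taken from All of Us data.

```python
import sqlite3


def build_toy_omop(conn):
    """Create two OMOP-style tables and fill them with made-up rows."""
    conn.executescript(
        """
        CREATE TABLE person (
            person_id         INTEGER PRIMARY KEY,
            gender_concept_id INTEGER,
            year_of_birth     INTEGER
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id               INTEGER,
            condition_concept_id    INTEGER,
            condition_start_date    TEXT
        );
        """
    )
    # Hypothetical participants (not real data).
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(1, 8532, 1960), (2, 8507, 1975), (3, 8532, 1988)],
    )
    # Hypothetical condition records; person 1 has two records
    # for the same condition concept.
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
        [
            (10, 1, 201826, "2015-03-01"),
            (11, 1, 201826, "2018-07-12"),
            (12, 2, 320128, "2019-01-20"),
            (13, 3, 201826, "2020-11-05"),
        ],
    )


def count_participants_with_condition(conn, condition_concept_id):
    """Count distinct persons with at least one record of the given concept."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT p.person_id)
        FROM person AS p
        JOIN condition_occurrence AS co
          ON co.person_id = p.person_id
        WHERE co.condition_concept_id = ?
        """,
        (condition_concept_id,),
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
build_toy_omop(conn)
print(count_participants_with_condition(conn, 201826))  # prints 2
```

Counting distinct person_id values matters because one participant can have several records for the same condition. The same join pattern should carry over to other OMOP clinical tables, although the SQL dialect and the way a notebook connects to the data on the Researcher Workbench will differ from this local sketch.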

Publications

Li, R., Wang, H., Zhao, Y., Su, J., Tu, W. (2021). Robust estimation of heterogeneous treatment effects: an algorithm-based approach. Communications in Statistics-Simulation and Computation, 1-18.

Work in progress

Understanding the AllofUs Database: A comprehensive comparison to the Medical Expenditure Panel Survey (Hailemichael Shone, Kosali Simon and Engy Ziedan).

  1. Program website: https://allofus.nih.gov/
  2. OMOP CDM tables: https://ohdsi.github.io/CommonDataModel/cdm53.html#Clinical_Data_Tables
  3. Program YouTube channel: https://www.youtube.com/@AllofUsResearchProgram