After just over two years of amazing teamwork across Truveta and our 25 health system members, and strategic partnerships with customers like Pfizer and Boston Scientific, it is so exciting to announce the availability of Truveta Studio.

It is amazing for me to reflect on how our company and partners were so clearly motivated by the lack of useful public health data on how best to respond during COVID-19. The industry is deeply challenged by inaccessible, fragmented, and unstructured data. Clinical trials are just too slow to provide the information needed to care for patients.

And now we have Truveta Studio, bringing together unprecedented health data and analytics for researchers to study patient care and outcomes with any condition, drug, or medical device. Truveta Studio is the first integrated solution that combines data and analytics to accelerate learning in real time. It’s incredible to me how so many solutions are available to study the health care system itself, but no other system has been designed to study patient care and outcomes.

Figure 1: Truveta Studio COVID-19 dashboard – This COVID-19 dashboard shows an example of visualization and analytics possible with Truveta Studio.

Truveta Data is unprecedented

Today, most research is conducted on outdated claims data that do not include critical information, such as symptoms or lab test results that led to the diagnosis. When clinical data is accessible, it is unstructured and not useful for analytics. Truveta Studio is the first solution to make massive streams of daily clinical data useful for analytics through the integration of AI-powered natural language processing and de-identification. Truveta offers the most timely, complete, and highest quality data on US health, empowering researchers to answer complex medical questions in days, not years. Truveta Data is:

  • Timely: Updated daily from care at our members, enabling researchers to learn in real time from the most current view of US health.
  • Representative: Our members provide patient care in 43 states where 97% of the US population resides. Truveta Data covers the full diversity of the US across age, geography, race, ethnicity, and gender.
  • Complete: The breadth of our data is matched by unparalleled depth, including medical records with full diagnoses, vital signs, lab tests, clinical notes, and images. Truveta Data is linked across providers and with daily mortality data and comprehensive social drivers of health data from LexisNexis. Insurance claims fill in the patient journey when medical records are unavailable. The result? A complete, de-identified longitudinal journey for each patient.
  • Normalized: Unstructured content within the medical record is mapped to clinical ontology standards, such as LOINC for lab tests and GUDID for medical devices.

Today, data vendors don’t share the sources of their data. Committed to earning trust, we provide a comprehensive datasheet on the national population represented in Truveta Data and every population being studied. These datasheets include patient counts, diversity, completeness, and timeliness statistics, as well as the sources of all data.

Truveta Studio enables fast clinical insights with transparency

Today, researchers face frustrating months-long delays to assess the feasibility of generating a representative population for analysis, and then more delays to set up the secure data analytics infrastructure. Fragmented and limited tools slow research, drive up costs, and limit transparency and trust in the study conclusions. Now, Truveta Studio eliminates these issues with several industry firsts:

Truveta Prose makes medical concepts computable

Today, individual research projects define medical concepts with custom, opaque, and limited expressions. Truveta Prose is the first language to express computable medical concepts combining events from a patient’s longitudinal history, including diagnoses, labs, procedures, medications, vaccinations, devices, or any concept found within a clinical note. Just as Google searches the Internet, researchers can search data in Truveta for any Prose-defined population within a few seconds.

Eric Eskioglu, MD, MBA, Executive Vice President and Chief Medical and Scientific Officer at Novant Health has been an active contributor to our approach. He recently shared,

“Researchers often spend countless hours attempting to stratify and define the patient populations they are seeking to study before they can even begin their analysis. Truveta not only ensures consistency and transparency across different clinical concepts and outcomes, but also fundamentally lowers the cost and increases the speed of research, enabling scientists to get to insights faster for saving more lives.”

For example, there is significant medical nuance in defining a patient who is hospitalized for COVID-19. Some researchers may leverage diagnostic codes for inpatient encounters; however, such a definition will also capture patients incidentally found to have COVID (and not necessarily hospitalized due to their COVID infection). Other researchers may leverage logic such as use of COVID-specific medications, oxygenation status, requirement for intubation, or specific lab markers. To help researchers navigate this complexity, Truveta Prose allows medical concepts like “COVID Hospitalization with COVID-19 infection” or “COVID Hospitalization from Medication Utilization” to be defined in a computable form as Truveta Definitions, which can be used to analyze Truveta Data.

Figure 2: “Hospitalization with COVID-19 Infection” Truveta Definition – The “Hospitalization with COVID-19 Infection” definition enables researchers to quickly study a patient population without having to pull all associated diagnostic codes.
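To make the nuance above concrete, here is a minimal Python sketch of how the two styles of definition can select different patients. It is not Truveta Prose and does not reflect how Truveta Definitions are implemented. The `Encounter` record, the function names, and the sample data are all hypothetical, and the one-item medication list is illustrative rather than clinically complete.

```python
from dataclasses import dataclass, field

# Assumed, illustrative code lists. Not taken from any Truveta Definition.
COVID_DX_CODES = {"U07.1"}          # ICD-10-CM code for COVID-19
COVID_MEDICATIONS = {"remdesivir"}  # one example of a COVID-specific drug


@dataclass
class Encounter:
    """Hypothetical, simplified encounter record."""
    patient_id: str
    inpatient: bool
    dx_codes: set = field(default_factory=set)
    medications: set = field(default_factory=set)


def hospitalized_with_covid(enc):
    """Inpatient encounter that carries a COVID-19 diagnosis code."""
    return enc.inpatient and bool(enc.dx_codes & COVID_DX_CODES)


def hospitalized_from_medication(enc):
    """Inpatient encounter where a COVID-specific medication was given."""
    return enc.inpatient and bool(enc.medications & COVID_MEDICATIONS)


encounters = [
    # Admitted for COVID pneumonia and treated with remdesivir.
    Encounter("A", True, {"U07.1", "J12.82"}, {"remdesivir"}),
    # Admitted for a hip fracture; COVID was an incidental finding.
    Encounter("B", True, {"S72.001A", "U07.1"}),
    # Outpatient visit with a COVID diagnosis; never hospitalized.
    Encounter("C", False, {"U07.1"}),
]

with_covid = {e.patient_id for e in encounters if hospitalized_with_covid(e)}
from_medication = {
    e.patient_id for e in encounters if hospitalized_from_medication(e)
}

print(sorted(with_covid))       # patients matched by diagnosis code
print(sorted(from_medication))  # patients matched by medication use
```

In this toy data the diagnosis-code definition also picks up patient B, whose COVID infection was incidental, while the medication-based definition does not. That gap is the reason a study benefits from stating exactly which definition it used.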

In fact, Truveta Research recently used the “COVID Hospitalization with COVID-19 infection” definition to explore potential racial and ethnic disparities in COVID hospitalizations during different time periods throughout the pandemic. The results are available now as a preprint and a summary on the Truveta Research blog.

Or, if a researcher wanted to study people who have elevated Brain Natriuretic Peptide levels found in their blood work following a hospitalization for heart failure, the researcher would use the “Congestive Heart Failure,” “inpatient event,” and “Brain Natriuretic Peptide Lab Results” Truveta Definitions to easily identify this patient population for study (Figure 3). These definitions contain dozens of diagnosis codes and combine data from encounters, medications administered, and laboratory results intertwined with complex time constraints, simplifying all of these concepts for rapid research.

Figure 3: Defining a specific patient population with Truveta Definitions and Truveta Prose – As an example of the complexity made simple, the Truveta Definition for people with elevated Brain Natriuretic Peptide (BNP) levels following hospitalization for heart failure contains dozens of diagnosis codes and combines data from encounters, medications administered, and laboratory results all intertwined with complex time constraints. Using Truveta Prose, a researcher can quickly combine a set of definitions in a logical sequence to further refine the population they are looking to study.
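As a rough illustration of what "combining definitions with time constraints" means, the plain-Python sketch below selects patients with an elevated BNP result shortly after a heart failure hospitalization. It is not Truveta Prose. The record layout, the single diagnosis code, the 100 pg/mL cutoff, and the 30-day follow-up window are all assumptions chosen for the example; a real definition would cover dozens of codes and clinically reviewed thresholds.

```python
from datetime import date, timedelta

# All values below are assumptions for illustration only.
HF_DX_CODES = {"I50.9"}          # ICD-10-CM: heart failure, unspecified
BNP_THRESHOLD_PG_ML = 100.0      # assumed cutoff for an "elevated" BNP
FOLLOW_UP = timedelta(days=30)   # assumed window after discharge

# Hypothetical inpatient stays and lab results.
hospitalizations = [
    {"patient_id": "A", "dx": "I50.9", "discharge": date(2022, 3, 1)},
    {"patient_id": "B", "dx": "I50.9", "discharge": date(2022, 3, 5)},
    {"patient_id": "C", "dx": "J18.9", "discharge": date(2022, 3, 7)},
]
bnp_results = [
    {"patient_id": "A", "value": 450.0, "date": date(2022, 3, 10)},
    {"patient_id": "B", "value": 60.0, "date": date(2022, 3, 12)},   # not elevated
    {"patient_id": "B", "value": 300.0, "date": date(2022, 6, 1)},   # outside window
    {"patient_id": "C", "value": 500.0, "date": date(2022, 3, 9)},   # no HF stay
]


def elevated_bnp_after_hf(hospitalizations, bnp_results):
    """Return patient IDs with an elevated BNP within FOLLOW_UP of an HF discharge."""
    cohort = set()
    for stay in hospitalizations:
        if stay["dx"] not in HF_DX_CODES:
            continue  # only heart failure hospitalizations qualify
        window_end = stay["discharge"] + FOLLOW_UP
        for lab in bnp_results:
            same_patient = lab["patient_id"] == stay["patient_id"]
            elevated = lab["value"] > BNP_THRESHOLD_PG_ML
            in_window = stay["discharge"] <= lab["date"] <= window_end
            if same_patient and elevated and in_window:
                cohort.add(stay["patient_id"])
    return cohort


cohort = elevated_bnp_after_hf(hospitalizations, bnp_results)
print(sorted(cohort))
```

Only patient A satisfies all three conditions at once (a heart failure stay, an elevated value, and a result inside the window). Even in this small example, the diagnosis, lab, and timing logic have to be kept consistent by hand, which is the bookkeeping a shared, named definition is meant to remove.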

Truveta Prose also enables unprecedented transparency in how medical concepts are computed in a study to help earn trust in the conclusions of that study.

Truveta Library accelerates collaboration and learning

Truveta Definitions can be shared within the Truveta Library to ease the creation of populations for study and to accelerate the accumulation of computable medical knowledge. The Truveta Library already contains thousands of Truveta Definitions contributed by experienced clinical informaticists.

Take the Centers for Disease Control and Prevention (CDC) definition of long COVID. This definition includes 19 different potential symptoms. Within the Truveta Library, any researcher can easily find the Truveta Prose that fully expresses this description in a computable form, accelerating accurate research of the condition (not to mention the 400K de-identified medical records that were found matching the description in 2.2 seconds!).

Figure 4: Truveta Library featuring Long COVID definition – Truveta Definitions can be shared within the Truveta Library to ease the creation of populations for study and to accelerate the accumulation of computable medical knowledge.

Another perspective comes from one of our earliest contributors, Ari Robicsek, MD, Chief Medical Analytics Officer and Senior Vice President of Research at Providence:

“For researchers, this is really exciting. Truveta Studio offers a dataset that’s huge, comprehensive, and up to date. And the Truveta Library makes it easy to do critical documentation and communication about how we’re defining our cohorts.”

Truveta Notebooks enable convenient analytics

Today, individual research projects require custom data infrastructure, which causes delay, expense, and privacy and security risks, and limits the ability to share underlying statistics transparently. Truveta Studio includes an integrated Jupyter notebook atop a serverless SQL experience, pre-installed with the latest medical statistics and visualization libraries including pandas, NumPy, Matplotlib, SciPy, Tidyverse, Arrow, and dplyr, with full support for R and Python. The integrated analytics make it hassle-free for distributed research and data science teams to study daily updating Truveta Data populations within Studio, and for the underlying statistics to be shared transparently, earning trust in the conclusions of that study.

Unlimited discovery

Truveta provides the best value in real-world data with subscriptions including daily data and unlimited analytics by unlimited users. One Truveta subscription supports unlimited care quality, health equity, comparative effectiveness, safety, label expansion, AI training, regulatory filings, and publications. We designed this unlimited access to drive curiosity and continuous learning.

We are so excited to see what you can discover with Truveta Studio. Join us in achieving our vision of Saving Lives with Data. Curious to learn more? Contact us at


About Truveta

Truveta is a collective of US health systems with a shared vision of saving lives with data. Truveta offers innovative solutions to enable researchers to find cures faster, empower every clinician to be an expert, and help families make the most informed decisions about their care. To learn more, follow us on LinkedIn and Twitter, and sign up for our newsletter.