Truveta Data
Powering real-time insights
Get the most complete, timely, and representative view of US patient care
See the full picture with linked EHR data
By linking complete EHR data with closed claims and mortality data, Truveta enables researchers to generate both clinical and economic insights across the entire care journey. Track diagnoses, treatments, outcomes, and costs across settings, payers, and populations—supporting comparative effectiveness, burden of illness, health economics research, and more. Truveta Data can also be linked with proprietary datasets for expanded research applications.

Closed claims available for 200M+ patients across 100+ commercial payers, Medicare, and Medicaid

Includes medical and pharmacy claims dating back to 2016

Expands visibility into longitudinal outcomes and total cost of care
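As a rough illustration of what linked EHR and claims data makes possible, the sketch below sums paid claim amounts per patient and rolls them up by EHR diagnosis. It is a minimal sketch under stated assumptions, not Truveta's schema or API: the `patient_token` join key, the tuple layouts, and the function name `total_cost_by_diagnosis` are all hypothetical.

```python
from collections import defaultdict

def total_cost_by_diagnosis(ehr_diagnoses, claims):
    """Roll up claim costs by EHR diagnosis code.

    ehr_diagnoses: iterable of (patient_token, diagnosis_code)  # hypothetical layout
    claims:        iterable of (patient_token, paid_amount)     # hypothetical layout
    """
    # Total paid amount per patient, taken from the claims side.
    cost_per_patient = defaultdict(float)
    for token, paid in claims:
        cost_per_patient[token] += paid

    # Attribute each patient's total cost to every distinct diagnosis
    # recorded for that patient in the EHR (each pair counted once).
    totals = defaultdict(float)
    seen = set()
    for token, dx in ehr_diagnoses:
        if (token, dx) in seen:
            continue
        seen.add((token, dx))
        totals[dx] += cost_per_patient.get(token, 0.0)
    return dict(totals)
```

Patients who appear only in claims contribute nothing here, because the roll-up is driven by EHR diagnoses; a real burden-of-illness analysis would also need enrollment windows and cost attribution rules that this sketch leaves out.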
| Key features | Truveta | Other | Impact |
| --- | --- | --- | --- |
| 120M+ electronic health records directly from US health systems | Yes | | Access daily updates and unprecedented completeness, including images and clinical notes |
| Daily refreshes | Yes | | |
| Clinical notes at scale | Yes | Few RWD vendors | Unlock any clinical concept of interest with expert-level AI to enable novel research |
| Longitudinal imaging studies | Yes | Lack EHR integration | Enable AI model development, adjudicate outcomes, and study real-world response |
| Integrated closed claims and mortality | Yes | Often requires linking | Conduct robust comparative effectiveness research with detailed cost, utilization, and outcomes data |
| Pediatric and mother-child data | Yes | Few RWD vendors | Study underrepresented patients and address critical evidence gaps in children |
| Admission-discharge-transfer (ADT) data | Yes | | Enables minute-level comparative effectiveness of procedure times, recovery, and lengths of stay |
| Regulatory grade | Yes | Limited provenance | Aligns with FDA standards and provides full, audit-ready data provenance |
| Immediately available | Yes | Data cuts licensed separately | Provides immediate, unlimited access to row-level patient data for health economics and outcomes research |
| Broad payer mix | Yes | Often skews commercial | Offers broad coverage across 100+ payers for nationally representative studies |
| Emerging genomics and phenotype capabilities | Yes | Limited EHR coverage | Enables deep clinical research and broad use with access to vitals, labs, images, clinical notes, and long-term outcomes |
Work faster with clean data, normalized with expert-led AI
To unlock these insights across the care journey, data must first be clean, consistent, and structured at scale. The Truveta Language Model (TLM) cleans and normalizes trillions of daily EHR data points, giving you high-quality, ready-to-analyze inputs with no wrangling required.



Electronic health records
Clinical notes
Imaging studies
Unique devices
Mother-child pairs
Example of TLM mapping lab results to the appropriate medical ontology
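To make the idea of mapping lab results to a standard ontology concrete, here is a minimal rule-based sketch. It is not the Truveta Language Model, which is described as AI-driven; a plain lookup table stands in for it. The choice of LOINC as the target ontology, the `LAB_CONCEPTS` and `UNIT_FACTORS` tables, and the `normalize_lab` function are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical lookup: cleaned local lab name -> (LOINC code, canonical unit).
LAB_CONCEPTS = {
    "glucose": ("2345-7", "mg/dL"),
    "glu": ("2345-7", "mg/dL"),
    "blood sugar": ("2345-7", "mg/dL"),
    "hba1c": ("4548-4", "%"),
    "hemoglobin a1c": ("4548-4", "%"),
}

# Multiplicative factors from (code, lowercase source unit) to the canonical unit.
UNIT_FACTORS = {
    ("2345-7", "mmol/l"): 18.016,  # glucose: mmol/L -> mg/dL
}

@dataclass(frozen=True)
class NormalizedLab:
    code: str
    value: float
    unit: str

def normalize_lab(name: str, value: float, unit: str) -> Optional[NormalizedLab]:
    """Map a raw lab result to a standard code and unit, or None if unknown."""
    # Lowercase and collapse whitespace so "  GLU " and "glu" match.
    key = " ".join(name.lower().split())
    concept = LAB_CONCEPTS.get(key)
    if concept is None:
        return None
    code, canonical_unit = concept
    if unit.lower() == canonical_unit.lower():
        return NormalizedLab(code, value, canonical_unit)
    factor = UNIT_FACTORS.get((code, unit.lower()))
    if factor is None:
        return None  # unknown unit: refuse to guess a conversion
    return NormalizedLab(code, round(value * factor, 1), canonical_unit)
```

For example, `normalize_lab("GLU", 5.5, "mmol/L")` resolves to the glucose code with the value converted to mg/dL, while an unrecognized name or unit returns `None` instead of a guess. A lookup table like this breaks down quickly on real-world naming variation, which is presumably the gap a learned model is meant to close.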

Go beyond structured data with notes and images
Tap into previously inaccessible sources of clinical insight—now available at scale and mapped to longitudinal EHR data. When paired with structured data, notes and images enable deeper understanding of disease progression, treatment safety and effectiveness, reasons for treatment decisions, and more.
Access 7B+ clinical notes across all care settings and note types
Analyze 100M+ medical images—searchable and linked with rich EHR data
Unlock rare visibility into maternal and pediatric care
Truveta provides data on more than 1.4 million deterministically linked mother–child pairs. Researchers can study prenatal risk factors, birth outcomes, early drug and vaccine safety, and pediatric development using EHR data that spans pregnancy through early childhood.
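Deterministic linkage generally means pairing records only when a set of identifying fields matches exactly, with no probabilistic scoring. The sketch below shows that general technique for mother-child pairs. The record layouts, the `mother_id_on_record` field, and the matching key (mother identifier, facility, and date) are hypothetical; the source does not describe how Truveta performs its linkage.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Delivery:
    mother_id: str
    facility_id: str
    delivery_date: date

@dataclass(frozen=True)
class Newborn:
    child_id: str
    facility_id: str
    birth_date: date
    mother_id_on_record: str  # hypothetical field on the newborn's record

def link_mother_child(deliveries, newborns):
    """Return (mother_id, child_id) pairs whose key fields match exactly."""
    # Index deliveries by the full key so each lookup is an exact match.
    delivery_keys = {
        (d.mother_id, d.facility_id, d.delivery_date) for d in deliveries
    }
    pairs = []
    for n in newborns:
        key = (n.mother_id_on_record, n.facility_id, n.birth_date)
        if key in delivery_keys:
            pairs.append((n.mother_id_on_record, n.child_id))
    return pairs
```

Because every field must agree, this approach favors precision over recall: a newborn whose recorded birth date differs from the delivery date by one day is left unlinked and not matched approximately.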


Power new discoveries by linking genetics to real-world care
The Truveta Genome Project will create the largest and most diverse database of genotypic and phenotypic information ever assembled to enable drug discovery, optimize clinical trials, and transform how diseases are prevented, diagnosed, and cured. This genetic data will be linked to de-identified medical records and added to Truveta Data for research.