Truveta Data
Powering real-time insights
Get the most complete, timely, and representative view of US patient care
See the full picture with linked EHR data
By linking complete EHR data with closed claims and mortality data, Truveta enables researchers to generate both clinical and economic insights across the entire care journey. Track diagnoses, treatments, outcomes, and costs across settings, payers, and populations—supporting comparative effectiveness, burden of illness, health economics research, and more. Truveta Data can also be linked with proprietary datasets for expanded research applications.

Closed claims available for 200M+ patients across 100+ commercial payers, Medicare, and Medicaid

Includes medical and pharmacy claims dating back to 2016

Expands visibility into longitudinal outcomes and total cost of care
Key features
Truveta | Other | Impact
120M+ electronic health records directly from US health systems | | Access daily updates and unprecedented completeness, including images and clinical notes
Daily refreshes | |
Clinical notes at scale | Few RWD vendors | Unlock any clinical concept of interest with expert-level AI to enable novel research
Longitudinal imaging studies | Lack EHR integration | Enable AI model development, adjudicate outcomes, and study real-world response
Integrated closed claims and mortality | Often requires linking | Conduct robust comparative effectiveness research with detailed cost, utilization, and outcomes data
Pediatric and mother-child data | Few RWD vendors | Study underrepresented patients and address critical evidence gaps in children
Admission-discharge-transfer (ADT) data | | Enables minute-level comparative effectiveness of procedure times, recovery, and lengths of stay
Regulatory grade | Limited provenance | Aligns with FDA standards and provides full, audit-ready data provenance
Immediately available | Data cuts licensed separately | Provides immediate, unlimited access to row-level patient data for health economics and outcomes research
Broad payer mix | Often skews commercial | Offers broad coverage across 100+ payers for nationally representative studies
Emerging genomics and phenotype capabilities | Limited EHR coverage | Enables deep clinical research and broad use with access to vitals, labs, images, clinical notes, and long-term outcomes
Work faster with clean data, normalized with expert-led AI
To unlock these insights across the care journey, data must first be clean, consistent, and structured at scale. The Truveta Language Model (TLM) cleans and normalizes trillions of daily EHR data points, giving you high-quality, ready-to-analyze inputs with no wrangling required.



Electronic health records
Clinical notes
Imaging studies
Unique devices
Mother-child pairs
Example of TLM mapping lab results to the appropriate medical ontology
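To make the idea of mapping lab results to an ontology concrete, here is a minimal Python sketch. It is not how TLM works: TLM is an AI model, and this stand-in uses a small hand-written lookup table. The names `LAB_ONTOLOGY`, `SYNONYMS`, and `normalize_lab` are hypothetical, the synonyms are invented for illustration, and the sketch assumes each value is already reported in the listed unit (no unit conversion is attempted). The LOINC codes shown are real codes for these tests.

```python
import re

# Hypothetical lookup table: canonical lab name -> (LOINC code, assumed unit).
LAB_ONTOLOGY = {
    "glucose": ("2345-7", "mg/dL"),
    "hemoglobin": ("718-7", "g/dL"),
    "hemoglobin a1c": ("4548-4", "%"),
}

# Illustrative site-specific spellings mapped onto the canonical names above.
SYNONYMS = {
    "glu": "glucose",
    "blood glucose": "glucose",
    "hgb": "hemoglobin",
    "hba1c": "hemoglobin a1c",
    "a1c": "hemoglobin a1c",
}


def normalize_lab(raw_name, raw_value):
    """Map a raw lab name and value string to a structured record.

    Returns None when the name is not in the lookup table or the value
    string contains no number.
    """
    # Lowercase, replace punctuation with spaces, and collapse whitespace.
    key = re.sub(r"[^a-z0-9 ]", " ", raw_name.lower())
    key = " ".join(key.split())
    key = SYNONYMS.get(key, key)
    if key not in LAB_ONTOLOGY:
        return None

    # Take the first number found in the value string.
    match = re.search(r"-?\d+(?:\.\d+)?", raw_value)
    if match is None:
        return None

    code, unit = LAB_ONTOLOGY[key]
    return {"concept": key, "loinc": code, "value": float(match.group()), "unit": unit}
```

For example, `normalize_lab("Glu.", "98 mg/dL")` resolves to the glucose concept with LOINC code 2345-7, while an unrecognized name such as `"Sodium"` returns `None` rather than a guess.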
Go beyond structured data with notes and images
Tap into previously inaccessible sources of clinical insight—now available at scale and mapped to longitudinal EHR data. When paired with structured data, notes and images enable deeper understanding of disease progression, treatment safety and effectiveness, reasons for treatment decisions, and more.
Access 7B+ clinical notes across all care settings and note types
Analyze 100M+ medical images—searchable and linked with rich EHR data
Unlock rare visibility into maternal and pediatric care
Truveta provides data on more than 1.4 million deterministically linked mother–child pairs. Researchers can study prenatal risk factors, birth outcomes, early drug and vaccine safety, and pediatric development using EHR data that spans pregnancy through early childhood.
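"Deterministically linked" generally means records are joined on an exact shared identifier rather than on a statistical similarity score. The Python sketch below illustrates that idea only. It does not describe Truveta's actual linkage method, and the record types, field names, and `link_mother_child` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryRecord:
    mother_id: str
    delivery_encounter_id: str  # hypothetical shared key


@dataclass(frozen=True)
class NewbornRecord:
    child_id: str
    birth_encounter_id: str  # hypothetical shared key


def link_mother_child(deliveries, newborns):
    """Return (mother_id, child_id) pairs whose encounter identifiers match exactly."""
    # Index deliveries by encounter identifier for constant-time lookup.
    by_encounter = {d.delivery_encounter_id: d.mother_id for d in deliveries}
    # Keep only newborns whose identifier matches a delivery exactly;
    # unmatched newborns are dropped, not fuzzy-matched.
    return [
        (by_encounter[n.birth_encounter_id], n.child_id)
        for n in newborns
        if n.birth_encounter_id in by_encounter
    ]
```

Because the match is exact, twins that share a delivery encounter both link to the same mother, and a newborn with no matching identifier produces no pair at all.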

Power new discoveries by linking genetics to real-world care
The Truveta Genome Project will create the largest and most diverse database of genotypic and phenotypic information ever assembled to enable drug discovery, optimize clinical trials, and transform how diseases are prevented, diagnosed, and cured. This genetic data will be linked to de-identified medical records and added to Truveta Data for research.





