Truveta Data

Clinical notes

Largest collection of clinical notes integrated with EHR data

Nearly 80% of data relevant to research is hidden in unstructured notes

The Truveta Language Model extracts data from notes at scale, empowering researchers with data from more than 5 billion free-text notes.

Understand clinical context for patients

With access to complete EHR data, including notes, linked with social drivers of health, mortality, and claims data, researchers can understand the complete patient journey and address previously unanswerable questions.

Identifying key moments in the patient journey

[Figure: Redacted patient note showing delays in diagnosis and treatment for a less common condition]

Primary care visit: Repeated outpatient visit in January (recurrent issue, prior treatment, suspected diagnosis, referral)

Specialist visit: First dermatology visit in February (suspected diagnosis, recent surgery, not starting a biologic, next steps)

Diagnosis: Hidradenitis suppurativa diagnosis in November (disease flare-up, formal diagnosis, antibiotic treatment)

Unlock access to any clinical concept of interest

Truveta receives all clinical notes generated during a patient’s care. This includes progress notes, nursing evaluations, procedure/operative reports, referral notes, discharge summaries, imaging reports, and more.

Notes are available across disease areas, including heart failure, vessel disease, migraines, seizures, NASH, hypercholesterolemia, colon cancer, and rare diseases.

Example cardiovascular concepts extracted from notes

Sampling of normalized echocardiogram data in Truveta Data

Sampling of normalized cardiac catheterization data in Truveta Data

Answer novel research questions

Notes data can help accelerate therapy adoption, improve clinical trials, and enhance patient care.

Example applications of notes data

Classify disease severity and monitor disease progression to inform R&D

Using echocardiogram data to classify aortic stenosis severity

Assess lifestyle behaviors and symptom prevalence to optimize clinical trial design

Analyzing diet data for a rare disease requiring dietary modifications

Identify potential confounders relevant to comparative effectiveness research

Identifying confounders before a head-to-head SGLT2i study

AI enables accuracy at scale

The Truveta Language Model, a large language model trained on medical records data, is designed to identify and structure clinical data from notes and account for nuances such as negation, hypotheticals/conditionals, and family history. The model is continuously evaluated and fine-tuned to ensure clinical accuracy.

Learn more about the depth of Truveta Data

Regulatory-grade EHR data

Truveta offers complete, timely, and clean EHR data linked with social drivers of health (SDOH), mortality, and claims data for more than 100 million patients representing the full diversity of the US.

Medical images and metadata

Truveta provides access to millions of medical images across all modalities, including MRI, CT, X-ray, ultrasound, mammogram, PET, and nuclear medicine, searchable by modality and protocol.