When I had the first conversations about creating Truveta, I knew technology was the answer to solving healthcare data challenges. Healthcare data traditionally has been too siloed, inaccessible, and messy to be useful for research. And one of the biggest challenges has been that critical information about a patient’s health is locked away in the unstructured text in clinician notes. Thanks to the power of the most advanced AI technology, Truveta has been able to take the messy data – including extracting medical concepts from clinician notes – and beautifully clean it for scientifically rigorous research.

As a result, today we are excited to announce the availability of more than 56.6 million clinical observations from more than 2.7 million echocardiogram reports from more than 1.7 million patients, making Truveta the market leader in real-world data to advance cardiovascular research at scale. Truveta delivers the most complete, timely, and clean electronic health record (EHR) data from more than 100 million patients across more than 30 health systems, empowering researchers with scientifically rigorous analytics to study safety and effectiveness, improve patient care, and train medical AI. 

Unlocking echocardiogram results through AI

Echocardiograms provide crucial clinical observations for cardiovascular care, yet obtaining these measures at scale for real-world data analysis has been a longstanding challenge. Existing methods, such as relying on registry data or physician referrals, have limitations in scope and timeliness, hindering large-scale research. The results of these reports have also been locked away in the clinician notes or transcriptions of echocardiogram procedures, making them only available through manual chart review. Truveta’s clinical expert-led AI solves this challenge.

The Truveta Language Model (TLM) is a multi-modal large language model that transforms electronic health record (EHR) data into billions of clean and accurate data points for health research on patient outcomes with any drug, disease, or device. TLM combines pre-trained open large language models with deep training on the most complete and representative clinical data set to achieve above 90% accuracy on diagnoses, medications, lab results, lab values, clinical observations, and more, exceeding the accuracy of human clinical experts.

Using this clinical expert-led AI, we have been able to structure nearly 50 vital quantitative measures from 2.7 million unstructured echocardiogram reports, empowering researchers to conduct innovative cardiovascular studies with an in-depth understanding of heart structure and function for patients of interest. From essential metrics like ejection fraction to specialized measures such as tricuspid annular systolic velocity (TASV), the availability of these measures at scale enables researchers to explore new frontiers in their cardiovascular studies.

Studying common cardiovascular conditions at scale

Take ejection fraction (EF). EF provides a quantitative measure of the heart’s ability to pump blood throughout the body. Individuals with a low EF and symptoms from this low EF are defined as having heart failure with reduced ejection fraction (HFrEF). Patients with HFrEF have an elevated risk of clinical complications, including arrhythmias, blood clots, heart valve issues, and even cardiac arrest. Given its clinical significance, EF is frequently used as an endpoint and as an inclusion/exclusion criterion in clinical trials. Changes in the measure from baseline to a specified follow-up timepoint, for example, can be used to assess the effectiveness of medications, medical devices, and surgical procedures.
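To make the "change from baseline" idea concrete, here is a minimal Python sketch. The function name, the data layout (a dictionary of measurement dates to EF percentages), and the sample values are all assumptions made for illustration; they are not drawn from Truveta Data or any specific trial protocol.

```python
from datetime import date

def ef_change_from_baseline(measurements, baseline_date, follow_up_date):
    """Return the EF change in percentage points between two dates.

    `measurements` is a hypothetical mapping of measurement date -> EF percent.
    """
    return measurements[follow_up_date] - measurements[baseline_date]

# Made-up measurements for a single illustrative patient.
example_measurements = {
    date(2023, 1, 10): 30.0,  # baseline echo
    date(2023, 7, 12): 38.5,  # six-month follow-up echo
}

change = ef_change_from_baseline(
    example_measurements, date(2023, 1, 10), date(2023, 7, 12)
)
print(f"EF change from baseline: {change:+.1f} percentage points")
```

A real study would also need rules for choosing which echo counts as the baseline and how far from the target timepoint a follow-up measurement may fall.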

However, even something as critical as EF has not been readily accessible for use in real-world studies because the results are not uniformly recorded in structured data. EF is most frequently assessed via an echocardiogram (echo) – an ultrasound taken of the heart – with the results captured as unstructured or semi-structured data within EHRs or medical imaging databases. The numeric result may be captured within the imaging report, for example, while clinician notes may contain descriptive interpretations of results such as normal/abnormal, low/moderate/severe, or reduced/preserved.
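As a rough illustration of what structuring an EF value from free text involves, the toy Python sketch below pulls a numeric EF out of a sentence with a regular expression. This is not how TLM works; TLM is a language model, and the pattern, function name, and sample sentences here are hypothetical. Real reports vary far more than a single pattern can handle, which is why manual chart review has been the norm.

```python
import re
from typing import Optional

# Hypothetical pattern: matches phrases such as "EF 42%", "LVEF: 55 %",
# or "ejection fraction of 17%".
EF_PATTERN = re.compile(
    r"\b(?:LVEF|EF|ejection\s+fraction)\b[^0-9%]{0,20}?(\d{1,2}(?:\.\d+)?)\s*%",
    re.IGNORECASE,
)

def extract_ef_percent(text: str) -> Optional[float]:
    """Return the first EF percentage found in free text, or None."""
    match = EF_PATTERN.search(text)
    return float(match.group(1)) if match else None

# Made-up sentences for illustration only.
print(extract_ef_percent("Most recent echo showed EF 42%."))
print(extract_ef_percent("Ejection fraction of 17% at time of diagnosis."))
print(extract_ef_percent("Valve function appears normal."))
```

Note that the last sentence yields no value at all: descriptive interpretations such as "normal" or "reduced" carry no number, so they need a different kind of handling than numeric results.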

In figure 1 below, you can see from the redacted echocardiogram report that a patient had 17% EF at the time of heart failure diagnosis and that their most recent echo result showed 42% EF. TLM can extract these specific results across patients with available data, enabling more nuanced cardiovascular research studies.

Figure 1: Unlocking ejection fraction and other clinical measures from echo reports and clinical notes in the EHR to monitor patient journeys and outcomes.

Specific echo report results can also be extracted using the same process. So, rather than only being able to compare patients with “reduced EF” to those with “preserved EF,” for example, researchers can query patients with quantitative EF results (e.g., EF<40%, EF 40%-55%).
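Once EF is available as a number, grouping patients by quantitative thresholds becomes a simple filter. The Python sketch below uses the two example ranges mentioned above (EF<40% and EF 40%-55%); the third band, the function name, and the sample patients are assumptions added for illustration and do not reflect Truveta's query interface.

```python
from collections import Counter

def ef_band(ef_percent: float) -> str:
    """Assign an EF value to an illustrative band."""
    if ef_percent < 40:
        return "EF<40%"
    if ef_percent <= 55:
        return "EF 40%-55%"
    return "EF>55%"  # assumed catch-all band, not taken from the text above

# Made-up EF values for illustration only.
sample_patients = {"patient_a": 17.0, "patient_b": 42.0, "patient_c": 60.0}

bands = {pid: ef_band(ef) for pid, ef in sample_patients.items()}
counts = Counter(bands.values())

for band, n in counts.items():
    print(f"{band}: {n} patient(s)")
```

The cut points are parameters of the study design, so a researcher could swap in whatever thresholds a given protocol calls for.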

In addition, Truveta Data includes complete EHR data for more than 100 million patients, including labs, images, and clinician notes linked across health systems and augmented with social drivers of health and claims data. By extracting clinical measures like EF and combining them with other critical data points (updated daily), researchers can study the complete patient journey and explore treatment effectiveness, disease risk factors, patient subgroups, and more. For more on the research potential of ejection fraction, check out this Truveta blog.

Making less common cardiovascular conditions available for study

While ejection fraction is a common quantitative measure, echocardiogram reports include many other measures and clinical assessments that can be used to study rarer conditions like tricuspid stenosis, which is a narrowing of the tricuspid valve in the heart. If untreated, tricuspid stenosis can lead to arrhythmia, stroke, and irreversible heart dysfunction.

With details such as valve area, flow velocity, and pressure gradient across a valve now captured from echo reports, researchers can study tricuspid stenosis and other valvular conditions at scale to learn more about the patient populations affected and identify new treatments that can improve patient care and outcomes, ideally saving lives.

Imagining what’s possible

Never before have these echocardiogram test results been available for research at this scale. The millions of clinical observations now available in Truveta mean cardiovascular researchers can study conditions more quickly than ever before, develop innovative new treatments, and improve patient care and outcomes.

Within the next few months, we will also have clinical observations from cardiac catheterization reports, which will provide even more data to help researchers study conditions like ischemic heart disease and pulmonary hypertension.

Imagine what’s possible and how many lives can potentially be saved because these data are now available at scale.