After years of uncertainty, a recent federal policy change is sparking new optimism in rare disease drug development. Certain multi-indication rare disease therapies now remain exempt from Medicare price negotiations, removing a constraint that had discouraged some companies from investing. Now, drugmakers are revisiting paused programs, reevaluating pipelines, and channeling new investment into high-need, genetically driven conditions.

At the same time, rapid advances in AI and clinical data integration are transforming what’s possible in rare disease research. With the launch of the Truveta Genome Project, researchers will soon be able to link genotypic data with the most representative, complete, and timely patient journey data available. For rare conditions, where small populations and heterogeneous presentations create major evidence gaps, the convergence of genomic and real-world data marks a step change.

Real-world data built for rare complexity

Rare diseases are, of course, rare. Patient numbers are small, phenotypes are complex, and diagnosis often requires deep expertise and time. Many rare conditions lack precise diagnostic codes, and traditional data sources fall short in capturing key endpoints like time to diagnosis, disease severity, and functional decline.

Closing these gaps requires complete, connected data.

Truveta links structured and unstructured EHR data from over 120 million patients across the US. We map these data to widely used clinical ontologies—such as ICD-10, LOINC, HCPCS, NDC, and SNOMED CT—using advanced processes to ensure every record is matched to the most precise, clinically meaningful code possible.

But Truveta also goes beyond standard codes. Using clinical text strings pulled directly from provider documentation, the Truveta Language Model normalizes and extracts granular disease concepts. In contrast, claims data typically rely only on ICD-10 codes, limiting both precision and the ability to uncover complex or overlapping rare disease presentations.

Bringing rare disease signals into focus

Here’s how life sciences partners are already using Truveta to advance rare disease insights:

Mapping Pompe disease progression

In one engagement, a rare disease team used Truveta Data to examine real-world progression in Pompe disease, an ultra-rare neuromuscular disorder that can lead to significant cardiac and respiratory issues. Truveta provides a uniquely deep and longitudinal dataset for Pompe, capturing key real-world endpoints. Among the endpoints available for analysis:

  • Time from symptom onset to diagnosis
  • Uptake of enzyme replacement therapy (ERT)
  • Pulmonary function markers like FEV1 extracted from clinical notes
  • Biomarkers such as creatine kinase via structured lab results
  • Functional milestones including wheelchair use and mechanical ventilation
  • Muscle MRI findings described in radiology narratives

Such endpoints are vital for gene therapy programs and comparative effectiveness analyses in this high-stakes, low-sample rare disease.
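
To make two of these endpoints concrete, here is a minimal Python sketch of how time from symptom onset to diagnosis and an FEV1 value mentioned in a note might be derived. It is an illustration only, not Truveta's method or API: the records, field names, and the regular expression are all hypothetical, and a simple pattern like this stands in for what a language model would do far more robustly.

```python
import re
from datetime import date
from statistics import median

# Hypothetical records: field names and values are invented for illustration.
PATIENTS = [
    {"id": "p1", "symptom_onset": date(2018, 3, 1), "diagnosis": date(2020, 6, 15),
     "note": "Spirometry today: FEV1 62% predicted, down from last visit."},
    {"id": "p2", "symptom_onset": date(2019, 1, 10), "diagnosis": date(2019, 11, 10),
     "note": "FEV1: 48 % pred. Started nocturnal BiPAP."},
    {"id": "p3", "symptom_onset": date(2021, 5, 1), "diagnosis": date(2021, 5, 31),
     "note": "Reports fatigue climbing stairs; no spirometry on file."},
]

# Assumed pattern: "FEV1", up to 20 non-digit characters, then a percentage.
# It deliberately ignores absolute volumes (for example "2.1 L").
FEV1_PERCENT = re.compile(r"FEV1\D{0,20}?(\d{1,3}(?:\.\d+)?)\s*%", re.IGNORECASE)


def days_to_diagnosis(patient):
    """Days between first documented symptom and diagnosis."""
    return (patient["diagnosis"] - patient["symptom_onset"]).days


def extract_fev1_percent(note):
    """Return FEV1 percent predicted found in a note, or None if absent."""
    match = FEV1_PERCENT.search(note)
    return float(match.group(1)) if match else None


delays = [days_to_diagnosis(p) for p in PATIENTS]
median_delay = median(delays)
fev1_by_patient = {p["id"]: extract_fev1_percent(p["note"]) for p in PATIENTS}

print("Days to diagnosis:", delays)
print("Median days to diagnosis:", median_delay)
print("FEV1 % predicted by patient:", fev1_by_patient)
```

With cohorts this small, reporting the median rather than the mean keeps a single very late diagnosis from dominating the summary.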

[Image: muscle MRI image report example]

Identifying secondary hypertension

Another partner sought to identify patients with secondary hypertension tied to rare genetic or metabolic conditions. This population is largely invisible in claims data because ICD-10 lacks the vocabulary to capture the disease precisely.

With Truveta, the team was able to define a nuanced and precise phenotype leveraging SNOMED CT diagnosis codes, labs, and provider-documented diagnoses, identify patients with persistent hypertension despite combination therapy, and flag pediatric populations for future outreach and clinical trial planning.
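
As a rough illustration of the logic behind such a phenotype, the Python sketch below flags patients whose blood pressure stays elevated after they are on combination therapy, then marks the pediatric subset. Everything in it is an assumption made for the example: the data model, the 140 mmHg systolic cutoff, the two-drug-class and two-reading rules, and the under-18 definition are hypothetical and are not the partner's actual criteria. A fixed cutoff is also a simplification for children, whose thresholds are normally percentile-based.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple

# Assumed, illustrative thresholds (not the criteria used in the engagement).
SYSTOLIC_THRESHOLD = 140      # mmHg
MIN_DRUG_CLASSES = 2          # "combination therapy"
MIN_ELEVATED_READINGS = 2     # "persistent"
PEDIATRIC_AGE = 18


@dataclass
class Patient:
    patient_id: str
    birth_date: date
    drug_class_starts: Dict[str, date]          # drug class -> first start date
    systolic_readings: List[Tuple[date, int]]   # (measurement date, mmHg)


def combination_start(p: Patient) -> Optional[date]:
    """Date the patient had started MIN_DRUG_CLASSES classes, else None.

    Simplification: discontinuations are ignored.
    """
    starts = sorted(p.drug_class_starts.values())
    if len(starts) < MIN_DRUG_CLASSES:
        return None
    return starts[MIN_DRUG_CLASSES - 1]


def is_persistent(p: Patient) -> bool:
    """True if enough elevated readings occur after combination therapy began."""
    start = combination_start(p)
    if start is None:
        return False
    elevated = [d for d, sbp in p.systolic_readings
                if d > start and sbp >= SYSTOLIC_THRESHOLD]
    return len(elevated) >= MIN_ELEVATED_READINGS


def age_on(birth: date, on: date) -> int:
    """Age in whole years on a given date."""
    return on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))


def flag(p: Patient, as_of: date) -> dict:
    persistent = is_persistent(p)
    return {
        "patient_id": p.patient_id,
        "persistent_hypertension": persistent,
        "pediatric": persistent and age_on(p.birth_date, as_of) < PEDIATRIC_AGE,
    }


cohort = [
    Patient("a", date(2010, 4, 2),
            {"ACE inhibitor": date(2023, 1, 5), "CCB": date(2023, 3, 1)},
            [(date(2023, 2, 1), 150), (date(2023, 4, 1), 148), (date(2023, 6, 1), 152)]),
    Patient("b", date(1970, 7, 9),
            {"ACE inhibitor": date(2022, 5, 1)},
            [(date(2022, 6, 1), 160), (date(2022, 9, 1), 158)]),
    Patient("c", date(1980, 1, 1),
            {"ARB": date(2022, 1, 1), "diuretic": date(2022, 2, 1)},
            [(date(2022, 3, 1), 128), (date(2022, 5, 1), 145)]),
]

flags = [flag(p, as_of=date(2024, 1, 1)) for p in cohort]
for row in flags:
    print(row)
```

Keeping the thresholds as named constants makes it easy to test how sensitive the cohort size is to each rule, which matters when every eligible patient counts.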

This kind of AI-powered, deep phenotyping—combining structured EHR, unstructured notes, and longitudinal outcomes—is only possible with deep, linked clinical data.

 

Not all RWD is created equal

Unlike other real-world data providers, Truveta delivers:

  • Representative scale: 120M+ patients across 900 hospitals and 20K clinics, representing 1 in 3 Americans.
  • Deep completeness: Diagnoses, labs, vitals, medications, 7B+ clinical notes, 100M+ images, and closed claims for 200M+ patients.
  • Timeliness: Daily-updated data ready to study yesterday’s care today.
  • Clinical richness: Structured + unstructured EHR data, plus SDOH, mortality, and soon genomics via the Truveta Genome Project.
  • AI-powered insights: Extracted concepts like seizure frequency, progression markers, and rare disease phenotypes from notes with precision, using the Truveta Language Model.

Ready to see how Truveta can advance your rare disease strategy?
Download our new rare disease briefing or contact us for a custom feasibility assessment.