The signal for cancer often exists years before the diagnosis. We simply have not been able to see it.

by Johnathan Lancaster, MD, PhD | May 5, 2026

A recent Mayo Clinic study (REDMOD, published in Gut) makes that point with unusual clarity.

The investigators retrospectively analyzed nearly 2,000 abdominal CT scans that had originally been interpreted as normal, drawn from patients who were later diagnosed with pancreatic cancer and from matched comparison patients.

The AI model detected subtle imaging signatures up to three years before clinical diagnosis. It identified 73% of pre-diagnostic cancers at a median of about 16 months prior to diagnosis, roughly twice the sensitivity of specialists reading the same scans without AI assistance. On scans acquired more than two years before diagnosis, the advantage was substantially greater.

The senior author, Dr. Ajit Goenka, framed it well: “The greatest barrier to saving lives from pancreatic cancer has been our inability to see the disease when it is still curable.”

What is striking is that this is a single modality. One imaging study, ordered for unrelated indications, residing in a patient’s record for years.

Now consider the rest of the longitudinal record. Laboratory values captured in primary care. New-onset diabetes (a well-recognized upstream signal in pancreatic cancer) and the medications initiated in response. Unexplained weight loss. Family history documented at one encounter and never integrated with the rest. In isolation, much of this appears unremarkable. In aggregate, it can begin to tell a coherent story long before a diagnosis is made.

The disease begins well before the healthcare system formally recognizes it. A substantial portion of the relevant data already exists. The problem is that it lives in fragments, across health systems, across care settings, disconnected from what ultimately happened to the patient. Patients move. Their longitudinal data does not.

This is the strategic argument for real-world data captured across the full continuum of care, not solely at the cancer center and not solely after diagnosis. If we are to understand the biology of disease progression, see the patient journey as it actually unfolds, advance early detection and prevention, and connect upstream signals to outcomes after treatment, we have to be able to see what was happening to patients in the years before they became patients with cancer.

These findings warrant appropriate caution. False positives, overdiagnosis, and the burden of unnecessary follow-up are legitimate concerns, and any tool of this kind requires prospective validation and careful clinical integration. But we cannot act on a signal we cannot see. Detection is the prerequisite for everything that follows.

The Mayo team has surfaced one signal that had been hiding in plain sight. There are almost certainly others, embedded across longitudinal patient data, waiting to be identified and connected.

I welcome perspectives from colleagues working across real-world data and evidence. Where do you think the next high-value pre-diagnostic signals will come from: imaging, labs, claims, or something else entirely?


