The Uneasy Rise of Real-World Comparators
- Adigens Health
- Nov 5
- 5 min read
Real-world evidence (RWE) has long promised to bridge the chasm between what works in trials and what happens in practice. Yet over the past few years, it has done something bolder: it has crossed into the realm of comparative evidence. The “real-world comparator,” once dismissed as an oddity of rare disease research, has quietly become one of the most contested frontiers in evidence generation. This topic was discussed during the recent Adigens Health webinar featuring Uwe Siebert, Miguel Hernán, and Radek Wasiak, a summary of which is presented below.
From Curiosity to Expectation
A decade ago, the idea that a new medicine might win approval or reimbursement based on an external control arm would have seemed far-fetched. Today, it is increasingly common. In September, JAMA published a review of 14 years of literature on external controls, identifying 180 published papers, nearly half in oncology. Regulators are now routinely citing RWE in their own reports: the European Medicines Agency (EMA) tripled its use of RWE in regulator-led studies, largely through the DARWIN platform, while the US Food and Drug Administration (FDA) continues to expand the number of approvals supported by natural history comparators. Recent examples include approvals in rare metabolic disorders such as aromatic L-amino acid decarboxylase deficiency and metachromatic leukodystrophy, where traditional randomization was simply impossible.
What was once an exception is now edging toward expectation. The drivers are partly scientific, partly pragmatic. Small populations, ethical barriers, and time pressures have forced sponsors to look beyond randomization. But the intellectual ground has shifted, too: there is a growing consensus that the real world is not the enemy of rigor, provided one knows how to emulate a trial rather than simply observe a cohort.
The Power of Emulation (Caveats Included…)
At the heart of this shift lies the idea of target trial emulation. It sounds abstract but is, in essence, a call for discipline. If one is going to make causal inferences from observational data, one must first specify the randomized trial one is trying to emulate: the eligibility criteria, interventions, comparators, follow-up, and outcomes. Done correctly, that pre-specification is what makes the analysis credible.
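To make those protocol elements concrete, here is a minimal sketch of how such a pre-specification might be written down before any analysis begins. The structure and every field value are illustrative assumptions, not a format prescribed by the webinar panelists or by any guidance document.

```python
from dataclasses import dataclass

@dataclass
class TargetTrialProtocol:
    """Illustrative pre-specification of the randomized trial being emulated.

    Field names and values are hypothetical; the point is that each element
    is fixed in writing before any comparative analysis is run.
    """
    eligibility: list[str]           # who would have been enrolled
    treatment_strategies: list[str]  # intervention and comparator strategies
    assignment: str                  # how randomization is emulated (e.g., adjustment set)
    time_zero: str                   # when eligibility, assignment, and follow-up are aligned
    outcomes: list[str]
    follow_up: str
    causal_contrast: str             # e.g., observational analogue of intention-to-treat
    analysis_plan: str

# Example instance with made-up criteria, for illustration only
protocol = TargetTrialProtocol(
    eligibility=["age >= 18", "newly diagnosed", "no prior exposure to either therapy"],
    treatment_strategies=["initiate therapy A", "initiate standard of care"],
    assignment="emulate randomization by adjusting for pre-specified baseline confounders",
    time_zero="date eligibility is met and a treatment strategy is initiated",
    outcomes=["overall survival"],
    follow_up="from time zero until death, loss to follow-up, or 24 months",
    causal_contrast="observational analogue of intention-to-treat",
    analysis_plan="outcome model with inverse-probability weighting, per protocol document",
)
```

Writing the protocol down as a single object, before touching the data, is what separates an emulation from a retrospective fishing exercise.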
For decades, observational studies were conducted without such explicit framing, producing results that were sometimes enlightening and sometimes misleading. The emulation framework, popularized by epidemiologists such as Miguel Hernán and James Robins, has forced the field to be more transparent about its assumptions. It is, as Professor Uwe Siebert noted, a “jump innovation”: a way to teach rigor, not invent it. Once an implicit craft, causal inference is now a discipline with a name, a structure, and, perhaps the biggest reason for the uptake, a growing following among regulators and health technology assessment (HTA) bodies alike.
Yet emulation brings its own discomforts. In the ideal scenario, it demands that sponsors design their external controls before they design their single-arm trials, a sequencing that few yet manage. Too often, companies design the experimental arm in isolation, only later asking analysts to “find a comparator.” The result, as Hernán observed, is only half a trial emulation. This asymmetry can undermine the comparability that regulators prize, and matching algorithms cannot fix what poor design has already broken.
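As an illustration of that sequencing, the sketch below applies the same (hypothetical) eligibility criteria and time-zero logic to the pooled trial arm and external cohort before any propensity-score step is run. The column names, criteria, and modelling choices are assumptions made for the example, not the method described by the panelists.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical columns: 'age', 'stage', 'prior_lines', and 'treated'
# (1 = single-arm trial patients, 0 = external cohort).

def apply_shared_design(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the *same* eligibility criteria to both arms.

    This design step has to precede any matching or weighting;
    the criteria below are illustrative only.
    """
    return df[(df["age"] >= 18) & (df["prior_lines"] == 0)].copy()

def estimate_propensity(df: pd.DataFrame, confounders: list[str]) -> pd.Series:
    """Estimate the probability of being in the experimental arm given baseline covariates.

    Weighting or matching on this score can only balance covariates measured
    comparably in both sources; it cannot repair a misaligned design.
    """
    model = LogisticRegression(max_iter=1000)
    model.fit(df[confounders], df["treated"])
    return pd.Series(model.predict_proba(df[confounders])[:, 1], index=df.index)

# Example usage on hypothetical pooled data:
# pooled = apply_shared_design(raw_pooled_data)
# pooled["ps"] = estimate_propensity(pooled, ["age", "stage", "prior_lines"])
```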
The Institutions Catch Up
The regulatory climate has evolved faster than many expected. The FDA, EMA, and Japan’s PMDA all now issue guidance recognizing that, under the right circumstances, real-world comparators can support or even replace traditional controls. But acceptance remains uneven. Regulators are cautiously open; health technology assessment bodies appear to be slower to follow.
The reason lies not in ideology but in mandate. Regulators must judge safety and efficacy; their decisions can accommodate uncertainty because the alternative is inaction (i.e., no new therapy). By contrast, HTA agencies must judge value. Their task is comparative, and comparisons based on observational data still make many of them uneasy.
Even within Europe, differences abound. NICE in the UK has gone furthest, embedding the target trial framework into its methods guidance and, for years, evaluating real-world comparators within cost-effectiveness models. Germany’s IQWiG and France’s HAS remain more guarded, preferring established hierarchies of evidence. Others, like Catalonia’s AQuAS, are experimenting with more agile approaches.
The upcoming European Joint Clinical Assessment (JCA), which will require early joint evaluations across member states, could force convergence. By compelling agencies to agree on the evidence base before national assessments, it may inadvertently accelerate the standardization of RWE methodologies. As Siebert put it, “Everyone will have to prepare earlier, think earlier, and guess better.” The JCA may prove to be the process that finally aligns regulators, payers, and methodologists around shared principles.
Data: The Double-Edged Sword
If methodology is half the story, data is the other. The past decade has seen an explosion in accessible real-world datasets, from claims records and registries to electronic health records and genomic repositories. Commercial platforms now trade in petabytes of information, promising insight at industrial scale. But not all data are created equal.
Claims databases, rich in transactions but poor in clinical nuance, can document treatment patterns yet miss the subtleties of disease. Electronic health records hold greater promise: they capture the context of care, the confounders that matter, and, increasingly with advances in AI, the information buried in clinician notes. The speakers predicted that the next leap in RWE will come from this frontier, where natural language processing can turn prose into analyzable evidence.
But scale invites risk. As more data are commercialized, academic researchers worry about access and independence. Siebert remains optimistic: Europe’s move toward trusted research environments (i.e., secure spaces for shared analysis) could make collaboration both faster and safer. The EU’s Health Data Space, if realized, might give researchers the infrastructure to emulate trials across borders rather than within silos.
Beyond the Hype
Every new paradigm has its hype cycle. RWE, branded more attractively than its predecessor “observational research,” has at times promised too much. Panelists agreed that the danger now is not underuse but overselling. The notion that real-world data can answer every question, supplanting the RCT entirely, should be dismissed quickly. Randomized trials remain the most efficient tool for eliminating bias when they are feasible; observational designs are indispensable when they are not. The challenge is to know when each is appropriate. The breakthrough in observational research is the growing agreement that it should follow the discipline of trial design. As Siebert put it, “We must dissect between good real-world evidence and bad real-world evidence, not worship it as a miracle weapon.”
In that spirit, the unease surrounding real-world comparators may be healthy. It reflects not rejection but scrutiny, the kind that pushes science forward. The conversation has already matured: from “Can we use real-world comparators?” to “How do we use them well?”