Table of Contents
- SBM vs. EBM: Same goal, different blind spots
- Why Cochrane matters (and why it sometimes drives people up a wall)
- Case study: Laetrile and the danger of a “more trials needed” reflex
- Case study: Touch therapies, Reiki, and the illusion of precision
- A little Bayes: the math that explains “prior plausibility” without turning into a philosophy seminar
- How to read a systematic review like a human (not a spreadsheet)
- The fair takeaway: Cochrane isn’t the villain; methodolatry is
- Conclusion
- Field Notes: Practical Experiences With Cochrane-Style Evidence and Bayesian Thinking
If you’ve ever finished a systematic review feeling like you just watched a detective movie where the final scene is,
“Well… we should probably do another detective movie,” you’ve already met the plot twist of modern evidence debates:
sometimes the method is so carefully followed that the question gets left standing outside in the rain.
This is where the long-running tug-of-war between evidence-based medicine (EBM) and
science-based medicine (SBM) gets interesting. Not “internet comment thread” interesting
(please drink water and step away from your keyboard), but genuinely important: how we decide what counts as
convincing evidence, what belongs in a systematic review, and when “more research” is a responsible conclusion
versus a polite way of keeping a bad idea on life support.
SBM vs. EBM: Same goal, different blind spots
EBM is at its best when it’s doing what it was built for: forcing clinicians and policymakers to stop relying on
vibe-based medicine (“I’ve always done it this way”) and instead lean on the most reliable clinical evidence we can
gather: ideally randomized controlled trials (RCTs), careful observational data, and systematic reviews that try to
summarize the whole landscape without cherry-picking.
SBM doesn’t replace that. SBM is more like EBM’s slightly annoying friend who keeps asking, “Okay, but does this
make sense in the real world of biology, chemistry, and physics?” In other words, SBM tries to formally
include what EBM sometimes treats informally: mechanism, prior plausibility, and the “external” evidence that lives
outside the RCT universe.
When EBM forgets those inputs, it can slip into a kind of procedural trance: if there aren’t enough RCTs, the answer
becomes “insufficient evidence,” which can sound like “maybe it works” even when the broader scientific picture is
screaming “no.”
Why Cochrane matters (and why it sometimes drives people up a wall)
The Cochrane brand became influential for a reason. Cochrane-style reviews are designed to be
systematic, transparent, and replicable: define the question, set inclusion criteria, search comprehensively,
assess risk of bias, and synthesize results with appropriate statistics. In a world where medicine has a long history
of confident wrongness, that structure is a public service.
But structure can become a cage if it’s applied without judgment. A Cochrane review can be technically impeccable
and still land on a conclusion that feels oddly disconnected from the scientific reality of the intervention, especially
for claims that are biologically implausible or already disconfirmed by earlier-stage evidence.
The “absence of RCTs” trap
There are many legitimate reasons an intervention might have few or no RCTs: it’s too new, too rare, too expensive,
hard to blind, or ethically tricky. But there’s also another reason: it didn’t do well in early studies, or it never
had a plausible mechanism to begin with, so it never earned a spot in the expensive, high-stakes Phase III arena.
This distinction matters because clinical trials aren’t supposed to start at the finish line. Especially in oncology,
large randomized trials usually happen only after earlier phases show enough promise to justify exposing people to risk.
When a review treats “no large RCTs” as a neutral gap, it can accidentally imply that the intervention is still a
reasonable candidate for late-stage testing.
Case study: Laetrile and the danger of a “more trials needed” reflex
The “Part IV, Continued” discussion in the SBM/EBM Redux storyline zeroes in on a classic example: Laetrile
(amygdalin), marketed for years as a cancer treatment. A Cochrane review once concluded that the literature
“identified the need” for randomized or controlled clinical trials. That sounds cautious. It also sounds like an open door.
The problem is that cancer drug development has a pipeline for a reason. If a drug is toxic, lacks meaningful activity in
earlier studies, or fails in a clinical trial designed to detect benefit, the ethical and scientific rationale for a large
randomized trial collapses. “Let’s randomize more people” is not automatically the virtuous choice; it can be clinical
wheel-spinning with real human cost.
In the Redux follow-up, a sharp “feedback” letter embedded in the review essentially makes that point: Phase III trials
are supposed to follow Phase I/II signals, and ignoring key clinical evidence (because it doesn’t match a preferred trial
design) can lead to recommendations that violate both logic and ethics.
Why this argument isn’t anti-EBM
This isn’t a call to replace trials with intuition. It’s a call to recognize that evidence has a lifecycle. If an intervention
fails to clear early hurdles, later hurdles aren’t the “gold standard”; they’re the wrong stadium.
The most uncomfortable implication is also the most useful: sometimes the evidence you need to stop is not “more RCTs.”
It’s an honest appraisal of biology, prior research, and what the existing clinical record already tells you.
Case study: Touch therapies, Reiki, and the illusion of precision
Another Redux example highlights how systematic review language can lend gravitas to claims that wobble on contact with
basic science. Reviews of “touch therapies” (including Reiki-style practices) can read like a high-tech flight manual:
heterogeneity statistics, subgroup analyses, sensitivity checks, dose-response explorations: lots of math, lots of decimal
points, and a vibe that suggests we’re narrowing in on the truth.
Yet the underlying premise can remain unproven or scientifically incoherent, and the clinical data can be a patchwork of
small studies vulnerable to bias, placebo effects, and publication bias. The risk here isn’t that statistics are “fake.”
It’s that statistics can be misassigned a job: they can measure variation in trial results, but they can’t conjure
plausibility out of thin air.
A memorable reality check comes from a well-known experiment often discussed in skeptical and EBM circles: when practitioners
claimed they could sense a human energy field, they performed no better than chance under controlled conditions. If a claimed
mechanism can’t even clear a basic detection test, a pile of underpowered clinical trials shouldn’t be treated like a thrilling
mystery. It’s not a mystery. It’s a lesson in how noise can cosplay as signal.
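A rough sketch of why a stack of small trials can produce "positive" findings with no real effect behind them. It assumes the trials are independent, the treatment truly does nothing, and each trial uses the conventional 0.05 significance threshold; the function name and the trial counts are illustrative and not taken from any review.

```python
def chance_of_false_alarm(num_trials, alpha=0.05):
    """Probability that at least one of several independent trials of a
    truly ineffective treatment reaches p < alpha by chance alone."""
    return 1 - (1 - alpha) ** num_trials


# Hypothetical trial counts, chosen only to show how the risk grows.
for n in (1, 5, 10, 20):
    share = chance_of_false_alarm(n)
    print(f"{n:>2} null trials: {share:.0%} chance of at least one 'significant' result")
```

Under those assumptions, twenty trials of an inert treatment have roughly a two-in-three chance of producing at least one nominally significant result, and that is before selective reporting enters the picture.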
A little Bayes: the math that explains “prior plausibility” without turning into a philosophy seminar
When people argue about plausibility, it can sound subjective, like we’re voting on what seems “reasonable.” Bayesian thinking
offers a cleaner frame: start with a prior probability (based on existing knowledge) and update it using
new evidence. The updated number is the posterior probability.
You don’t have to love Bayes to see why it matters. Medicine constantly asks questions like “How likely is this diagnosis now?”
after a test result, and Bayes is the logic underneath that update.
A quick diagnostic example (with numbers you can do on a napkin)
Imagine a screening test for a condition that affects 1% of a population. The test has 95% sensitivity and 95% specificity.
Sounds amazing, right? Now test 10,000 people:
- About 100 truly have the condition. With 95% sensitivity, ~95 test positive.
- About 9,900 do not. With 95% specificity, 5% (≈495) will still test positive.
Total positives: 95 true positives + 495 false positives = 590 positives. So the chance a positive result is a true case is
95/590 ≈ 16%. That’s not a math trick; it’s Bayes reminding you that base rates matter. The prior (pretest)
probability shapes what the evidence means.
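The napkin arithmetic above can be checked in a few lines. This is a minimal sketch using the same 1% prevalence, 95% sensitivity, and 95% specificity; the function name is just an illustrative choice.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Chance that a positive result is a true case, via Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.95, specificity=0.95)
print(f"Chance a positive result is a true case: {ppv:.1%}")  # about 16%
```

Rerunning it with a higher prevalence and the same test characteristics shows the other half of the lesson: the test did not change, but the meaning of a positive result did.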
So what does Bayes have to do with Cochrane-style reviews?
If a claim starts with a low prior probability (because it contradicts basic science or has repeatedly failed in credible tests),
then a modest clinical effect in a handful of small studies should not shift belief very far. You’d need evidence that is both
strong and robust enough to overcome the prior. This is one reason why “statistically significant” findings can still be
misleading when the underlying hypothesis is implausible or when the research environment is biased.
In Bayesian terms: the likelihood provided by small, heterogeneous trials is often too weak to drag a low prior into “probably
true” territory. But in purely frequentist reporting, the story can get simplified to “p<0.05, therefore… excitement,” and the
prior gets treated like an optional accessory.
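One way to make that concrete is the odds form of Bayes' rule: posterior odds equal prior odds multiplied by a Bayes factor (how much more likely the data are if the claim is true than if it is false). The sketch below is illustrative only. The priors and the Bayes factor of 10 are invented numbers, not estimates for any real therapy.

```python
def update_probability(prior, bayes_factor):
    """Update a prior probability using the odds form of Bayes' rule."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)


# Hypothetical inputs: the same moderately supportive evidence (Bayes factor 10)
# applied to a plausible claim and to a highly implausible one.
for label, prior in (("plausible claim", 0.5), ("implausible claim", 0.001)):
    posterior = update_probability(prior, bayes_factor=10)
    print(f"{label}: prior {prior:.1%} -> posterior {posterior:.1%}")
```

With these made-up inputs, the plausible claim ends up around 91% and the implausible one stays near 1%. The same data can be persuasive in one context and nearly irrelevant in another.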
How to read a systematic review like a human (not a spreadsheet)
Systematic reviews are powerful tools. They are also not magical truth vending machines. When the topic is controversial, or
when the intervention has a whiff of “this would require rewriting physics,” you can read smarter without needing a PhD in
biostatistics.
Checklist: questions to ask before you fall in love with a forest plot
- What question is being asked? A review can be perfectly designed to answer the wrong question.
- What evidence did they exclude? If only RCTs count, what happens to mechanistic evidence, Phase II data, or strong disconfirming studies?
- How big and how believable are the effects? Tiny effects in small trials are fragile; bias can mimic benefit.
- What’s the heterogeneity? High inconsistency (e.g., I²) suggests “these studies don’t agree,” not “the truth is hiding in the average.”
- Are the studies high risk of bias? Weak blinding, selective outcomes, and flexible analysis inflate apparent effects.
- Is publication bias likely? If negative studies disappear, the “evidence base” becomes a curated highlight reel.
- Who did the review? Method expertise matters, but so does domain expertise, especially in fields like oncology or pharmacology.
- Do the conclusions match the totality of evidence? “More research is needed” can be true, but it can also be a default setting.
- What would a Bayesian update look like? If the prior is very low, ask what kind of evidence would be required to meaningfully shift belief.
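For the heterogeneity question in the checklist, it can help to see what the I² number is actually built from. The sketch below computes Cochran's Q with inverse-variance weights and then I² as the share of Q beyond what chance alone would produce, max(0, (Q - df) / Q). The effect sizes and standard errors are made up for illustration, and real reviews typically rely on dedicated meta-analysis software rather than a hand-rolled function like this.

```python
def i_squared(effects, std_errors):
    """Return I² (as a percentage) for a set of study effect estimates."""
    weights = [1 / se ** 2 for se in std_errors]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0  # studies agree exactly; avoid dividing by zero
    return max(0.0, (q - df) / q) * 100


# Hypothetical data: five small trials with the same precision
# but effect estimates that point in different directions.
effects = [0.8, 0.1, -0.2, 0.6, 0.0]
std_errors = [0.2, 0.2, 0.2, 0.2, 0.2]
print(f"I² = {i_squared(effects, std_errors):.0f}%")
```

A value this high for the invented data is the numerical version of “these studies don’t agree.” It says nothing about why they disagree, which is where judgment about bias and plausibility comes back in.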
Frameworks like GRADE and the methods guidance used in U.S. evidence programs exist because the quality of evidence is not only
about design labels (“RCT” vs. “observational”). It’s about risk of bias, consistency, directness, precision, and whether the
entire body of evidence hangs together in a way that makes scientific and clinical sense.
The fair takeaway: Cochrane isn’t the villain; methodolatry is
The point of the Redux “more Cochrane and a little Bayes” argument isn’t to dunk on systematic reviews. It’s to resist a
particular failure mode: treating the method as the message.
Cochrane-style rigor is invaluable for many questions (especially “Does this work in people under real conditions?”). But when
the hypothesis is highly implausible, or when early and external evidence strongly argues against it, a rigid “RCT-only” lens can
produce conclusions that are technically cautious yet practically misleading.
The sweet spot is not “EBM or SBM.” It’s EBM done with SBM-level respect for biology, prior evidence, and ethical trial design.
That combination makes it harder for weak signals to masquerade as breakthroughs, and harder for “more research needed” to become
an automatic sequel.
Conclusion
“More Cochrane” is great when it means better methods, clearer bias assessment, and honest synthesis. “A little Bayes” is also
great when it reminds us that evidence doesn’t float in a vacuum. It lands in a world where prior knowledge exists, mechanisms
matter, and clinical trials are ethical commitments, not just statistical exercises.
If you want one line to keep: evidence is strongest when it’s interpreted in context. The RCT is a tool, not a
crown. The systematic review is a map, not the territory. And Bayes is the quiet reminder that the story starts before the first
p-value appears.
Field Notes: Practical Experiences With Cochrane-Style Evidence and Bayesian Thinking
Talk to enough clinicians, guideline writers, or research-literate patients, and you start hearing the same “how did we get here?”
stories, especially around treatments that live in the gray zone between “popular” and “proven.” The pattern is familiar: a new
therapy arrives with confident marketing, a handful of small trials, and a systematic review that ends with the most durable
sentence in medicine: more high-quality studies are needed.
In real-world discussions, that line behaves like a social inkblot. Enthusiasts hear, “See? The experts say it might work!”
Skeptics hear, “See? There’s no good evidence.” Meanwhile, the busy clinician hears, “This will not help me finish clinic on time.”
The experience is less about ideology and more about translation: what does a cautious academic conclusion mean in the decision
moment when a patient asks, “Should I try it?”
One practical lesson that comes up repeatedly is how often “the average effect” hides the real story. In journal clubs and committee
meetings, people will point to a pooled estimate and treat it like a verdict. But when you dig into the included trials, you find
different populations, different comparators, different outcome definitions, and different levels of bias control. Clinicians who
have lived through multiple “promising” interventions that later fizzled develop a healthy reflex: if the evidence looks fragile,
they ask how easily bias could reproduce it.
Bayesian thinking shows up here in a very down-to-earth way. Nobody has to say “posterior probability” out loud (and honestly, thank
you to everyone who doesn’t). Instead, people ask questions that are Bayesian in spirit: “Does this fit with what we know about
physiology?” “If this were true, would we see bigger effects?” “Why are results all over the place?” “Are there decent negative
studies that never made it to print?” It’s prior plausibility dressed in street clothes.
Another common experience is the ethical tension around trials. The public often imagines research as a neutral pursuit: try things,
see what happens. But clinicians involved in trials experience them as moral objects. Randomizing a patient isn’t just flipping a
coin; it’s a claim that equipoise is real: that we genuinely don’t know which path is better and that the question is worth the risk.
When a proposed study feels like it’s skipping the early evidence steps, discomfort isn’t “anti-science.” It’s scientific maturity
plus ethical memory.
Finally, there’s the communication problem: patients tend to want certainty, while honest evidence summaries often deliver nuance.
In practice, the best conversations don’t mock uncertainty; they label it. They separate “might help” from “probably helps,” and they
separate “safe enough to try” from “not worth the risk.” And when a systematic review ends with “more research needed,” the most
useful follow-up question is simple: more research for whom, at what cost, and with what prior reason to believe?