AI in healthcare: the problem is not hallucination, it is false confidence

PUBLISHED 17 hours ago
ORIGIN UK

Summary

Traditional large language models (LLMs) are extraordinarily useful. They can summarise, draft, explain, search, translate, simplify and accelerate work that previously sat in queues, inboxes and clinical admin backlogs. But we need to be brutally clear about what they are. They are not truth machines. They are language machines.

Content

My friend Herb Roitblat’s critique goes straight to the root of the issue. LLMs predict likely words. They do not, in their traditional form, represent truth. Roitblat’s framing is that probability and reinforcement can guide which tokens are selected, but this is not the same as the system knowing whether a proposition is true. Reliath’s position is even more direct: the problem is structural because the unit of analysis is the token, not the fact.[1]

That distinction matters everywhere. In healthcare, it matters more. A bad answer in marketing is embarrassing. A bad answer in healthcare can change a pathway, delay a diagnosis, distort a record, mislead a patient or create a false sense of clinical certainty.

The real problem: fluent nonsense at the point of trust

The danger with LLM hallucination is not simply that the model gets something wrong. People get things wrong all the time. The danger is that the model gets something wrong while sounding structured, fluent, balanced and authoritative. In healthcare, that is an especially toxic combination because patients often lack the knowledge to challenge the answer, and clinicians are already overloaded.

This is why hallucination is not just a technical bug. It is a trust failure. The World Health Organization (WHO) has warned that large multimodal models used in health can produce false, inaccurate, biased or incomplete statements, and that this can harm people when used for health decisions. It also highlights automation bias, where clinicians or patients overlook errors because the system appears authoritative.[2]

That is the strategic issue. Not whether AI can be useful. It clearly can. The issue is where we place it in the system, what level of authority we give it, and whether the output is grounded in verifiable facts or simply dressed in confident language.

Why healthcare makes the hallucination problem worse

Healthcare is not a clean data environment. It is full of abbreviations, conflicting notes, outdated pathways, local protocols, missing observations, patient-specific exceptions and subtle clinical context. A word like “negative” can be life-changing depending on where it sits.

A missing allergy can be catastrophic. A fabricated instruction in a discharge summary can move from screen to ward to patient before anyone has noticed.

Recent research into LLM-generated clinical notes found a 1.47% hallucination rate and a 3.45% omission rate across clinician-annotated sentences. That sounds low until you realise that 44% of hallucinated sentences were judged major, meaning they could affect diagnosis or management if left uncorrected.[3]

This is the healthcare problem in miniature. The percentages may look manageable. The consequences are not.
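To make those percentages concrete, here is a rough back-of-envelope sketch in Python. The two rates are the figures quoted above; the 50-sentence note length and the assumption that sentences fail independently are illustrative guesses of mine, not findings from the study.

```python
# Rates quoted above from the cited study [3].
HALLUCINATION_RATE = 0.0147  # share of sentences that were hallucinated
MAJOR_SHARE = 0.44           # share of hallucinated sentences judged major

# Assumption for illustration only: a hypothetical note length.
SENTENCES_PER_NOTE = 50


def major_rate_per_sentence(hallucination_rate: float, major_share: float) -> float:
    """Share of all sentences that are major hallucinations."""
    return hallucination_rate * major_share


def prob_note_has_major(per_sentence: float, sentences: int) -> float:
    """Chance of at least one major hallucination in a note.

    Assumes sentences fail independently, which is a simplification.
    """
    return 1.0 - (1.0 - per_sentence) ** sentences


per_sentence = major_rate_per_sentence(HALLUCINATION_RATE, MAJOR_SHARE)
print(f"Major hallucinations per sentence: {per_sentence:.4%}")
print(f"Roughly one sentence in {1 / per_sentence:.0f}")
print(
    f"Chance a {SENTENCES_PER_NOTE}-sentence note contains at least one: "
    f"{prob_note_has_major(per_sentence, SENTENCES_PER_NOTE):.1%}"
)
```

Under those assumptions, a small per-sentence rate compounds quickly over a whole note, and more quickly still over a ward's worth of notes. Change SENTENCES_PER_NOTE to match your own documents; the direction of the result does not change.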

Guardrails are not enough

A lot of AI strategy today is built around mitigation: use better prompts, add retrieval, add a guardrail, add a human in the loop, add a second model to check the first one. All of these can help. None of them changes the fundamental nature of a traditional LLM.

Herb’s challenge to the market is that guardrails often mask the problem rather than remove it. RAG can improve grounding, but it is still vulnerable to retrieval errors, source errors, chunking errors, interpretation errors and confident synthesis of the wrong material. Herb instead argues for shifting from tokens to factoids and facts, with “Truth Profiles” and logical or semantic representations designed to distinguish verified information from hypothesis or fabrication.

That is an important strategic shift. The goal is not better autocomplete. The goal is accountable intelligence.

What this means for AI in healthcare

Healthcare AI cannot just be plausible. It has to be auditable. It must show what it knows, where it got it from, what is uncertain, what is missing and what should not be inferred.

That means future healthcare AI systems need to separate four things that traditional LLMs often blur together: known facts, clinical interpretation, hypothesis and recommended action. Mix those up and you create danger. Keep them separate and you create a system clinicians can inspect, challenge and use.
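One way to picture that separation is as a data structure in which every statement carries an explicit label and, where relevant, a pointer to its source. The Python below is a minimal sketch under my own assumptions: the names Kind, Statement, unsupported_facts and group_by_kind are hypothetical, the example statements are invented, and none of this is Roitblat's "Truth Profiles" or any existing product's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    """The four categories named above."""
    KNOWN_FACT = "known fact"
    INTERPRETATION = "clinical interpretation"
    HYPOTHESIS = "hypothesis"
    RECOMMENDED_ACTION = "recommended action"


@dataclass(frozen=True)
class Statement:
    text: str
    kind: Kind
    source: Optional[str] = None  # hypothetical provenance pointer, e.g. a record ID


def unsupported_facts(statements: list[Statement]) -> list[Statement]:
    """Return statements labelled as known facts that carry no provenance."""
    return [s for s in statements if s.kind is Kind.KNOWN_FACT and not s.source]


def group_by_kind(statements: list[Statement]) -> dict[Kind, list[Statement]]:
    """Keep the four categories apart so each can be inspected on its own."""
    grouped: dict[Kind, list[Statement]] = {kind: [] for kind in Kind}
    for s in statements:
        grouped[s.kind].append(s)
    return grouped


# Invented example content, for illustration only.
statements = [
    Statement("Penicillin allergy recorded", Kind.KNOWN_FACT, source="allergy-list/entry-12"),
    Statement("No known drug allergies", Kind.KNOWN_FACT),  # no source given
    Statement("Rash may be drug related", Kind.HYPOTHESIS),
    Statement("Review antibiotic choice", Kind.RECOMMENDED_ACTION),
]

for s in unsupported_facts(statements):
    print(f"Claimed as fact without a source: {s.text}")
```

The point of the design is that a claim presented as fact but lacking a source becomes visible to the clinician, rather than blending into fluent prose alongside hypotheses and recommendations.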

If the system can only generate likely language, then it must be treated as an assistant. If it can represent propositions, provenance, uncertainty and truth values, it starts to become something closer to clinical infrastructure, subject, of course, to validation, regulation and real-world safety testing.

The strategic takeaway

AI will absolutely transform healthcare. But the winners will not be the organisations that adopt the most AI the fastest. They will be the organisations that understand where AI is safe, where it is dangerous, where it is merely impressive and where it is genuinely trustworthy.

The next phase of healthcare AI cannot be built on beautiful answers that may or may not be true. It has to be built on verifiable facts, clear provenance, explicit uncertainty and clinical accountability.

Because in healthcare, the question is not “can the AI answer?” The question is “can we trust what happens next?”

References


About the Author

Richard Jones writes and lectures on strategy and AI, and implements both in healthcare and beyond, from startups to corporates (richardjones.com). He is taking big swings at the challenges in healthcare (mission10x.net).
