Summary
Traditional large language models (LLMs) are extraordinarily useful. They can summarise, draft, explain, search, translate, simplify and accelerate work that previously sat in queues, inboxes and clinical admin backlogs. But we need to be brutally clear about what they are. They are not truth machines. They are language machines.
Content
My friend Herb Roitblat’s critique goes straight to the root of the issue. LLMs predict likely words. They do not, in their traditional form, represent truth. Roitblat’s framing is that probability and reinforcement can guide which tokens are selected, but this is not the same as the system knowing whether a proposition is true. Reliath’s position is even more direct: the problem is structural because the unit of analysis is the token, not the fact.[1]
That distinction matters everywhere. In healthcare, it matters more. A bad answer in marketing is embarrassing. A bad answer in healthcare can change a pathway, delay a diagnosis, distort a record, mislead a patient or create a false sense of clinical certainty.
The real problem: fluent nonsense at the point of trust
The danger with LLM hallucination is not simply that the model gets something wrong. People get things wrong all the time. The danger is that the model gets something wrong while sounding structured, fluent, balanced and authoritative. In healthcare, that is an especially toxic combination because patients often lack the knowledge to challenge the answer, and clinicians are already overloaded.
This is why hallucination is not just a technical bug. It is a trust failure. The World Health Organization (WHO) has warned that large multimodal models used in health can produce false, inaccurate, biased or incomplete statements, and that this can harm people when used for health decisions. The WHO guidance also highlights automation bias, where clinicians or patients overlook errors because the system appears authoritative.[2]
That is the strategic issue. Not whether AI can be useful. It clearly can. The issue is where we place it in the system, what level of authority we give it, and whether the output is grounded in verifiable facts or simply dressed in confident language.
Why healthcare makes the hallucination problem worse
Healthcare is not a clean data environment. It is full of abbreviations, conflicting notes, outdated pathways, local protocols, missing observations, patient-specific exceptions and subtle clinical context. A word like “negative” can be life-changing depending on where it sits.
A missing allergy can be catastrophic. A fabricated instruction in a discharge summary can move from screen to ward to patient before anyone has noticed.
Recent research into LLM-generated clinical notes found a 1.47% hallucination rate and a 3.45% omission rate across clinician-annotated sentences. That sounds low until you realise that 44% of hallucinated sentences were judged major, meaning they could affect diagnosis or management if left uncorrected.[3]
This is the healthcare problem in miniature. The percentages may look manageable. The consequences are not.
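A quick back-of-the-envelope calculation makes the point concrete. The sketch below combines the two figures quoted above (a 1.47% hallucination rate, with 44% of hallucinated sentences judged major) and then, purely as an illustration, estimates the chance that a single note contains at least one major hallucination. The 50-sentence note length and the assumption that sentences fail independently of each other are assumptions made here, not findings from the study.

```python
# Figures quoted above from the clinical-note study [3].
HALLUCINATION_RATE = 0.0147   # share of annotated sentences that were hallucinated
MAJOR_SHARE = 0.44            # share of hallucinated sentences judged major


def major_hallucination_rate(rate: float = HALLUCINATION_RATE,
                             major_share: float = MAJOR_SHARE) -> float:
    """Share of all sentences that are major hallucinations."""
    return rate * major_share


def chance_of_major_error(sentences_per_note: int,
                          per_sentence_rate: float) -> float:
    """Probability that a note has at least one major hallucination.

    ASSUMPTION (ours, not the study's): each sentence fails
    independently of the others.
    """
    return 1.0 - (1.0 - per_sentence_rate) ** sentences_per_note


if __name__ == "__main__":
    per_sentence = major_hallucination_rate()
    # HYPOTHETICAL note length, chosen only for illustration.
    note_length = 50
    print(f"Major hallucinations per sentence: {per_sentence:.4%}")
    print(f"Roughly one in every {1 / per_sentence:.0f} sentences")
    print(f"Chance a {note_length}-sentence note has at least one: "
          f"{chance_of_major_error(note_length, per_sentence):.1%}")
```

On the published figures alone, about 0.65% of sentences are major hallucinations, or roughly one in every 155. The note-level estimate depends entirely on the illustrative assumptions, but it shows how a small per-sentence rate compounds once documents get longer.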
Guardrails are not enough
A lot of AI strategy today is built around mitigation: use better prompts, add retrieval, add a guardrail, add a human in the loop, add a second model to check the first one. All of these can help. None of them changes the fundamental nature of a traditional LLM.
Herb’s challenge to the market is that guardrails often mask the problem rather than remove it. Retrieval-augmented generation (RAG) can improve grounding, but it is still vulnerable to retrieval errors, source errors, chunking errors, interpretation errors and confident synthesis of the wrong material. Herb instead argues for shifting from tokens to factoids and facts, with “Truth Profiles” and logical or semantic representations designed to distinguish verified information from hypothesis or fabrication.
That is an important strategic shift. The goal is not better autocomplete. The goal is accountable intelligence.
What this means for AI in healthcare
Healthcare AI cannot just be plausible. It has to be auditable. It must show what it knows, where it got it from, what is uncertain, what is missing and what should not be inferred.
That means future healthcare AI systems need to separate four things that traditional LLMs often blur together: known facts, clinical interpretation, hypothesis and recommended action. Mix those up and you create danger. Keep them separate and you create a system clinicians can inspect, challenge and use.
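One way to picture that separation is as a data structure in which every statement carries an explicit kind and, where available, a source. What follows is a minimal Python sketch under stated assumptions, not a description of any existing product or of Roitblat's design. The names `StatementKind`, `Statement`, `source`, `group_by_kind` and `unsupported_facts` are hypothetical, and the example content is invented.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StatementKind(Enum):
    """The four categories the text says must not be blurred together."""
    KNOWN_FACT = "known fact"
    INTERPRETATION = "clinical interpretation"
    HYPOTHESIS = "hypothesis"
    RECOMMENDED_ACTION = "recommended action"


@dataclass(frozen=True)
class Statement:
    text: str
    kind: StatementKind
    # HYPOTHETICAL provenance field: where the statement came from, if known.
    source: Optional[str] = None


def group_by_kind(statements: list[Statement]) -> dict[StatementKind, list[Statement]]:
    """Bucket statements so each category can be shown and reviewed separately."""
    groups: dict[StatementKind, list[Statement]] = {kind: [] for kind in StatementKind}
    for statement in statements:
        groups[statement.kind].append(statement)
    return groups


def unsupported_facts(statements: list[Statement]) -> list[Statement]:
    """Return anything labelled a known fact that has no source attached.

    ASSUMPTION (ours): a "fact" without provenance should be flagged for
    clinician review rather than displayed as established.
    """
    return [s for s in statements
            if s.kind is StatementKind.KNOWN_FACT and not s.source]


if __name__ == "__main__":
    # Invented example content, for illustration only.
    note = [
        Statement("Penicillin allergy recorded", StatementKind.KNOWN_FACT,
                  source="allergy list"),
        Statement("No known drug interactions", StatementKind.KNOWN_FACT),
        Statement("Symptoms may suggest infection", StatementKind.HYPOTHESIS),
        Statement("Review medication before discharge",
                  StatementKind.RECOMMENDED_ACTION),
    ]
    for kind, items in group_by_kind(note).items():
        print(f"{kind.value}: {[s.text for s in items]}")
    print("Needs review:", [s.text for s in unsupported_facts(note)])
```

The design choice worth noticing is that the label travels with the statement. A hypothesis cannot quietly become a fact just because it appears in a fluent paragraph, and a "fact" with no source is surfaced for review instead of being trusted by default.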
If the system can only generate likely language, then it must be treated as an assistant. If it can represent propositions, provenance, uncertainty and truth values, it starts to become something closer to clinical infrastructure, subject of course to validation, regulation and real-world safety testing.
The strategic takeaway
AI will absolutely transform healthcare. But the winners will not be the organisations that adopt the most AI the fastest. They will be the organisations that understand where AI is safe, where it is dangerous, where it is merely impressive and where it is genuinely trustworthy.
The next phase of healthcare AI cannot be built on beautiful answers that may or may not be true. It has to be built on verifiable facts, clear provenance, explicit uncertainty and clinical accountability.
Because in healthcare, the question is not “can the AI answer?” The question is “can we trust what happens next?”
References
[1] Roitblat H. The self-curation challenge for the future of AI. 9 March 2025.
[2] WHO. WHO releases AI ethics and governance guidance for large multi-modal models. World Health Organization, 18 January 2024.
[3] Asgari E, Montaña-Brown N, Dubois M, et al. A framework to assess clinical safety and hallucination rates of LLMs for medical text summarisation. npj Digital Medicine, 2025; 8: 274.
About the Author
Richard Jones writes and lectures on strategy and AI, and implements both, in healthcare and beyond, from startups to corporates (richardjones.com). He is taking big swings at the challenges in healthcare (mission10x.net).