Richard Jones
Everything posted by Richard Jones
-
Content Article
Traditional large language models (LLMs) are extraordinarily useful. They can summarise, draft, explain, search, translate, simplify and accelerate work that previously sat in queues, inboxes and clinical admin backlogs. But we need to be brutally clear about what they are. They are not truth machines. They are language machines.

My friend Herb Roitblat’s critique goes straight to the root of the issue. LLMs predict likely words. They do not, in their traditional form, represent truth. Roitblat’s framing is that probability and reinforcement can guide which tokens are selected, but this is not the same as the system knowing whether a proposition is true. Reliath’s position is even more direct: the problem is structural because the unit of analysis is the token, not the fact.[1]

That distinction matters everywhere. In healthcare, it matters more. A bad answer in marketing is embarrassing. A bad answer in healthcare can change a pathway, delay a diagnosis, distort a record, mislead a patient or create a false sense of clinical certainty.

The real problem: fluent nonsense at the point of trust

The danger with LLM hallucination is not simply that the model gets something wrong. People get things wrong all the time. The danger is that the model gets something wrong while sounding structured, fluent, balanced and authoritative. In healthcare, that is an especially toxic combination because patients often lack the knowledge to challenge the answer, and clinicians are already overloaded.

This is why hallucination is not just a technical bug. It is a trust failure. The World Health Organization (WHO) has warned that large multimodal models used in health can produce false, inaccurate, biased or incomplete statements, and that this can harm people when used for health decisions. It also highlights automation bias, where clinicians or patients overlook errors because the system appears authoritative.[2]

That is the strategic issue. Not whether AI can be useful. It clearly can. The issue is where we place it in the system, what level of authority we give it, and whether the output is grounded in verifiable facts or simply dressed in confident language.

Why healthcare makes the hallucination problem worse

Healthcare is not a clean data environment. It is full of abbreviations, conflicting notes, outdated pathways, local protocols, missing observations, patient-specific exceptions and subtle clinical context. A word like “negative” can be life-changing depending on where it sits. A missing allergy can be catastrophic. A fabricated instruction in a discharge summary can move from screen to ward to patient before anyone has noticed.

Recent research into LLM-generated clinical notes found a 1.47% hallucination rate and a 3.45% omission rate across clinician-annotated sentences. That sounds low until you realise that 44% of hallucinated sentences were judged major, meaning they could affect diagnosis or management if left uncorrected.[3]

This is the healthcare problem in miniature. The percentages may look manageable. The consequences are not.

Guardrails are not enough

A lot of AI strategy today is built around mitigation: use better prompts, add retrieval, add a guardrail, add a human in the loop, add a second model to check the first one. All of these can help. None of them changes the fundamental nature of a traditional LLM.

Herb’s challenge to the market is that guardrails often mask the problem rather than remove it. RAG can improve grounding, but it is still vulnerable to retrieval errors, source errors, chunking errors, interpretation errors and confident synthesis of the wrong material.

Herb instead argues for shifting from tokens to factoids and facts, with “Truth Profiles” and logical or semantic representations designed to distinguish verified information from hypothesis or fabrication. That is an important strategic shift. The goal is not better autocomplete. The goal is accountable intelligence.
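To make the idea of separating verified information from hypothesis or fabrication a little more concrete, here is a minimal Python sketch. It is not Roitblat's "Truth Profiles" design or any real product's API. The `Status` labels, the `Statement` record and the `usable_as_fact` helper are all hypothetical names invented for illustration, as are the example statements.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Status(Enum):
    # Hypothetical labels, chosen for this illustration only.
    VERIFIED = "verified"        # backed by a named source
    HYPOTHESIS = "hypothesis"    # plausible but unconfirmed
    UNSUPPORTED = "unsupported"  # no source found; possible fabrication


@dataclass
class Statement:
    text: str
    status: Status
    source: Optional[str] = None  # provenance, e.g. a record identifier


def usable_as_fact(statements: List[Statement]) -> List[Statement]:
    """Keep only statements that are verified AND carry a source."""
    return [
        s for s in statements
        if s.status is Status.VERIFIED and s.source is not None
    ]


# Made-up example statements, not taken from any real record.
statements = [
    Statement("Penicillin allergy recorded", Status.VERIFIED, "allergy-list/entry-12"),
    Statement("Possible early sepsis", Status.HYPOTHESIS),
    Statement("Discharged on drug X 50 mg", Status.UNSUPPORTED),
]
facts = usable_as_fact(statements)
```

The point of the sketch is only that status and provenance travel with each statement, so anything without a source can be held back from being presented as fact.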
What this means for AI in healthcare

Healthcare AI cannot just be plausible. It has to be auditable. It must show what it knows, where it got it from, what is uncertain, what is missing and what should not be inferred.

That means future healthcare AI systems need to separate four things that traditional LLMs often blur together: known facts, clinical interpretation, hypothesis and recommended action. Mix those up and you create danger. Keep them separate and you create a system clinicians can inspect, challenge and use.

If the system can only generate likely language, then it must be treated as an assistant. If it can represent propositions, provenance, uncertainty and truth values, it starts to become something closer to clinical infrastructure, subject of course to validation, regulation and real-world safety testing.

The strategic takeaway

AI will absolutely transform healthcare. But the winners will not be the organisations that adopt the most AI the fastest. They will be the organisations that understand where AI is safe, where it is dangerous, where it is merely impressive and where it is genuinely trustworthy.

The next phase of healthcare AI cannot be built on beautiful answers that may or may not be true. It has to be built on verifiable facts, clear provenance, explicit uncertainty and clinical accountability. Because in healthcare, the question is not “can the AI answer?” The question is “can we trust what happens next?”

References

1. Roitblat H. The self-curation challenge for the future of AI. 9 March 2025.
2. WHO. WHO releases AI ethics and governance guidance for large multi-modal models. World Health Organization, 18 January 2024.
3. Asgari E, Montaña-Brown N, Dubois M, et al. A framework to assess clinical safety and hallucination rates of LLMs for medical text summarisation. NPJ Digital Medicine, 2025; 8 (274).
Further blogs from Richard: The harsh interface between patient care and automation led to a highly avoidable death AI found to not speed up lung cancer diagnosis—AI alone is not enough -
Content Article
Keeping AI working
Richard Jones posted an article in Digital health regulatory bodies/standards/guidance
The most important healthcare AI story recently is not another model launch. It is governance. The American College of Radiology’s new imaging AI practice parameter matters because it asks the right question: not “does this AI work somewhere?” but “does this AI keep working here, for our patients, in our workflow, over time?” That is the real test for clinical AI.

- Applicability: Was the tool trained and validated for the patients, scanners, settings and clinical decisions where it will actually be used?
- Accuracy: Is performance monitored after deployment, not just at procurement? Does anyone know when the model drifts?
- Acceptability: Do clinicians trust it, understand its role and know when to override it? Do patients know when AI is involved?
- Accountability: Who owns the decision when AI flags, misses, prioritises or misclassifies?

This is where healthcare AI becomes serious. The future will not be won by the hospitals with the most algorithms. It will be won by the hospitals with the best operating model for safely using them.

The practical questions now are:

- Can we monitor AI like we monitor infection rates, readmissions or surgical outcomes?
- Can we make model drift visible before it becomes patient harm?
- Can we prove local value, not just vendor accuracy?
- Can we design AI systems clinicians actually accept because they are useful, safe and accountable?

That is the shift from AI hype to AI healthcare infrastructure.
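As a rough illustration of what "making model drift visible" could look like in practice, here is a minimal Python sketch. It assumes each reviewed case is logged as a pair (did the AI flag it, was the finding later confirmed) and compares recent sensitivity against a baseline. The function names, the 0.90 baseline and the 0.05 tolerance are assumptions for the example, not anything taken from the ACR practice parameter.

```python
from typing import List, Tuple

# Each case is (ai_flagged, confirmed_positive). Hypothetical log format.
Case = Tuple[bool, bool]


def sensitivity(cases: List[Case]) -> float:
    """True positives divided by all confirmed positives in the window."""
    flags_for_positives = [flagged for flagged, confirmed in cases if confirmed]
    if not flags_for_positives:
        raise ValueError("no confirmed positive cases in this window")
    return sum(flags_for_positives) / len(flags_for_positives)


def drift_alert(baseline: float, window: List[Case], tolerance: float = 0.05) -> bool:
    """Alert when recent sensitivity falls more than `tolerance` below baseline."""
    return sensitivity(window) < baseline - tolerance


# Illustrative window: 10 confirmed positives, of which the AI flagged 8,
# plus 20 confirmed negatives that the AI correctly left unflagged.
recent = [(True, True)] * 8 + [(False, True)] * 2 + [(False, False)] * 20
alert = drift_alert(baseline=0.90, window=recent)
```

A real monitoring programme would also need confidence intervals, enough cases per window and checks on specificity, but even a simple threshold like this turns "does anyone know when the model drifts?" into something that can be reviewed alongside other safety metrics.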
Content Article
A recent interesting study looking at AI tools to diagnose lung cancer highlights that AI does not change diagnosis speed. However, the care pathway was not changed and perhaps the most obvious finding is that care pathways must be optimised if AI is to highlight cases where specialists should take a second look. A large NHS trial found that using AI to flag abnormal chest X-rays for faster review did not meaningfully speed up lung cancer diagnosis overall. It did shorten the time for radiologists to report X-rays, but delays later in the pathway, such as CT scans, clinic appointments and follow-up processes, meant patients were not diagnosed sooner. The study analysed 93,326 chest X-rays across five NHS trusts and identified 558 lung cancer cases. Median time to diagnosis was 44 days with AI prioritisation versus 46 days without, which was not significant. Referral rates, treatment start times and cancer stage at diagnosis were also similar between groups. Researchers said the main problem is not image reporting but the wider NHS pathway. They highlighted one especially important finding: patients whose X-rays were flagged by AI but not by radiologists had much longer waits for diagnosis, suggesting this group may deserve closer study. The authors conclude that AI alone is not enough—improving outcomes would require redesigning the full care pathway so an AI alert triggers rapid follow-up actions like CT booking and specialist review. -
Content Article
Here is a real example from the US of why embedding patient safety can be so difficult. We assume that patient safety is something everyone cares about. But what happens when it goes up against cost imperatives? Patient safety is easiest to move forward, particularly with the Centers for Medicare & Medicaid Services (CMS) Transforming Episode Accountability Model (TEAM) initiative, when improved outcomes and safety equal cost reductions. However, even this is not a guarantee. For example: in one provider, a trial on an AI analytics package was done on a hospital and results showed, according to their own cost estimates (not the vendor's), a potential 10-20 million US dollars savings that would recur if they remained 'under control'. A 'no brainer' right? Clinicians liked it. A patient safety genius there (I'm labelling his abilities correctly) loved it. So why didn't it happen? There is no line item in the accounts for cost reduction. The finance team refused to believe it. They were under huge pressure and did not want to put their heads above the parapet so an accounting quirk led to no savings. This was potentially hundreds of millions of dollars of saving, demonstrable improvements in outcomes and protection against outside scrutiny and criticism... It still didn't happen. I'd like to say there is a happy ending. There isn't. There is a lesson. Engage all stakeholders in discussions and then, perhaps, you might make a bit more progress. However, institutional issues are going to continue to create havoc until outcomes are aligned. If revenue versus cost is the main metric (and it is in some provider systems), you'll continue to get strange decisions driven by potentially perverse incentives. -
Content Article
hub topic lead Richard Jones highlights an incident where the sepsis warning AI system failed to highlight a patient's deterioration and led to an avoidable death. I'll hide the location of this tragic story. A busy nurse was doing her evening rounds. The ward was short on staff and so the nurse took some observations and put them on her uniform as a Post-It note. She'd enter the data later. The patient had cancer and was heavily immunocompromised. The nurse got back around to the patient and took further observations. She then went to enter them in the system. The AI in the system had been trained to understand that two observations so close (in time) was an issue and so it ignored one. This meant it did not enter the details of the patient's vitals that showed the patient had an issue (sepsis). The patient was given an Amber alert status instead of a Red one. The next day the patient died. The nurse was not at fault. You could argue the system was not at fault. However, it lacked 'real-world' experience of how nurses operate. The learning point here? I'm not sure. Mindless reliance on systems to spot the things we miss is unhelpful but I have never regretted a conversation with a nurse regarding how they work and how they care.- Posted
-
- Never event
- Patient death
-
(and 3 more)
Tagged with:
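The post does not say how the system decided that two observations were "too close", or which one it discarded, so the following Python sketch is purely an assumption-laden illustration of one way such a rule can go wrong. It uses hypothetical field names (`measured_at`, `entered_at`), made-up vitals and a made-up five-minute window, and shows how a duplicate filter keyed on entry time can drop a reading that was actually taken hours apart from the first.

```python
from datetime import datetime, timedelta
from typing import Callable, List, NamedTuple


class Observation(NamedTuple):
    measured_at: datetime  # when the reading was taken at the bedside
    entered_at: datetime   # when it was typed into the system
    heart_rate: int


def dedupe(
    observations: List[Observation],
    key: Callable[[Observation], datetime],
    window: timedelta = timedelta(minutes=5),
) -> List[Observation]:
    """Drop any observation whose key time is within `window` of the last kept one."""
    kept: List[Observation] = []
    for obs in sorted(observations, key=key):
        if kept and key(obs) - key(kept[-1]) < window:
            continue  # treated as a duplicate and ignored
        kept.append(obs)
    return kept


# Made-up data: readings taken almost three hours apart, entered two minutes apart.
day = datetime(2024, 1, 1)
obs = [
    Observation(day.replace(hour=18, minute=0), day.replace(hour=21, minute=0), 92),
    Observation(day.replace(hour=20, minute=55), day.replace(hour=21, minute=2), 128),
]

by_entry_time = dedupe(obs, key=lambda o: o.entered_at)
by_measured_time = dedupe(obs, key=lambda o: o.measured_at)
```

In this toy example the filter keyed on entry time keeps only the first reading, while the one keyed on measurement time keeps both. That is the kind of gap between system logic and how nurses actually work that a conversation on the ward might surface.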
-
Content Article
Fascinating information in this graphic. What gets measured gets improved, but a 2024 Health Services Safety Investigations Body (HSSIB) investigation revealed that systematic underreporting of patient safety incidents involving general practitioner online consultation tools was occurring, and that the available data did not contain enough information to identify potential harm. From my own direct experience, unless you have risk-adjusted metrics for patient outcomes, the layer of incidents that are not flat out Never Events also remain hidden at scale. Patient safety work is still mainly at the tip of the iceberg!- Posted
-
- Near miss
- Never event
-
(and 2 more)
Tagged with:
-
Content Article Comment
Op-ed: Our patients deserve better safety reporting. AI could be the answer (27 February 2026)
Richard Jones commented on Patient Safety Learning's article in Artificial Intelligence
Hi Tejal There is a concern that at present, providers can't detect as much as 90% of avoidable harms. Where we report excess complications across different populations, we ignore the underlying comorbidities etc. Only by risk-adjusting for each patient can we detect that 90% and fix it. I know this works. I know the company went bust pushing the rock uphill to convince US healthcare that quality that improves costs as well is important. Thanks for sharing this information. -
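For readers less familiar with risk adjustment, a minimal Python sketch of the general idea follows. It uses a standard observed-to-expected ratio, which is a common technique and not necessarily the method used by the company mentioned above. The per-patient predicted risks are invented for illustration and would in reality come from a validated risk model.

```python
from typing import List


def observed_to_expected(outcomes: List[int], predicted_risks: List[float]) -> float:
    """Observed complications divided by the number the risk model expected."""
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("expected complications must be greater than zero")
    return sum(outcomes) / expected


# Two hypothetical units, each with 4 complications in 40 patients (10% raw rate).
outcomes = [1] * 4 + [0] * 36

low_risk_unit = observed_to_expected(outcomes, [0.05] * 40)   # healthier case mix
high_risk_unit = observed_to_expected(outcomes, [0.20] * 40)  # sicker case mix
```

Both units report the same raw complication rate, but the first has roughly twice the complications its case mix would predict and the second roughly half. That difference is invisible unless each patient's underlying risk is taken into account.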
Content Article Comment
Patient safety and the regulation of AI in healthcare
Richard Jones commented on Mark Hughes's article in Patient Safety Learning
- AI
- Digital health
-
(and 2 more)
Tagged with:
Hi Mark This is a super interesting area. A concern is that regulation globally is failing to keep up and the new 'health' models from the big AI players are playing right on the edge of being medical devices. I hope that lobbying and interested parties do not lower the bar on appropriate regulatory oversight.- Posted
- 1 comment
-
- AI
- Digital health
-
(and 2 more)
Tagged with:
-
Community Post
AI - the hype versus reality
Richard Jones replied to Richard Jones's topic in Artificial Intelligence
Thanks Theresa, Let me know if there is anything you think is a bit off centre or really hits the mark. Regards Richard
- 4 replies
-
Community Post
AI - the hype versus reality
Richard Jones replied to Richard Jones's topic in Artificial Intelligence
The assurance part is very complex indeed. The difference between deterministic and non-deterministic AI is fascinating. The non-deterministic is the greater challenge for regulation. I don't envy those trying to come up with effective solutions. A simple search on Google Bard on me suggests my MBA is from three different places in three different drafts. None are correct.- Posted
- 4 replies
-
Community Post
The latest stat I heard is that each hospital generates more information than the Library of Congress. That is meant to store all media created (although I think that excludes Tik Tok videos and social media). I don't have a timescale for this but, if true, it's pretty impressive and also somewhat intimidating.- Posted
- 2 replies
-
Community Post
AI - the hype versus reality
Richard Jones posted a topic in Artificial Intelligence
I'm already seeing some of this come true with big payors in the US going off the idea of 'point solutions'. A lot of different concepts in here that will be unpacked in different ways in the next few months but what do you think? AI Hype versus Reality in Healthcare 20230803.pdf- Posted
- 4 replies
-
Community Post
Projections indicate that there could be as much as 2,314 exabytes of new data generated in 2020. That’s 2,314 billion gigabytes of data. With a population of nearly 8 billion globally, that’s around 300 gigabytes of data per person per year. Is this realistic? How much of this data is being stored on phones and smartwatches, Fitbits etc.? So who has this data and how useful is it when it sits in a commercial company’s silo and does not complement health system’s own data? One simple truth - that volume of data requires collation, curation, contemplation (sorry - on an alliterative roll here).. but it really needs smart systems to convert it from data to wisdom. Are we on the right path or are we drowning in the data?- Posted
- 2 replies
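The per-person figure in the post can be checked with a few lines of Python. This assumes decimal units (1 exabyte = 1 billion gigabytes, as the post itself states) and rounds the population up to exactly 8 billion.

```python
EXABYTES_2020 = 2_314            # projection quoted in the post
GB_PER_EXABYTE = 10**9           # decimal units: 1 EB = 1 billion GB
WORLD_POPULATION = 8 * 10**9     # assumption: "nearly 8 billion" rounded up

total_gb = EXABYTES_2020 * GB_PER_EXABYTE
gb_per_person = total_gb / WORLD_POPULATION

print(f"{gb_per_person:.2f} GB per person per year")
```

That works out at about 289 GB per person per year, so "around 300" is a fair rounding, and a slightly smaller population figure would bring it closer still.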
-
Community Post
Want to know why AI can be tricky
Richard Jones replied to Richard Jones's topic in Artificial Intelligence
If ice cream and dalmatians are ever in a hospital context.. I want to be there.
- 3 replies
-
1
-
Community Post
Want to know why AI can be tricky
Richard Jones posted a topic in Artificial Intelligence
The classic dogs and muffins image has been beaten in my mind by this. How do you tell the difference between dalmatians and ice cream? Imagine how hard this will be for AI. The level of fine discrimination needed is why AI is not easy.
- 3 replies
-
1
-
Community Post
Data and permissions
Richard Jones replied to Richard Jones's topic in Artificial Intelligence
There are some companies working on the control of longitudinal patient records using blockchain. Can't believe I didn't drop that word in previously. Thanks for the thoughts. I'm generally aligned with your thoughts on use of my data.- Posted
- 2 replies
-
1
-
Community Post
Data and permissions
Richard Jones posted a topic in Artificial Intelligence
I think many of us in the industry are still wondering about access to data and who should have it. NHS Digital do a great job of protecting access to health records and, as one of the companies that has earned the right to access the national records, I can say it is a very rigorous process to maintain that privilege. More broadly we are seeing companies get in trouble for using data in the wrong way from individual hospitals, non-anonymised records being shipped (by mistake) to a company and other things that citizens in some countries (I'm looking at friends in Sweden) would find unacceptable. So who owns your data? Are you happy for it to be sold or just passed on to companies, or do you want the opposite end of the spectrum where you have control over it and sharing beyond your direct health providers requires your consent (and maybe.. whisper it quietly.. payment)? There is no one right answer but I'm fascinated by how we deal with data, and a swing-the-door-open policy that lets favoured companies get access to it willy-nilly doesn't seem like the smartest idea. But maybe I'm wrong...
- 2 replies
-
1
-
Community Post
Absolutely. Also there is the rush to apply things at present which perhaps erodes some of the safety processes. Your point is why I was involved in a project to deliver synthetic data to then test software against a dataset that would highlight the efficacy or otherwise of the results.- Posted
- 8 replies
-
Article Comment
Using Twitter to assess patient takes on patient experience
Richard Jones commented on Clive Flashman's news article in News
- Patient
- Social media
-
(and 3 more)
Tagged with:
Hi @Clive Flashman. I suspect many of us, when told not to look up an ailment online, do the exact opposite. The availability of information has changed in our lifetimes beyond all recognition. However, the quality of that information has also changed. Previously there were a limited number of experts and now we have sources at our fingertips. The danger is with misinformation and an inability to know what is correct and what is not. The vaxxer/anti-vaxxer argument is perhaps a prime example, as is the use of bleach and other products to combat Covid-19. However, I think patient involvement in their own care is vital and if patients can't learn about illnesses etc. themselves, it is incumbent on the clinicians to get them to a level of informed consent. I had a good experience recently where the doctor listened to my own ideas about how to deal with an issue and agreed it was sensible. The challenge will be to know what information is accurate and for clinicians to integrate that into discussions that are now done remotely in many cases and in time-poor situations. I'd suggest that social media platforms are not the best place for an unbiased view on life and death matters though. There are plenty of websites that specialise in quality medical content that might be better choices for peer-reviewed insights. Final thought, clinicians today are generally more friendly and open to discussion than in my younger days. The 'consultant is god' model seems to have gone but we're clearly not providing some patients with the darned good listening to that they need.
- 3 comments
-
1
-
- Patient
- Social media
-
(and 3 more)
Tagged with:
-
Community Post
I think there is potential to develop scenarios far quicker and more tailored to particular situations. So for example, you can create AI based images in bulk to show a clinician far more cases than they would normally see and build systems to keep people up to date and up to scratch. You can build subtler cases in bulk to help discrimination between different cases of an illness or disease. You can create synthetic data sets to test medical software and build in whatever bias you need to truly test something by packing the data with suitable case profiles while actual anonymised data may have only a handful. So there's lots of potential. But true AI software can modify itself and will not always give the same answer... so we need to be careful about the application and also remember the basics. Surgeon told me a story about a relative who had died and their x-ray. He asked the doctor who looked after his mother to comment on the x-ray which they did. Lots of comments on thumbprint marks etc. but actually completely failed to notice the name on the x-ray was not the name of the relative. It wasn't their x-ray. So as smart as we get.. we still need discipline and people like PSL helping staff set the right standards, do the right thing and be able to point out poor practice in a safe way.- Posted
- 8 replies
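As a small illustration of "packing the data with suitable case profiles", here is a Python sketch that builds a synthetic test set with a chosen proportion of a rare presentation. The record fields, the `make_synthetic_cases` helper and the 20% enrichment are all hypothetical, and real synthetic data for testing medical software would need clinically realistic features.

```python
import random
from typing import Dict, List


def make_synthetic_cases(n: int, rare_fraction: float, seed: int = 0) -> List[Dict]:
    """Build n fake records, with a fixed share marked as the rare presentation."""
    rng = random.Random(seed)  # seeded so the test set is reproducible
    n_rare = round(n * rare_fraction)
    cases = []
    for i in range(n):
        cases.append({
            "case_id": f"SYN-{i:05d}",
            "age": rng.randint(18, 95),
            "rare_presentation": i < n_rare,
        })
    rng.shuffle(cases)  # avoid all rare cases sitting at the start
    return cases


# Enrich the rare presentation to 20% of the test set.
test_set = make_synthetic_cases(n=1000, rare_fraction=0.2)
rare_count = sum(case["rare_presentation"] for case in test_set)
```

Fixing the seed means the same test set can be regenerated later, which helps when comparing results across software versions.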
-
1
-
Community Post
The wonderful team here at Patient Safety Learning think we need to talk about AI and the impact it can have on healthcare. So I'll be putting up a few topic starters in here but feel free to use this space and start your own conversations. AI means two things at the limit. It means software can change without instruction and the answers can sometimes change between a 'yes' or a 'no' for the same question. So how do we build safe, dependable applications that incorporate AI? How do we test them? How do we approve them? In the pandemic there is a rush to deploy solutions that is a commendable change of pace but at what cost? We should have authentic conversations here and I'm looking forward to discussing the topics above and many more with you. Richard Jones
- 8 replies
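One simple starting point for the testing question above is to ask the same question many times and measure how often the answers agree. The Python sketch below does that against two stand-in functions. Both `deterministic_stub` and `flaky_stub` are invented for illustration and do not call any real AI system.

```python
import random
from collections import Counter
from typing import Callable


def answer_agreement(ask: Callable[[str], str], question: str, runs: int = 20) -> float:
    """Fraction of runs returning the most common answer (1.0 = fully consistent)."""
    counts = Counter(ask(question) for _ in range(runs))
    return counts.most_common(1)[0][1] / runs


def deterministic_stub(question: str) -> str:
    """Stand-in for software that always answers the same way."""
    return "yes"


_rng = random.Random(42)


def flaky_stub(question: str) -> str:
    """Stand-in for a system that answers 'yes' most, but not all, of the time."""
    return "yes" if _rng.random() < 0.7 else "no"


stable = answer_agreement(deterministic_stub, "Is this dose safe?")
unstable = answer_agreement(flaky_stub, "Is this dose safe?")
```

An agreement score well below 1.0 on a yes/no safety question would be a clear signal that the application needs further controls before anyone relies on it.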
-
Content Article
As trusts consider clearing the waiting list, there is an absence of objective approaches to prioritisation. There are 40 million variations of operative type and the NHS elective waiting list may reach more than 10 million. A coronavirus second wave may cause further delays and expansion of the waiting list. This blog from hub topic lead Richard Jones describes a proven approach to prioritising the waiting list built around individualised risk-adjustment for each patient and evolved from the core POSSUM methodology that is widely used for individual risk assessment pre-operatively.

A significant backlog of elective surgical cases has built up during the COVID-19 crisis. The freeze on elective surgery has produced a waiting list that may take years to clear. In the US, the CDC has issued guidelines that "facilities should establish a prioritization policy committee consisting of surgery, anesthesia and nursing leadership to develop a prioritization strategy appropriate to the immediate patient needs". According to the CDC, this committee should work around 'objective priority scoring'.

The MeNTS (Medically-Necessary, Time-Sensitive Procedures) instrument is a clever attempt to deliver this scoring, responding to availability of resources and the situation around COVID-19. However, the key challenge is that the list needs to be prioritised in a way that reflects patient needs and ensures their safety. This is not something that MeNTS can deliver. It also is built around COVID-19 related limitations on resources and this will vary in significance depending on the hospital location and where it is in the journey out of lockdown.

The risks of mortality and complications for a patient are a complex combination of the severity of the procedure and the physiological variables of the patient. As an example, a 55-year-old undergoing a radical laparoscopic prostatectomy has a risk of mortality of 1.6%. However, if the patient has low blood pressure, that risk triples. If the patient also has low sodium then the risk is 10 times higher [C2-Ai insights].

The spectrum of different operations and key physiological variables creates at least 40 million potential combinations and hence risk. This is hard to manage with one patient but trying to prioritise a group of 5, 10, 100, 1,000 or even 10,000 becomes unmanageable. New patients will be joining the list while others leave following their procedures and so triage of the list will not be a one-off event. The list will need to be populated and triaged intelligently and in a consistent way repeatedly at least until there is a return to ‘normality’.

There is evidence that some trusts are attempting to build their own systems for prioritisation. This may be possible around matching operative type and resource availability but the efficiency of these systems overall should be a concern. Best intentions are fine but, when reviewed later, the ability to correctly prioritise patients to minimise harm and mortality is likely to be limited if not flawed.

C2-Ai’s COMPASS Surgical List Triage system is an example of a system that can support evidence-based triage and individualised risk assessment of patients, while supporting the objectives of the CDC. It supports clinical decision making across all phases from crisis back to steady state. It has been developed by the creator of the POSSUM system and is built around the world’s largest patient data set (140 million records from 46 countries) through the support of NHS Digital. The underlying algorithms are constantly refined against new and existing data sets to ensure relevance and accuracy.

The Surgical List Triage tool combines the mortality and complication risks from the different patients to derive the prioritisation. The system carries out bulk assessments using individualised risk assessments for each patient.
These reflect the operative type and their physiology to calculate the risk of mortality and complications, as well as providing a detailed breakdown of potential complications with percentage probability with a simple click. This system also suggests patients that should be reviewed for potential optimisation before any procedure. The physician can click on the link to see the detailed risks for the patient to support their decision making. The system can be used regularly to maintain the logic and integrity of the elective surgical list. This is superior to the potentially fragmented approach where parts of the list are manually considered in isolation as this cannot support effective optimisation of the whole list and the absence of any supporting evidence means the triage will vary enormously. COMPASS SLT is an evidence-based approach that supports optimal ordering of the list and clinical decision making that reduces avoidable harm and mortality. This in turn reduces variation, and cost while freeing bed capacity and also allowing the list to be tackled more quickly. When a patient comes in for the operation, an individual risk-assessment can be done using the COMPASS Pre-Operative Risk Assessment app. This provides a final check on whether the patient’s condition would justify optimising their condition before their procedure. However, it also details the most likely post-procedural complications individualised for the patient and their condition. That allows the treatment pathway to be tailored to that patient as well as recruiting the patient into their own recovery. For example, knowing that chest infection is the highest risk for a patient supports a conversation with them to stress the need for them to get up and about on the day of the operation. As an aside, the risk of mortality and complications can also be used as a strong element in showing informed consent has been obtained from the patient. 
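To give a feel for what ordering a list by individualised risk might involve, here is a deliberately simplified Python sketch. It is not the COMPASS algorithm, which is not described here. The weighting, the choice to list the highest combined risk first and the complication figures are all assumptions for illustration. Only the 1.6% baseline and the tripling come straight from the example above, and the 16% figure reads "10 times higher" as ten times the baseline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Patient:
    patient_id: str
    mortality_risk: float     # probability, 0 to 1
    complication_risk: float  # probability, 0 to 1 (illustrative values)


def combined_score(p: Patient, mortality_weight: float = 0.7) -> float:
    """Hypothetical weighted blend of mortality and complication risk."""
    return mortality_weight * p.mortality_risk + (1 - mortality_weight) * p.complication_risk


def triage(patients: List[Patient]) -> List[Patient]:
    """Order the list with the highest combined risk first, for clinical review."""
    return sorted(patients, key=combined_score, reverse=True)


BASELINE = 0.016  # 1.6% mortality risk from the worked example

waiting_list = [
    Patient("A", BASELINE, 0.10),       # no additional risk factors
    Patient("B", BASELINE * 3, 0.15),   # low blood pressure: risk triples
    Patient("C", BASELINE * 10, 0.30),  # plus low sodium: assumed 10x baseline
]
ordered_ids = [p.patient_id for p in triage(waiting_list)]
```

Even this toy version shows why manual triage struggles at scale: every patient needs their own risk figures before the list can be ordered, and the order has to be recomputed whenever patients join or leave.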
In combination, these tools can provide a platform to support effective and ongoing triage of the list while reducing harm and unnecessary costs. The systems are currently in use in 12 trusts in the NHS. How are you prioritising waiting lists? We'd be interested to hear and share how you and your trust are dealing with the backlog. Opinions expressed in blogs and other content are those of the author. Patient Safety Learning welcomes sharing content and opinions that promotes safer patient care and for the reduction of avoidable harm. The views expressed on the hub however do not necessarily represent Patient Safety Learning's views or values. References to a specific product or service does not imply a recommendation or endorsement.- Posted
-
1
-
- Recovery
- Pre-op period
- (and 5 more)
-
Content Article Comment
Great article and a very important topic Lorri. We have just been named as one of the '10 Digital Health Ideas for a UK National Covid-19 Response' by Healthcare UK (a joint initiative of NHS England, UK Departments of Health and International Trade) and it would be very good to discuss how patient safety approaches can make a big difference in the crisis. During the pandemic, we are deploying a risk-assessment tool, synthesised from our patient safety system and reductions in AKI of over 90% (published approach in BJN and winning an HSJ Patient Safety Award) and HAP by 60%. Long story short is that those patients acquiring these conditions are blocking beds for up to 8 days extra on average. Those beds are needed for Covid-19 patients and so reducing these conditions is a critical part of the patient safety vision you've supported for so long Lorri. A 50% reduction in these conditions in US hospitals would free enough capacity for an extra 67,000 C-19 patients in the next 3 months. Could you find time for a discussion? AKI HAP Overivew 002.pdf
- 3 comments
-
1