
How to Neutralize a Major Risk AI Poses to Your Reputation and Success
You’ve integrated AI into your operations to improve patient communications, streamline workflows, and enhance care delivery. But what happens if your AI tools generate fictional medical advice, incorrect appointment details, or inaccurate records of critical patient interactions?
Welcome to the world of “hallucinations.”
The Hallucination Spectrum – A System-Wide Challenge
To reap the rewards of an effective AI strategy, steps must be taken to neutralize the most profound risks. Hallucinations rank high on that list, with their potential to erode trust, compromise patient safety, and expose healthcare organizations to serious liability.
Recent examples show hallucinations – fabricated AI outputs – causing problems in:
- Clinical Decision Support: When a tool tasked with detecting a life-threatening condition is unreliable, the consequences for patient safety are direct and severe. A 2021 STAT investigation [i] raised serious questions about a sepsis prediction model. Interviews with data scientists and clinicians at multiple health systems revealed the algorithm “routinely fails to identify the condition in advance and triggers frequent false alarms.”
- Patient-Facing Communication: In April 2024, a study in npj Digital Medicine examined AI-powered reply tools in patient portals like MyChart. It found that up to 15% of AI-generated responses contained inaccuracies that could pose safety risks, including outdated medication advice or nonexistent follow-up instructions. For the 250 million people using these portals, a single “hallucinated” email can shatter trust.
- Administrative Documentation: OpenAI’s Whisper tool is being used to create EHR entries from transcribed conversations. In October 2024, an Associated Press (AP) investigation [ii] found instances of fabricated conversations, racial commentary, and even imagined medical treatments in its output. Despite the ramifications for legal accountability, the tool has been embedded in EHRs and marketed by numerous vendors.
The Whisper Effect – How Transcription Becomes Fiction
OpenAI’s Whisper is America’s most popular open-source speech recognition model. Built into everything from call centers to voice assistants, it offers a de facto case study on the perils of deploying AI in high-stakes domains.
Outputs like “thank you for watching” and “like and subscribe” may no longer pop up to enliven clinical meeting transcriptions, but the AP found Whisper hallucinations were not only commonplace but sometimes included “racial commentary, violent rhetoric and even imagined medical treatments.” A tool trained on roughly 680,000 hours of audio scraped from the web, much of it reportedly YouTube content, is bound to hark back to its roots at times.
Aware of these less savory propensities, OpenAI has warned against the tool’s use in “high-risk domains.” Regardless, over 30,000 medical workers have adopted it, and healthcare IT companies have embedded it as their AI transcription service, tasked with converting exam room conversations and other provider documentation directly into EHR entries.
Nabla, a medical tech company, integrated Whisper into its AI copilot service, which is used by 40 health systems, including Children’s Hospital Los Angeles. Although Nabla admits its tool hallucinates, it prevents medical staff from verifying transcript accuracy against source material by erasing the original audio recordings “for data safety reasons.”
When AI Goes Rogue – From Patient Harm to Legal Jeopardy
Imagine a post-op hip replacement patient receiving an email incorrectly notifying them of a stage 4 lung cancer diagnosis. Or picture a sepsis patient whose life-threatening condition was missed by an algorithm that also triggered frequent false alarms, as the 2021 STAT investigation revealed.
When patient trust and safety are compromised by technology known to be error-prone, the stakes extend beyond clinical errors into significant legal and financial risk. In 2017, a precedent-setting $155 million settlement with EHR vendor eClinicalWorks (eCW) centered on the company knowingly selling a faulty product. That case established a principle pertinent to today’s AI-enabled systems: if inaccurate medical information is used in treating Medicare or Medicaid patients, healthcare organizations not only face clinical consequences but can be held liable for defrauding the government. The risk is compounded by AI’s opacity – organizations often have no audit trail to explain why their systems generated specific outputs.
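What such an audit trail could contain is not exotic. The sketch below is a generic, hypothetical illustration – not tied to any vendor’s product – and the field names and the `audit_entry` helper are invented for this example. It captures the minimum provenance needed to trace a specific entry in the record back to a specific model, input, and human reviewer.

```python
# Hypothetical sketch of an audit record for AI-generated content in an EHR.
# Field names and structure are illustrative, not any vendor's actual schema.
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional

def audit_entry(model_name: str, model_version: str,
                input_text: str, output_text: str,
                reviewed_by: Optional[str] = None) -> dict:
    """Build one audit record linking an AI output to its input and reviewer."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        "model_version": model_version,
        # Hash the input so the exact source can be matched later without
        # copying protected health information into the audit log itself.
        "input_sha256": hashlib.sha256(input_text.encode("utf-8")).hexdigest(),
        "output_text": output_text,
        "human_reviewer": reviewed_by,  # None means no person ever verified it
    }

entry = audit_entry("transcription-model", "v3.2",
                    "raw exam-room transcript ...",
                    "Patient reports mild post-operative hip pain.",
                    reviewed_by="RN J. Smith")
print(json.dumps(entry, indent=2))
```

Even a log this simple gives an organization something concrete to point to when asked why a specific entry exists.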
Why ‘Smarter’ AI Is Not the Answer
The AP report charitably suggests: “Researchers aren’t certain why Whisper and similar tools hallucinate.” In fact, the cause is well understood.
Whisper and large language models (LLMs) like GPT-4 are transformer-based predictors. They don’t “think” or “know” facts. They make educated guesses based on statistical patterns learned from vast training datasets, rather than from true comprehension and fact-checking.
When Whisper can’t make an accurate prediction from contextual information – a muffled or ambiguous stretch of a doctor-patient conversation, for example – it falls back on what it “knows” from its training data – a boatload of web audio, much of it reportedly YouTube content. The result can be a hallucination: an output fabricated to fill the gap.
Because hallucinations arise from the inherent nature of the predictive model itself, they are not a bug developers can simply fix. Their severity and incidence can, however, be dialed back.
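One practical lever is human-in-the-loop review gated by the model’s own confidence signals. The sketch below is a minimal illustration, assuming the open-source whisper package; the audio file name and the cutoff values are illustrative only and would need tuning before any real deployment.

```python
# A minimal sketch: transcribe with conservative decoding settings and flag
# low-confidence segments for human review instead of writing them straight
# into the record. File name and thresholds are illustrative.
import whisper  # open-source openai-whisper package

model = whisper.load_model("base")

result = model.transcribe(
    "exam_room_visit.wav",            # hypothetical source audio
    temperature=0.0,                  # greedy decoding: less creative, less prone to drift
    condition_on_previous_text=False  # don't let earlier errors snowball into later segments
)

for seg in result["segments"]:
    # Segments the model itself was unsure about are the likeliest places for fabrication.
    suspect = (
        seg["avg_logprob"] < -1.0          # low token-level confidence (illustrative cutoff)
        or seg["no_speech_prob"] > 0.5     # model suspects there was no speech at all
        or seg["compression_ratio"] > 2.4  # highly repetitive output, a common hallucination tell
    )
    tag = "REVIEW" if suspect else "OK"
    print(f"[{tag}] {seg['start']:.1f}-{seg['end']:.1f}s: {seg['text'].strip()}")
```

Note that this kind of check only works if the source audio is retained; a vendor that deletes the recording, as described above, removes the very thing a reviewer would verify against.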
The Foundational Fix – Data Integrity
Attempting to anchor AI in chaotic, unverified EHR data is the architectural equivalent of building a skyscraper on a swamp, explains CureIS CEO Chris Sawotin.
“AI hallucinations are a symptom of data chaos,” Sawotin says. “The industry is focused on chasing algorithmic fixes, but you can’t solve a data integrity problem with a better algorithm. A pristine data foundation places an AI strategy on solid ground from the outset, minimizing the incidence and severity of hallucinations.”
Every dollar invested in AI without first addressing foundational data integrity amplifies risk. The imperative is simple: a clean, conformed, and validated dataset is the foundation of a successful AI strategy.
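To make “clean, conformed, and validated” concrete, the sketch below shows the kind of gate a record might have to clear before any AI system sees it. It is a generic illustration only – not CureIS’s UniSync implementation – and the field names, code checks, and rules are hypothetical.

```python
# A generic sketch of pre-AI data validation. Field names, code sets,
# and rules are hypothetical stand-ins for real integrity checks.
from datetime import date

REQUIRED_FIELDS = {"patient_id", "encounter_date", "provider_npi", "diagnosis_codes"}

def validate_record(record: dict) -> list[str]:
    """Return a list of integrity problems; an empty list means the record passes."""
    problems = []

    # Completeness: every required field must be present.
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")

    # Conformance: dates must parse and cannot be in the future.
    try:
        encounter = date.fromisoformat(str(record.get("encounter_date", "")))
        if encounter > date.today():
            problems.append("encounter_date is in the future")
    except ValueError:
        problems.append("encounter_date is not a valid ISO date")

    # Validation: diagnosis codes must at least be well-formed (illustrative stub).
    for code in record.get("diagnosis_codes", []):
        if not code.replace(".", "").isalnum():
            problems.append(f"malformed diagnosis code: {code}")

    return problems

record = {"patient_id": "P-1001", "encounter_date": "2024-11-03",
          "provider_npi": "1234567890", "diagnosis_codes": ["E11.9"]}
issues = validate_record(record)
print("clean" if not issues else issues)
```

Records that fail such checks are quarantined for remediation rather than silently passed along – the discipline that keeps a predictive model from papering over gaps with fabrications.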
[i] Ross, C. (2021, July 26). Epic’s AI algorithms, shielded from scrutiny by a corporate firewall, are delivering inaccurate information on seriously ill patients. STAT News.
[ii] Burke, G., & Schellmann, H. (2024, October 4). Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said. Associated Press.
Ready to Move Your AI Strategy to Solid Ground?
CureIS’s UniSync™ architecture automates data unification and validation, providing an AI-ready foundation that supports your vision. Connect with a CureIS expert today to see how we can de-risk your AI initiatives and ensure your technology delivers on its promise.


