
Medical Interpreting: Where AI Makes Sense. Where It Doesn't.

Procuring AI interpretation for healthcare isn't an all-or-nothing decision. The honest question is which encounters AI handles best and which ones still belong to a human.

Eyal Heldenberg

Co-founder and CEO, building No Barrier

Last Updated: July 1, 2026


A 47-year-old patient who speaks Haitian Creole arrives at a community health center for a pre-surgical consult. The clinic's telephone interpretation service can route Spanish in under a minute and Mandarin in two. Haitian Creole takes roughly twelve to fifteen minutes, and only if a qualified interpreter is on the platform that day. The consult is twenty minutes long. The patient has waited three weeks for the appointment. The surgeon has eleven more patients on the schedule.

This is the decision point where AI interpretation in healthcare either earns its place or doesn't, and it is not hypothetical. It is a procurement question that 1,400+ FQHCs are answering right now.

1. When does AI interpretation in healthcare actually fit?

The honest answer requires separating two things vendors usually fuse together: the clinical complexity of the encounter and the operational reality of getting language access at the moment it's needed. AI is strong on the second axis and increasingly strong on the first. But the fit is not uniform.

1.1 Routine, high-volume encounters

Intake. Vitals. Medication reconciliation. Discharge instructions. Pre-op screening questionnaires. Pharmacy pickups. Post-op follow-up calls. Scheduling. These encounters dominate the volume of any health system's language access need.

A community health center that runs 200 LEP encounters a day and pays per-minute for telephonic interpretation is paying twice: once in dollars and once in clinician time waiting on hold. AI removes the second cost entirely and reduces the first by an order of magnitude. That is not a marketing claim. It is what the math does when you remove queue time from the encounter. For these encounters AI is the preferred option because it is instant.
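
The arithmetic behind the "paying twice" point can be made explicit. The sketch below is illustrative only. Apart from the 200 encounters a day used in the text, every number (queue minutes, talk minutes, per-minute rate, clinician cost) is a hypothetical placeholder, and it assumes the vendor bills only for interpreted minutes, not for time on hold. Substitute your own contract and staffing figures.

```python
from dataclasses import dataclass


@dataclass
class TelephonicCost:
    """Daily cost of per-minute telephonic interpretation (hypothetical inputs)."""
    encounters_per_day: int
    talk_minutes: float               # interpreted minutes per encounter
    queue_minutes: float              # minutes the clinician waits for an interpreter
    rate_per_minute: float            # vendor rate in dollars (placeholder)
    clinician_cost_per_minute: float  # loaded clinician cost in dollars (placeholder)

    def interpretation_dollars(self) -> float:
        # First cost: what the vendor invoices for interpreted minutes.
        return self.encounters_per_day * self.talk_minutes * self.rate_per_minute

    def waiting_dollars(self) -> float:
        # Second cost: clinician time spent on hold, priced at clinician cost.
        return self.encounters_per_day * self.queue_minutes * self.clinician_cost_per_minute

    def clinician_hours_on_hold(self) -> float:
        return self.encounters_per_day * self.queue_minutes / 60


# Placeholder values, not benchmarks.
example = TelephonicCost(
    encounters_per_day=200,
    talk_minutes=10.0,
    queue_minutes=3.0,
    rate_per_minute=1.0,
    clinician_cost_per_minute=2.0,
)
print(example.interpretation_dollars())   # invoiced dollars per day
print(example.waiting_dollars())          # dollars of clinician time on hold per day
print(example.clinician_hours_on_hold())  # clinician hours on hold per day
```

With queue time set to zero, the waiting term disappears, which is the second cost the paragraph describes. The size of the first-cost reduction depends on pricing that this sketch does not model.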


1.2 Spanish, with dialect depth

Spanish is the largest LEP language in U.S. healthcare and also the best resourced. A handful of well-staffed health systems run in-house Spanish medical interpreter programs, and when those programs are well-credentialed and dialect-aware, they are an excellent answer for Spanish. That is the ideal. It is not the majority. Most systems do not have in-house Spanish interpreters at the volume their LEP population actually generates, and the ones that do still hit two failure modes. The first is simultaneous demand, where three encounters need an interpreter at the same minute and only one or two are available. The second is dialect mismatch, where the available interpreter speaks Mexican Spanish and the patient speaks Dominican or Puerto Rican Spanish. Mexican, Dominican, Puerto Rican and Cuban Spanish carry distinct vocabulary for body parts, symptoms and medications. A flat Spanish interpretation, accurate at the level of words, can still fail at the level of meaning.

This is where AI interpretation in healthcare fits cleanly. One click, instant interpretation, matched to the patient's regional variant, available at the same minute across every encounter running in the building. The in-house interpreter is freed for the encounters where a human is non-negotiable. AI handles the volume that would otherwise wait, escalate, or settle for the wrong dialect.


1.3 Other major non-Spanish languages

Mandarin and Cantonese. Vietnamese. Arabic. Russian. Portuguese. Haitian Creole. Korean. Tagalog. These are the languages that, after Spanish, dominate U.S. healthcare interpretation encounters, per AMN Healthcare's tracking of 45 languages in patient-provider encounters. Almost no health system runs compliant in-house medical interpreters in any of them.

The default is telephonic interpretation, which means the encounter starts with a queue, the interpreter is a voice on a phone with no encounter context and the same patient sees a different interpreter at every touchpoint along the journey. Intake, follow-up, discharge, pharmacy pickup, telehealth check-in. Five encounters, five different interpreters, five different interpretations of the same medication protocol. The clinical record is inconsistent because the interpretation was inconsistent.


AI is consistent by definition. The same platform, the same dialect handling, the same terminology, across every encounter the patient has. That consistency is not a feature. It is the difference between AI and human interpretation.


1.4 Rare and ultra-rare languages

Marshallese. K'iche'. Mam. Tigrinya. Karen. Hmong dialects. These are interesting cases because AI capability for rare languages depends on the same scarce resources that make qualified human interpreters hard to find in the first place: native speakers, trained linguists, transcribed corpora and usable data. Where the data exists, AI delivers point-of-care interpretation in languages a phone service cannot reliably staff at all. Where the data does not yet exist, neither AI nor the telephonic network has a good answer today. The trajectory is clear: as data collection and model training improve for low-resource languages, AI coverage extends. The honest framing for procurement is that rare-language capability is a frontier, not a finished product, and any vendor claiming uniform quality across 200+ languages is overstating what the underlying data can currently support.

In discovery calls, we are often asked about exactly these rare languages, the ones that surface on the frontline in an FQHC or refugee health program in any given week. They deserve all of our attention, always. What we also tell those buyers is that covering the actual volume, the top 10 and even the top 5 U.S. LEP languages, with real consistency, dialect depth and audit-grade quality, is itself a long road, and most platforms have not yet finished walking it. The frontier and the volume are two different problems.

And when a vendor advertises "200+ languages", the operational reality is usually mixed delivery: some languages handled by AI and the rest still relying on the traditional human telephonic flow, with its queue times and consistency gaps. The number is real. The uniformity it implies is not. A procurement question worth asking in every demo: which of your 200+ languages are AI-delivered end-to-end and which are routed to a phone-based human interpreter behind the scenes?

1.5 Triage and after-hours

The 2 a.m. ED call from a Vietnamese-speaking parent. The Saturday-morning nurse line. The Sunday telehealth visit. These are encounters where human interpreter availability collapses and patient need does not. AI interpretation in healthcare is, in this slice, the only option that scales with demand. The question is not whether it is better than a human interpreter. The question is whether it is better than no interpretation at all or, worse, the ad-hoc family-member workaround that Section 1557 generally prohibits.


2. Where does AI interpretation not belong?

Equally honest: there are encounter categories where AI should not be the default and a credible vendor should say so out loud.

2.1 End-of-life conversations and goals-of-care discussions

The vocabulary is not the hard part of these conversations. The hard part is silence, pacing, cultural framing of mortality and a clinician's ability to read a family in the room. A qualified human medical interpreter is doing something AI cannot yet do in this setting. They are interpreting affect, not just words. For a goals-of-care conversation, the right answer is a human interpreter who can be in the room (in person or video) and who has the training to handle bereavement. No procurement framework that treats this encounter as equivalent to a medication reconciliation will survive contact with the bedside.


2.2 Encounters defined by emotional weight, not by department

Oncology is the obvious example, but the real category is encounters where the interpretive load is emotional, not technical: suicidal ideation, miscarriage, serious diagnoses and difficult treatments. These conversations are not located in a department. They happen wherever the patient decides to speak. The vocabulary is rarely the hard part. The hard part is emotion, pauses, code-switching, affect, cultural framing of shame and the interpreter's own training in trauma-informed practice. This is why the hybrid model exists. The patient or clinician should be able to request a human interpreter the moment the encounter shifts into emotional territory.

2.3 Informed consent for high-risk procedures

Section 1557 sets the floor for meaningful language access. The standard of care for informed consent on a high-risk procedure sits well above that floor. When a patient is being asked to consent to a procedure with a non-trivial mortality risk, the legal and clinical defensibility of that consent rests partly on the interpreter's qualifications. AI may be present in the room. It should not be the only thing in the room.


2.4 The patient's and the provider's choice

The patient has the right to choose. If the patient does not want AI medical interpretation, for any reason at all, whether it is unfamiliarity with the technology, discomfort with AI in a clinical setting or simply a personal preference, the patient has the right to a human interpreter. No explanation required.

The provider in the room is in charge. If at any point in the encounter the clinician sees that the conversation is not landing well, whether the issue is comprehension, register, emotional weight or anything else, the clinician has the authority to call in a human interpreter.

AI medical interpretation is a tool and part of a broader flow whose only purpose is to guarantee optimal communication between patient and provider. AI is not an end in itself. The moment it stops serving the encounter, it steps aside.
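
The rules in Sections 2.1 to 2.4 can be read as a simple routing policy. The following is a minimal sketch of that reading, not a description of any vendor's implementation. The category labels, the `Encounter` type and the `choose_mode` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    AI = "ai"
    HUMAN = "human"


# Encounter categories from Section 2 where a human interpreter is the default.
# These label strings are hypothetical; a real system would derive them from
# its own scheduling or EHR data.
HUMAN_DEFAULT = frozenset(
    {"end_of_life", "goals_of_care", "emotionally_heavy", "high_risk_consent"}
)


@dataclass
class Encounter:
    category: str
    patient_requests_human: bool = False
    clinician_requests_human: bool = False


def choose_mode(enc: Encounter) -> Mode:
    # Patient choice and clinician judgment come first, with no reason required.
    if enc.patient_requests_human or enc.clinician_requests_human:
        return Mode.HUMAN
    # Categories where AI should not be the default interpreter.
    if enc.category in HUMAN_DEFAULT:
        return Mode.HUMAN
    return Mode.AI


print(choose_mode(Encounter("medication_reconciliation")))
print(choose_mode(Encounter("medication_reconciliation", patient_requests_human=True)))
print(choose_mode(Encounter("goals_of_care")))
```

The ordering is the design choice that matters: the override checks run before any category logic, so no encounter type can block a patient or clinician from getting a human interpreter.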


3. How should a hybrid AI-human interpretation model actually work in practice?

The architecture matters more than the marketing. A platform that can route between AI and human interpreters has to do four things and a CMIO evaluating vendors should ask about each one explicitly.

3.1 Patient-initiated human escalation, at any moment

The patient, not just the clinician, must be able to request a human interpreter without explanation. This is the core of the hybrid model. Human choice is a right, not an escalation pathway.

3.2 Uncertainty flagging by the AI itself

When the AI's confidence in a translation drops below a threshold, for example on ambiguous terminology, the system should surface a flag in real time and the clinician should see it before continuing. This is the difference between a transcription product and an interpretation product. Section 1557 does not require zero errors. It requires meaningful access, which includes the system telling the truth about its own confidence.
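
As a rough illustration of what threshold-based flagging could look like, the sketch below assumes the interpretation engine exposes a per-segment confidence score between 0 and 1. That assumption, the `Segment` type, the 0.85 threshold and the sample scores are all hypothetical. Real systems may score confidence differently or not expose it at all, which is worth asking in a demo.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold; the right value would come from clinical validation.
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class Segment:
    source_text: str
    translated_text: str
    confidence: float  # assumed to be reported by the engine, in [0, 1]


def flag_uncertain(
    segments: List[Segment], threshold: float = CONFIDENCE_THRESHOLD
) -> List[Segment]:
    """Return the segments a clinician should review before continuing."""
    return [s for s in segments if s.confidence < threshold]


# Sample data with made-up scores. "Intoxicado" is a commonly cited example of
# a term whose meaning can differ from the English "intoxicated".
segments = [
    Segment("Take one tablet twice a day.", "Tome una tableta dos veces al día.", 0.97),
    Segment("Me siento intoxicado.", "I feel intoxicated.", 0.42),
]

for seg in flag_uncertain(segments):
    print(f"REVIEW: {seg.source_text!r} -> {seg.translated_text!r} ({seg.confidence:.2f})")
```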


3.3 Pre-encounter language and risk-level routing

The provider should know, before the encounter starts, what kind of conversation is about to happen. A routine medication reconciliation in Spanish is not the same as a goals-of-care conversation in Haitian Creole.

3.4 Encounter-level audit trails

Every encounter, whether AI, human or hybrid, generates a record. Who interpreted, what the AI confidence levels were, whether escalation happened. Section 1557 compliance is not a vendor's assurance; it is the auditability of the encounter when someone asks for it eighteen months later. A platform that cannot show you per-encounter logs in the demo cannot deliver them in the audit.
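
To make "per-encounter log" concrete, here is a minimal sketch of one record carrying the three items named above: who interpreted, what the AI confidence was, and whether escalation happened. The field names and the JSON shape are assumptions for illustration. They are not a regulatory schema and not any vendor's actual log format.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class EncounterAuditRecord:
    encounter_id: str
    language: str
    interpreter_type: str             # "ai", "human" or "hybrid"
    min_confidence: Optional[float]   # lowest AI confidence seen; None if human-only
    escalated_to_human: bool
    started_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable for later comparison.
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical identifiers and values.
record = EncounterAuditRecord(
    encounter_id="enc-001",
    language="Haitian Creole",
    interpreter_type="hybrid",
    min_confidence=0.71,
    escalated_to_human=True,
    started_at="2026-07-01T14:00:00+00:00",
)
print(record.to_json())
```

A record like this only helps in an audit if it is written for every encounter, including the human-only ones, and retained for as long as a review might reach back.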


4. What does this mean for procurement?

The procurement question is not "do we buy AI interpretation or stay with phone-based human interpretation." That framing was over by 2024. The procurement question now is whether the platform you choose has the architecture to handle the full range of encounter types, including the ones where it should hand off. Vendors selling AI-only are selling a product that fails Section 1557 by design in the encounter categories above. Vendors selling human-only are selling a product whose unit economics will not survive the volume any modern health system actually runs. The hybrid model is not a compromise. It is the only design that maps onto how clinical interpretation actually works.


5. What is the honest answer for healthcare leaders right now?

AI interpretation in healthcare is the right default for the majority of encounters running through your system this week. It is also the wrong default for a small, identifiable, clinically critical set of encounters where the cost of a flat interpretation is high enough that a human interpreter has to be in the loop from the start. Any vendor who tells you the line between those two categories is fixed, or simple, or solved, is selling you the wrong product. The line moves with the language, the encounter type, the clinician and the patient's own preference. The platform's job is to make that line easy to honor.


If you are evaluating AI interpretation in healthcare for an FQHC, a hospital system or a community health network and want to see what the hybrid model looks like at the point of care, book a demo.
