KNK Global — Independent human evaluation for conversational AI

Case study

Where a clinic booking agent loses the appointment.

A representative Voice Agent evaluation of an inbound appointment line for a multi-clinic group. Illustrative throughout — it shows the depth of a real evaluation, not a named client result.

Engagement · Voice agent (ILLUSTRATIVE)

System evaluated: Inbound appointment voice agent

Domain: Multi-clinic healthcare booking

Reporting period: Single evaluation cycle

Channel: Voice · inbound

The challenge (ILLUSTRATIVE)

The agent books cleanly when a caller knows exactly what they want. The open question was how it holds up under real call behavior — mid-call corrections, spelled-out names, and callers who stack a second request onto a booking.

Scope

160 human-tested calls · 4 caller profiles · 8 scenarios · English

Performance

Performance scorecard · Voice rubric (ILLUSTRATIVE)

Intent Recognition: 4.1 / 5

Accent Handling: 3.4 / 5

Response Accuracy: 3.9 / 5

Context Retention: 2.8 / 5

Conversation Quality: 3.6 / 5

Weighted overall: 3.6 / 5 — Acceptable, with one limiting weakness.
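The page does not state the rubric weights behind the overall figure. As a rough sanity check only, the sketch below assumes equal weights (an assumption, not the published method) and shows that the five dimension scores then average to 3.56, which rounds to the reported 3.6.

```python
# Dimension scores taken from the scorecard above.
scores = {
    "Intent Recognition": 4.1,
    "Accent Handling": 3.4,
    "Response Accuracy": 3.9,
    "Context Retention": 2.8,
    "Conversation Quality": 3.6,
}

# ASSUMPTION: equal weights. The real rubric weights are not given in the source.
weights = {name: 1.0 for name in scores}


def weighted_overall(scores, weights):
    """Weighted mean of the dimension scores."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


overall = weighted_overall(scores, weights)
print(round(overall, 1))  # 3.6
```

Changing the values in `weights` shows how strongly a low Context Retention score would pull the overall figure down under a different weighting.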

Diagnosis

Top failure themes (ILLUSTRATIVE)

Context lost after a correction · 34% of failures

Caller: Actually, make that Thursday, not Tuesday.

Agent: Great, I have you booked for Tuesday at 2pm.

Impact. The agent reverts to the original date after a mid-call correction. The caller leaves the call booked on the wrong day.

Recommendation. Treat any correction as the new source of truth and re-confirm the changed value before closing.
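One way to read this recommendation is as a rule on the booking state: the latest value always overwrites the earlier one, and any changed value blocks the close until it has been confirmed again. The sketch below is illustrative only; `BookingState` and its methods are hypothetical names, not the evaluated agent's implementation.

```python
from dataclasses import dataclass, field


@dataclass
class BookingState:
    """Hypothetical slot store for one call; not the evaluated agent's code."""
    slots: dict = field(default_factory=dict)
    unconfirmed: set = field(default_factory=set)

    def set_slot(self, name, value):
        # The latest value always wins, so a correction replaces the original.
        if self.slots.get(name) != value:
            self.unconfirmed.add(name)  # a changed value must be re-confirmed
        self.slots[name] = value

    def confirm(self, name):
        self.unconfirmed.discard(name)

    def can_close(self):
        # The call may only close once every changed value has been confirmed.
        return not self.unconfirmed

    def confirmation_prompt(self):
        parts = [f"{name} {self.slots[name]}" for name in sorted(self.unconfirmed)]
        return "Just to confirm: " + ", ".join(parts) + ". Is that right?"


state = BookingState()
state.set_slot("day", "Tuesday")
state.set_slot("time", "2pm")
state.confirm("day")
state.confirm("time")

state.set_slot("day", "Thursday")   # "Actually, make that Thursday, not Tuesday."
print(state.can_close())            # False
print(state.confirmation_prompt())  # Just to confirm: day Thursday. Is that right?
```

The design choice is that closing is gated on `unconfirmed` being empty, so the agent cannot announce a booking while a corrected value is still unverified.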

Spelled names misheard · 27% of failures

Caller: My last name is Nguyen, N-G-U-Y-E-N.

Agent: I have that as Wynn. Is that correct?

Impact. Spelled-out input is miscaptured under accent variation, forcing repeated correction and stretching the call.

Recommendation. Read spelled input back letter by letter for confirmation before saving it to the record.
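A minimal sketch of the read-back step follows, assuming the speech layer already delivers the spelled input as text such as "N-G-U-Y-E-N". The function names and the `record` dictionary are hypothetical, chosen for illustration.

```python
def parse_spelled(spelled):
    """Turn 'N-G-U-Y-E-N' (or 'n g u y e n') into a list of uppercase letters."""
    return [ch.upper() for ch in spelled if ch.isalpha()]


def readback(letters):
    """Build the letter-by-letter confirmation prompt."""
    return "I have " + ", ".join(letters) + ". Is that correct?"


def save_surname(record, letters, caller_confirmed):
    # Only write to the record after the caller has confirmed the read-back.
    if caller_confirmed:
        record["surname"] = "".join(letters).capitalize()
    return record


letters = parse_spelled("N-G-U-Y-E-N")
print(readback(letters))  # I have N, G, U, Y, E, N. Is that correct?
record = save_surname({}, letters, caller_confirmed=True)
print(record)             # {'surname': 'Nguyen'}
```

Note that `capitalize()` is a simplification; names with internal capitals would need the caller's own casing.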

Second request dropped · 21% of failures

Caller: Can I book for next week and also check if you take my insurance?

Agent: Sure, what day works for the appointment?

Impact. The insurance question is silently dropped. The caller has to phone back for the half they did not get.

Recommendation. Detect the second intent in the turn and confirm both will be handled before proceeding.
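The recommendation can be illustrated with a toy detector that returns every intent found in a turn rather than only the first. A production agent would more plausibly use a trained intent model; the keyword lists and intent names below are assumptions made purely for illustration.

```python
# ASSUMPTION: hypothetical intents and keywords, for illustration only.
INTENT_KEYWORDS = {
    "book_appointment": ("book", "appointment", "schedule"),
    "insurance_check": ("insurance", "coverage"),
}

INTENT_LABELS = {
    "book_appointment": "book your appointment",
    "insurance_check": "check your insurance",
}


def detect_intents(utterance):
    """Return every intent whose keywords appear in the turn, not just the first."""
    text = utterance.lower()
    return [
        intent
        for intent, keywords in INTENT_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    ]


def acknowledgement(intents):
    """Confirm that both requests will be handled before proceeding."""
    if len(intents) < 2:
        return None
    return "I can help with both: I'll " + " and then ".join(
        INTENT_LABELS[i] for i in intents
    ) + "."


turn = "Can I book for next week and also check if you take my insurance?"
intents = detect_intents(turn)
print(intents)                   # ['book_appointment', 'insurance_check']
print(acknowledgement(intents))
```

Acknowledging both intents up front gives the caller a chance to object if one was misheard, and gives the agent a list to work through instead of a single active task.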

Sample failure log (ILLUSTRATIVE)

| Call ID | Issue type        | Severity | Description                                                  |
|---------|-------------------|----------|--------------------------------------------------------------|
| H-031   | Flow breakdown    | MAJOR    | Lost the appointment date after a mid-call correction.       |
| H-052   | Wrong intent      | CRITICAL | Treated an insurance question as a booking request.          |
| H-068   | Capture error     | MAJOR    | Saved a spelled surname incorrectly under accent variation.  |
| H-094   | Multi-intent miss | MINOR    | Dropped the second request when two were stacked in one turn. |

An excerpt of the per-call log — every finding carries a reproducible Call ID, a failure type, and a severity on the 4-band scale.

Resolution

Severity distribution · 4-band (ILLUSTRATIVE)

Critical: 4 · 5%

Major: 13 · 18%

Minor: 26 · 35%

Observations: 31 · 42%

74 findings total — scored on the Voice rubric and ranked by severity, the same way every evaluation reports.
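The percentages in the distribution follow directly from the band counts. The short check below recomputes them from the figures above, assuming each share is rounded to the nearest whole percent.

```python
# Band counts taken from the severity distribution above.
counts = {"Critical": 4, "Major": 13, "Minor": 26, "Observations": 31}

total = sum(counts.values())

# Share of all findings per band, rounded to the nearest whole percent.
shares = {band: round(100 * n / total) for band, n in counts.items()}

print(total)   # 74
print(shares)  # {'Critical': 5, 'Major': 18, 'Minor': 35, 'Observations': 42}
```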

Improvement priorities (ILLUSTRATIVE)

Fix context retention after corrections

Targets the lowest dimension (2.8) and the single biggest source of wrong bookings. Highest expected lift.

Harden spelled-name capture

Reduces the repeated-correction friction that stretches calls and frustrates callers.

Add multi-intent detection on booking

Recovers the secondary questions that are currently dropped mid-booking.

Management summary (ILLUSTRATIVE)

Overall: 3.6 / 5 (Acceptable). The agent books reliably on clean, single-intent calls but loses accuracy as soon as a caller corrects a detail or stacks a second request. Context retention is the limiting factor. With the three priority fixes in place, it is ready for a supervised pilot on the main booking line; it is not yet ready to run insurance-related or multi-step calls unattended.

Prepared by KNK Global · evaluation services

See this on your own booking line.

A pilot returns an evaluation in this exact shape, scored on your live voice agent under real callers.
