Voice MCP doesn't ask you to configure agents or pick personas in a dashboard. Your AI host passes voice: "Polly.Joanna-Neural" as a parameter to make_call — same shape as any other tool argument. Pick a different voice on the next call. No state, no provisioning, no roster.
Voice MCP routes the voice argument through Twilio's <Say> verb. That gives you the full Polly and Google neural catalog: roughly 30 languages and 70 voice IDs. The four below are the ones our test suite covers; everything else in the Polly catalog works without any code change on our side.

"Hi, this is calling about your appointment Tuesday at 3pm…"
voice on make_call, this is what your AI gets. Highest answer-rate of the four in informal testing.
voice: "Polly.Joanna-Neural"

"Matthew calling on behalf of Acme, two minutes about your Q3 spend."
voice: "Polly.Matthew-Neural"

"السلام عليكم، أنا بتصل من المطعم عشان موعدك بكره…"
language: "ar-EG" on transcribe_call.
voice: "Polly.Hala-Neural"

"Hola, le llamo desde la clínica para confirmar su cita…"
Polly.Mia (es-MX), Polly.Penelope (es-US).
voice: "Polly.Lucia-Neural"

The full Polly catalog (Portuguese, French, German, Mandarin, Hindi, Italian, Japanese, Korean, and ~20 more) works without changing anything on Voice MCP's side. Polly voice list →
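To make the voice-to-<Say> path concrete, here is a minimal sketch of how a host might pick one of the four tested voice IDs and what the resulting TwiML could look like. The helpers pickVoice, escapeXml, and buildSayTwiml are hypothetical names for illustration and are not part of Voice MCP. The language-prefix mapping and the Joanna fallback are assumptions, and the TwiML that Voice MCP actually emits may differ.

```typescript
// The four voice IDs the text above says the test suite covers.
const TESTED_VOICES = [
  "Polly.Joanna-Neural",
  "Polly.Matthew-Neural",
  "Polly.Hala-Neural",
  "Polly.Lucia-Neural",
] as const;

type TestedVoice = (typeof TESTED_VOICES)[number];

// Assumption: a language-prefix mapping chosen for illustration only.
const VOICE_BY_LANGUAGE = new Map<string, TestedVoice>([
  ["en", "Polly.Joanna-Neural"],
  ["ar", "Polly.Hala-Neural"],
  ["es", "Polly.Lucia-Neural"],
]);

// Hypothetical helper: map a tag like "es-MX" to a tested voice, else use the fallback.
function pickVoice(
  languageTag: string,
  fallback: TestedVoice = "Polly.Joanna-Neural",
): TestedVoice {
  const prefix = languageTag.toLowerCase().split("-")[0] ?? "";
  return VOICE_BY_LANGUAGE.get(prefix) ?? fallback;
}

// Escape the five XML special characters so the prompt cannot break the markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: render a <Say> element with the chosen voice.
function buildSayTwiml(prompt: string, voice: string, language?: string): string {
  const langAttr = language ? ` language="${escapeXml(language)}"` : "";
  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<Response><Say voice="${escapeXml(voice)}"${langAttr}>` +
    `${escapeXml(prompt)}</Say></Response>`
  );
}

const voice = pickVoice("es-MX");
const twiml = buildSayTwiml("Hola, le llamo desde la clínica para confirmar su cita.", voice);
console.log(twiml);
```

Because the voice is only a string argument, switching to any other catalog voice means passing a different ID; the mapping above is a convenience on the host side, not something Voice MCP requires.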
There's no persona arg on Voice MCP. The "persona" of a call is the system prompt your AI host wires up — that's its job, not ours. The patterns below are suggestions you can paste into your prompt; they're not a Crixin abstraction.
"Ask three open questions before any pitch. Don't recommend until you understand what they need."
"Read back the appointment details, confirm or reschedule, hang up. Don't upsell."
"Apologize for the cold call. Ask permission for 30 seconds. Invite them to hang up. Counter-intuitively converts."
"Greet the caller. Find out what they need. Hand off to a human via send_sms for anything outside your knowledge base."
Your AI host invokes make_call as a tool. Voice MCP turns the args into a Twilio REST call. No agent state, no provisioning, no roster — just a tool call with a voice parameter.
// MCP tool invocation (what your AI host sends)
{
  "name": "make_call",
  "arguments": {
    "to": "+15555550100",
    "prompt": "Hi, calling to confirm your 3pm appointment tomorrow.",
    "voice": "Polly.Joanna-Neural",
    "language": "en-US",
    "tag": "appointment-reminders"
  }
}

// Or in TypeScript via the library (skips MCP, same primitives)
import { TwilioClient, makeCall, loadEnv } from "crixin/voice";

const client = new TwilioClient(loadEnv());
await makeCall(client, {
  to: "+15555550100",
  prompt: "Hi, calling to confirm your 3pm appointment tomorrow.",
  voice: "Polly.Joanna-Neural",
  language: "en-US",
});
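If you are wiring up an MCP client by hand, the tool invocation above travels inside a JSON-RPC 2.0 request with method "tools/call". The sketch below shows that envelope. The MakeCallArgs interface is an assumption inferred from the example arguments (which fields are optional is a guess), and buildMakeCallRequest and its E.164 check are hypothetical conveniences, not part of Voice MCP.

```typescript
// Assumed argument shape, inferred from the example above; optionality is a guess.
interface MakeCallArgs {
  to: string;
  prompt: string;
  voice?: string;
  language?: string;
  tag?: string;
}

// JSON-RPC 2.0 envelope for an MCP tool call.
interface ToolsCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: MakeCallArgs };
}

// Hypothetical helper: validate the destination number, then wrap the args.
function buildMakeCallRequest(id: number, args: MakeCallArgs): ToolsCallRequest {
  // E.164: a leading "+", then up to 15 digits, the first of which is non-zero.
  if (!/^\+[1-9]\d{1,14}$/.test(args.to)) {
    throw new Error(`"to" must be an E.164 number, got: ${args.to}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "make_call", arguments: args },
  };
}

const request = buildMakeCallRequest(1, {
  to: "+15555550100",
  prompt: "Hi, calling to confirm your 3pm appointment tomorrow.",
  voice: "Polly.Joanna-Neural",
  language: "en-US",
  tag: "appointment-reminders",
});
console.log(JSON.stringify(request, null, 2));
```

Most AI hosts build this envelope for you; the sketch is mainly useful for debugging or for scripting calls outside a host.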
Full type signatures + the other five tools — see the API reference.