crixin · voice mcp · capability module

Give your AI a phone line.

Voice MCP is one half of the Crixin toolkit. Your AI agent — Claude Code, Cursor, Codex CLI, Claude Desktop — gets six MCP tools to dial real numbers, send real SMS, and transcribe real calls through your Twilio account. The other half (Coder MCP) gives it a memory. MIT-licensed. Bring your own Twilio. Local SQLite.

npm i -g crixin && crixin install

Pick a voice. Per call.

Voice MCP exposes the Amazon Polly and Google neural voices available through Twilio as a tool parameter: pass voice: "Polly.Joanna-Neural" on a make_call invocation. There is no agent roster to configure and no personas to manage. The four examples below are illustrative; the full Polly catalog (30+ languages) is available on every call.

Polly.Joanna-Neural · en-US · warm
Default for English outbound.
Soft pacing, lots of acknowledgment. The voice your AI will reach for most often without thinking.
Polly.Matthew-Neural · en-US · professional
B2B / executive register.
Tight, deliberate. Pass this when calling someone whose calendar is the bottleneck.
Polly.Hala-Neural · ar-EG · natural
Egyptian Arabic.
The dialect, not MSA. Makes restaurant calls in Cairo feel local instead of imported.
Polly.Lucia-Neural · es-ES · clear
Spanish (Castilian).
Also: Polly.Mia (es-MX), Polly.Penelope (es-US). LATAM Spanish has its own pool.

Your AI agent picks the voice per call, as a single tool parameter. There are no personas and no roster to manage. Full voice catalog →
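A minimal sketch of what "voice as a tool parameter" can look like from the calling side. Only the argument names to, prompt, and voice come from this page; the MakeCallArgs type, the VOICE_BY_LOCALE table, and buildMakeCallArgs are hypothetical helpers for illustration, not part of Crixin's published API.

```typescript
// Hypothetical argument shape for a make_call invocation.
// Only `to`, `prompt`, and `voice` are named on this page.
interface MakeCallArgs {
  to: string;
  prompt: string;
  voice?: string;
}

// Illustrative locale-to-voice table built from the voices listed above.
const VOICE_BY_LOCALE: Record<string, string> = {
  "en-US": "Polly.Joanna-Neural",
  "es-ES": "Polly.Lucia-Neural",
  "es-MX": "Polly.Mia",
};

// Attach a voice when the locale is known; otherwise omit the field
// and let the server choose whatever it uses by default.
function buildMakeCallArgs(to: string, prompt: string, locale: string): MakeCallArgs {
  const voice = VOICE_BY_LOCALE[locale];
  return voice === undefined ? { to, prompt } : { to, prompt, voice };
}
```

Calling buildMakeCallArgs("+15555550100", "Confirm the booking.", "es-ES") yields an arguments object with voice set to "Polly.Lucia-Neural"; an unlisted locale simply leaves voice out.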

The whole sales loop. Open-source.

Voice MCP ships every primitive of a real sales loop — campaigns, leads, outcome classification, compliance — exposed as MCP tools, REST endpoints, or a TypeScript library. Pick your altitude. Local SQLite, your data.

Campaigns & cadences

Three presets baked in. Agents pace retries to match.

  • light_touch — 1–2 calls per lead
  • standard — 3–5 calls, spaced
  • full_court_press — 5–8 calls
  • Custom cadences via TypeScript
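As a rough illustration of how the presets above could be modeled, here is a sketch in TypeScript. The Cadence interface, the PRESETS table, and canRetry are assumptions made for this example; the real library's types and names may differ. Only the preset names and call-count ranges come from the list above.

```typescript
// Hypothetical shape for a cadence: a name plus a call-count range.
interface Cadence {
  name: string;
  minCalls: number;
  maxCalls: number;
}

type PresetName = "light_touch" | "standard" | "full_court_press";

// Call-count ranges taken from the preset list above.
const PRESETS: Record<PresetName, Cadence> = {
  light_touch: { name: "light_touch", minCalls: 1, maxCalls: 2 },
  standard: { name: "standard", minCalls: 3, maxCalls: 5 },
  full_court_press: { name: "full_court_press", minCalls: 5, maxCalls: 8 },
};

// A lead may be dialed again only while it is under the cadence's ceiling.
function canRetry(cadence: Cadence, attemptsSoFar: number): boolean {
  return attemptsSoFar < cadence.maxCalls;
}

// A custom cadence is just another value of the same shape (illustrative).
const weeklyCheckIn: Cadence = { name: "weekly_check_in", minCalls: 2, maxCalls: 4 };
```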

Leads & outcomes

Drop in a CSV or pull leads from Google Places. Every call lands a structured outcome.

  • interested · not_interested
  • no_answer · voicemail
  • callback with scheduled time
  • Qualification score & transcript per lead
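One way to picture a "structured outcome" is a discriminated union, where only the callback variant carries a scheduled time. The type names, the scheduledFor field, and the nextAction mapping below are hypothetical; only the five outcome labels and the per-lead score and transcript come from the list above.

```typescript
// Hypothetical model of the five outcomes listed above.
type CallOutcome =
  | { kind: "interested" }
  | { kind: "not_interested" }
  | { kind: "no_answer" }
  | { kind: "voicemail" }
  | { kind: "callback"; scheduledFor: string }; // ISO 8601 timestamp (assumed format)

// Hypothetical per-lead record: outcome plus score and transcript.
interface LeadResult {
  outcome: CallOutcome;
  qualificationScore: number;
  transcript: string;
}

type NextAction = "handoff" | "stop" | "retry" | "schedule";

// Illustrative policy: what an agent might do after each outcome.
function nextAction(outcome: CallOutcome): NextAction {
  switch (outcome.kind) {
    case "interested":
      return "handoff";
    case "not_interested":
      return "stop";
    case "no_answer":
    case "voicemail":
      return "retry";
    case "callback":
      return "schedule";
  }
}
```

Because the switch is exhaustive over kind, adding a sixth outcome later would surface as a compile error rather than a silent fall-through.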

Compliance, by default

Off-hour calls don't go out. DNC requests are honored mid-call. Answering machine detection (AMD) runs before any prompt is spoken.

  • TCPA windows — 8am–9pm local time
  • DNC list checked pre-dial
  • Answering machine detection (AMD)
  • Auto-DNC on opt-out keywords
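The checks above can be sketched as three small functions. This is an illustration of the idea under stated assumptions, not Crixin's implementation: the function names, the keyword list, and the array-based DNC lookup are all hypothetical. The 8am to 9pm window is evaluated in the callee's time zone.

```typescript
// Illustrative opt-out phrases; the real keyword list is not given on this page.
const OPT_OUT_KEYWORDS: ReadonlyArray<string> = ["stop", "unsubscribe", "do not call", "remove me"];

// Hour of day (0-23) at `at` in the given IANA time zone.
function localHour(at: Date, timeZone: string): number {
  const text = new Intl.DateTimeFormat("en-US", {
    hour: "numeric",
    hour12: false,
    timeZone: timeZone,
  }).format(at);
  // Some runtimes render midnight as "24"; normalise it to 0.
  return parseInt(text, 10) % 24;
}

// TCPA-style window: 8am inclusive to 9pm exclusive, callee-local time.
function isWithinCallingWindow(at: Date, timeZone: string): boolean {
  const hour = localHour(at, timeZone);
  return hour >= 8 && hour < 21;
}

// Pre-dial gate: the number must not be on the DNC list and must be in-window.
function mayDial(to: string, at: Date, timeZone: string, dnc: ReadonlyArray<string>): boolean {
  return dnc.indexOf(to) === -1 && isWithinCallingWindow(at, timeZone);
}

// Mid-call check: does the callee's utterance contain an opt-out phrase?
function isOptOut(utterance: string): boolean {
  const text = utterance.toLowerCase();
  return OPT_OUT_KEYWORDS.some(function (k) { return text.indexOf(k) !== -1; });
}
```

A plain substring match like this will over-trigger on ordinary speech, so treat the keyword check as a placeholder for whatever matching the real system uses.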

How compliance is enforced →

Six MCP tools. One stdio server.

Drop the engine straight into Claude Code, Cursor, or Codex CLI and your agent can place a call, drop an SMS, or read back a transcript — without you writing webhooks or babysitting Twilio.

make_call
Outbound call. to + prompt, or raw TwiML.
send_sms
SMS / MMS. Messaging Service support.
transcribe_call
Deepgram, 30+ languages incl. Arabic.
get_call
Status, duration, price, timestamps.
list_calls
Filter by to, from, status.
list_recordings
Recordings + auth-protected media URLs.
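For the curious, this is roughly what a host sends over stdio when the agent uses one of these tools. MCP tool invocations are JSON-RPC 2.0 requests with method tools/call, and the stdio transport delimits messages with newlines. The tool names come from the list above; the helper names and the status filter value are assumptions made for this sketch.

```typescript
// The six tool names listed above.
type ToolName =
  | "make_call"
  | "send_sms"
  | "transcribe_call"
  | "get_call"
  | "list_calls"
  | "list_recordings";

// JSON-RPC 2.0 request shape for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: ToolName; arguments: Record<string, unknown> };
}

// Build a tools/call request for one of the six tools.
function toolCall(id: number, name: ToolName, args: Record<string, unknown>): ToolCallRequest {
  return { jsonrpc: "2.0", id: id, method: "tools/call", params: { name: name, arguments: args } };
}

// stdio transport: one JSON message per line.
function frame(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

// Example: list calls filtered by status (the filter value is illustrative).
const wire = frame(toolCall(1, "list_calls", { status: "completed" }));
```

In practice your MCP host builds and sends these messages for you; the sketch is only meant to show that nothing more exotic than line-delimited JSON is involved.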

And the receipts. Yours.

Every call is mirrored to local SQLite, along with Deepgram transcripts if you enable them. Running crixin voice wrapped condenses it all into a single HTML file you can open, screenshot, or share. Below: an actual sample after 30 days of mixed inbound and outbound calls.

Voice Wrapped · last 30 days

sample · mixed inbound + outbound
  • Total calls: 1,247 (across 4 use cases)
  • Minutes: 8,432 (avg 6m 45s)
  • Twilio spend: $174.32 (~$0.14 / call)
  • Top country: Egypt (624 calls · ar-EG)
  • Caller archetype: Patient Listener (avg call > 3 min)
  • Duck rate: 14.2% ("let me get back to you")
  • Hour of day: peak around 3pm local · TCPA 8am–9pm window respected

PII auto-stripped · generated locally · shareable
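The derived figures in the sample follow directly from the totals. A small sketch of the arithmetic, with hypothetical function names and seconds truncated rather than rounded:

```typescript
// Average call length as "Xm Ys", truncating to whole seconds.
function averageDuration(totalMinutes: number, calls: number): string {
  const seconds = Math.floor((totalMinutes * 60) / calls);
  return Math.floor(seconds / 60) + "m " + (seconds % 60) + "s";
}

// Spend per call in dollars, rounded to cents.
function costPerCall(spendUsd: number, calls: number): string {
  return (spendUsd / calls).toFixed(2);
}

// Totals from the sample: 1,247 calls, 8,432 minutes, $174.32.
const avg = averageDuration(8432, 1247); // "6m 45s"
const perCall = costPerCall(174.32, 1247); // "0.14"
```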

Three hosts. Same one-liner.

No host-specific glue. Same npm package, same MCP transport, same env vars.

Claude Code

claude mcp add crixin-voice -- \
  npx -y crixin voice mcp

Cursor

// ~/.cursor/mcp.json
{
  "mcpServers": {
    "crixin-voice": {
      "command": "npx",
      "args": ["-y", "crixin", "voice", "mcp"]
    }
  }
}

Codex CLI

# ~/.codex/config.toml
[mcp_servers.crixin-voice]
command = "npx"
args = ["-y", "crixin", "voice", "mcp"]

Configure once. Forget it exists.

Set your Twilio creds in the environment your AI host launches from. Everything else is the agent's problem.

export TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export TWILIO_AUTH_TOKEN=your_auth_token
export TWILIO_PHONE_NUMBER=+15555550100

# Optional — only needed if you ask transcribe_call to use Deepgram
export DEEPGRAM_API_KEY=...

crixin voice doctor       # verify config
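If you want a quick pre-flight check of your own, the sketch below validates the three required variables by shape. It is not what crixin voice doctor does internally (that is not documented here); checkVoiceEnv is a hypothetical helper. It assumes Twilio Account SIDs are "AC" followed by 32 hex characters and that the phone number is in E.164 format.

```typescript
type Env = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means the shape looks right.
function checkVoiceEnv(env: Env): string[] {
  const problems: string[] = [];

  // Account SID: "AC" followed by 32 hexadecimal characters.
  const sid = env["TWILIO_ACCOUNT_SID"] || "";
  if (!/^AC[0-9a-fA-F]{32}$/.test(sid)) {
    problems.push("TWILIO_ACCOUNT_SID should be AC followed by 32 hex characters");
  }

  // Auth token: only checked for presence here.
  const token = env["TWILIO_AUTH_TOKEN"] || "";
  if (token.length === 0) {
    problems.push("TWILIO_AUTH_TOKEN is not set");
  }

  // Phone number: E.164, a plus sign and up to 15 digits.
  const phone = env["TWILIO_PHONE_NUMBER"] || "";
  if (!/^\+[1-9]\d{1,14}$/.test(phone)) {
    problems.push("TWILIO_PHONE_NUMBER should be in E.164 format, e.g. +15555550100");
  }

  return problems;
}
```

In Node you would pass process.env as the argument. A shape check like this cannot tell you whether the credentials are actually valid; only a real request to Twilio can.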

No SaaS in the middle. Calls go straight to Twilio's REST API, and recordings live in your Twilio account. We never see your audio, your transcripts, or your phone numbers.

Why this is open.

Crixin used to run as a hosted SaaS — multi-tenant, Stripe-billed, 4 voices × 4 personas, Google-Places lead sourcing, Firestore campaign management, GPT-4o-mini outcome classification. It worked. But the hosted-platform shape pulled us into a category — telecom SaaS — where we couldn't beat Vapi / Bland / Telnyx on infrastructure, and where the moat was somebody else's.

So we cut it. Voice MCP is now the open-source half of a two-MCP toolkit (the other half: Coder MCP for memory). The structural moat — MCP-first, BYO carrier, BYO LLM, local SQLite — is the thing the hosted-platform players can't easily copy. See pricing →