Audio out.
One endpoint.
Maia is an HTTP API that returns narrated audio. Send text (basic) or context + goal (pro), get WAV bytes back. No SDK. No surprises. One call from your code to a voice your user hears.
Create an API key from the dashboard (keys look like mvk_…), then call POST /v1/generate:
curl -X POST https://api.maia.example/v1/generate \
-H "Authorization: Bearer mvk_live_xxx" \
-H "Content-Type: application/json" \
--output out.wav \
-d '{
"mode": "basic",
"text": "Welcome to Maia. Let'"'"'s get you set up.",
"voice": "ember",
"tone": "warm"
}'
New accounts get a 27¢ trial credit on first sign-in, enough for about three minutes of basic narration.
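For callers who prefer Python to curl, the same request can be sketched with the standard library alone. The function names `build_generate_request` and `generate_to_file` are illustrative and not part of Maia; the URL, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.maia.example/v1/generate"  # URL from the curl example


def build_generate_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/generate request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def generate_to_file(api_key: str, payload: dict, path: str = "out.wav") -> None:
    """Send the request and write the returned WAV bytes to `path`."""
    request = build_generate_request(api_key, payload)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
```

Calling `generate_to_file("mvk_live_xxx", {"mode": "basic", "text": "Welcome to Maia.", "voice": "ember", "tone": "warm"})` should mirror the curl call. Keeping request construction separate from sending makes the headers easy to inspect in tests without touching the network.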
All /v1/* endpoints require a bearer token:
Authorization: Bearer <token>
Two token types are accepted:
- API keys (mvk_…): server-to-server. Create and revoke them on the keys page. The secret is shown once, at creation.
- Firebase ID tokens: the dashboard uses these. API keys are rejected on /v1/keys and /v1/billing/*.
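The split between the two token types can be mirrored in a small client-side guard, so that a server-to-server job fails fast instead of waiting for a rejection. This is only a sketch: `token_allowed` is a hypothetical helper, and it assumes that any token starting with `mvk_` is an API key and that the path rules are exactly the two listed above.

```python
def looks_like_api_key(token: str) -> bool:
    """Assumption: API keys always start with the mvk_ prefix."""
    return token.startswith("mvk_")


def is_user_only_path(path: str) -> bool:
    """True for the endpoints the docs reserve for dashboard (Firebase) tokens."""
    return (
        path == "/v1/keys"
        or path.startswith("/v1/keys/")
        or path.startswith("/v1/billing/")
    )


def token_allowed(token: str, path: str) -> bool:
    """Client-side guess at whether `token` may be used on `path`."""
    if looks_like_api_key(token) and is_user_only_path(path):
        return False  # the server would reject this combination
    return True
```

The server remains the source of truth; the guard only saves a round trip.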
Pricing
- Basic: you supply the script. Billed per second.
- Pro: agents plan, write, and narrate in one call.
The server rejects a request with 402 if your balance is below the mode's floor. Add credits on the billing page.
POST /v1/generate synthesizes audio and returns an audio/wav response.
Basic: bring your own script
{
"mode": "basic",
"text": "Your script here.", // required, ≤5000 chars
"voice": "ember", // optional, see Voices
"tone": "warm", // optional, see Tones
"expressiveness": 0.6, // optional, 0..1
"languageCode": "en-US" // optional, BCP-47
}
Pro: agents write the script
{
"mode": "pro",
"context": "What this clip is for, audience, goal.", // required, ≤4000 chars
"target_seconds": 30, // optional, 15..120, default 30
"voice": "atlas",
"tone": "authoritative"
}
Pro returns the generated script in the x-maia-script header (URL-encoded).
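The field limits annotated in the two request bodies can be checked before a request is sent, which avoids a round trip that would end in a 400. The sketch below is a hypothetical `validate_payload` helper; it assumes the character limits count Python string length and that `expressiveness` applies to basic mode only, neither of which the reference confirms.

```python
def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the documented limits are met."""
    problems: list[str] = []
    mode = payload.get("mode")
    if mode == "basic":
        text = payload.get("text")
        if not isinstance(text, str) or not text:
            problems.append("text is required in basic mode")
        elif len(text) > 5000:
            problems.append("text exceeds 5000 characters")
        expressiveness = payload.get("expressiveness")
        if expressiveness is not None and not 0 <= expressiveness <= 1:
            problems.append("expressiveness must be between 0 and 1")
    elif mode == "pro":
        context = payload.get("context")
        if not isinstance(context, str) or not context:
            problems.append("context is required in pro mode")
        elif len(context) > 4000:
            problems.append("context exceeds 4000 characters")
        target = payload.get("target_seconds", 30)  # documented default
        if not 15 <= target <= 120:
            problems.append("target_seconds must be between 15 and 120")
    else:
        problems.append('mode must be "basic" or "pro"')
    return problems
```

Voice and tone IDs are not checked here, since the server already answers invalid values with a 400.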
Response headers
| Header | Meaning |
|---|---|
| x-maia-call-id | Stable ID for this generation. Use it when reporting issues. |
| x-maia-seconds | Duration of the returned audio, in seconds. |
| x-maia-charge-cents | Amount debited from your balance. |
| x-maia-script | Pro only. URL-encoded script the agent wrote. |
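A response's headers can be collected into one record for logging and billing reconciliation. The sketch below is illustrative: `GenerationInfo` and `parse_generation_headers` are made-up names, it assumes the numeric headers parse as decimal numbers, and it assumes the script is percent-encoded (if the server uses `+` for spaces, `urllib.parse.unquote_plus` would be the better fit).

```python
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import unquote


@dataclass
class GenerationInfo:
    call_id: str
    seconds: float
    charge_cents: float
    script: Optional[str]  # only present for pro generations


def parse_generation_headers(headers: Mapping[str, str]) -> GenerationInfo:
    """Extract the x-maia-* headers; header names are matched case-insensitively."""
    lowered = {name.lower(): value for name, value in headers.items()}
    raw_script = lowered.get("x-maia-script")
    return GenerationInfo(
        call_id=lowered["x-maia-call-id"],
        seconds=float(lowered["x-maia-seconds"]),
        charge_cents=float(lowered["x-maia-charge-cents"]),
        script=unquote(raw_script) if raw_script is not None else None,
    )
```

With `urllib.request`, `dict(response.headers.items())` yields a mapping suitable for this function.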
Idempotency
Pass an Idempotency-Key header to prevent duplicate charges on retried requests. A second request with the same key on an already-settled call returns 409.
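One way to use this is to generate a key once per logical generation and reuse it on every retry. The sketch below assumes a UUID is an acceptable key format (the reference does not specify one) and that only 502 is worth retrying; `send_with_idempotency` and its `opener` parameter are hypothetical and exist so the function can be exercised without network access.

```python
import urllib.error
import urllib.request
import uuid


def new_idempotency_key() -> str:
    """One key per logical generation; reuse it across retries of that generation."""
    return str(uuid.uuid4())


def send_with_idempotency(request, key, max_attempts=3, opener=urllib.request.urlopen):
    """Send `request` with an Idempotency-Key, retrying only on 502.

    Returns the response bytes, or None if the server answers 409,
    meaning a call with this key has already settled.
    """
    request.add_header("Idempotency-Key", key)
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code == 409:
                return None  # already settled: do not retry, do not expect audio
            if err.code != 502 or attempt == max_attempts:
                raise
```

A production client would likely add a delay between attempts; it is left out here to keep the sketch short.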
Returns the caller's balance and 30-day usage.
{
"account": { "id": "acc_…", "email": "you@example.com" },
"balance_cents": 1873,
"usage_30d": [
{ "mode": "basic", "total_cents": 412, "count": 87 },
{ "mode": "pro", "total_cents": 1060, "count": 12 }
]
}
Voices
Set voice to one of the IDs below. Default is ember.
| ID | Gender | Best for |
|---|---|---|
| ember | female | Warm, grounded — coaching, onboarding. |
| nova | female | Bright, confident — announcements, launches. |
| wren | female | Crisp, precise — tutorials, product walkthroughs. |
| sage | female | Calm, measured — meditation, wellness. |
| marlow | female | Smoky, late-night — lifestyle, storytelling. |
| iris | female | Clear, professional — IVR, corporate narration. |
| juno | female | Authoritative — news, briefings. |
| lark | female | Upbeat, morning-show — ads, social spots. |
| dahlia | female | Rich, dramatic — audiobooks, fiction. |
| piper | female | Playful, sharp — UX prompts, quick reads. |
| atlas | male | Deep, steady — narration, documentaries. |
| archer | male | Energetic, confident — sales, pitches. |
| reese | male | Polished announcer — promos, trailers. |
| hugo | male | Avuncular, warm — explainers, coaching. |
| onyx | male | Low, serious — cinematic, high-stakes. |
| cyrus | male | Charismatic, expressive — hosts, interviews. |
| bram | male | Grounded, deliberate — meditation, guidance. |
| kai | male | Bright, modern — tech explainers, podcasts. |
| dash | male | Quick, punchy — ads, shorts. |
| orin | male | Wise storyteller — long-form, audiobooks. |
Tones
tone sets delivery register. Default is neutral.
Sprinkle {{tag}} markers in your script to shape local delivery. Unknown tags are left as-is.
"Hey — {{pause}} I have good news. {{happy}} We shipped it."
Errors
Errors are JSON with a message field.
| Status | When |
|---|---|
| 400 | Bad body shape: missing text/context, invalid voice/tone, etc. |
| 401 | Missing, invalid, revoked, or expired token. |
| 402 | Balance below the mode's floor, or generated audio would exceed balance. |
| 403 | API key used on a user-only endpoint (keys, billing). |
| 409 | Idempotency key matches an already-settled call. |
| 502 | Upstream pipeline or TTS failure. Safe to retry. |
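A client can turn these responses into a single exception type that carries the status, the `message` field, and a retry hint. The sketch below is an assumption-laden illustration: `MaiaError` and `error_from_response` are invented names, and only 502 is marked retryable because it is the only status the table calls safe to retry.

```python
import json

RETRYABLE_STATUSES = {502}  # the only status documented as safe to retry


class MaiaError(Exception):
    """Error built from a non-2xx response."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message
        self.retryable = status in RETRYABLE_STATUSES


def error_from_response(status: int, body: bytes) -> MaiaError:
    """Read the JSON `message` field, falling back to the raw body text."""
    try:
        message = json.loads(body)["message"]
    except (ValueError, KeyError, TypeError):
        message = body.decode("utf-8", errors="replace")
    return MaiaError(status, message)
```

With `urllib.request`, this pairs with `except urllib.error.HTTPError as err: raise error_from_response(err.code, err.read())`.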