Same voice, every scene.
A character whispering at 2am sounds like the one shouting at noon. Timbre is not a performance.
One file defines who they are. One API call makes them talk.
For the people shipping companions, NPCs, and chat characters.
Three .persona.yaml files. Same input. Three completely different replies.
“I'm looking for a room. Just the night.”
scene · comfort · reassure
Same API call. Only the persona slug changes.
The proprietor. Polite to the point of weaponization.
The chancer at the bar. Sees angles other people miss.
The narrator. Fourteen character deaths and counting.
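A minimal sketch of what "only the persona slug changes" could look like in practice. The three slugs below are hypothetical stand-ins for the demo personas (the page does not name them), and the `mode` value is borrowed from the request example further down; treat both as assumptions.

```python
import json

def payloads_for(personas, scene, text, mode="voiced"):
    """Build one request body per persona slug; every other field is shared."""
    return [
        {"mode": mode, "persona": slug, "scene": scene, "input": text}
        for slug in personas
    ]

# Hypothetical slugs standing in for the proprietor, the chancer, and the narrator.
SLUGS = ["demo/proprietor", "demo/chancer", "demo/narrator"]

bodies = payloads_for(SLUGS, "comfort", "I'm looking for a room. Just the night.")
for body in bodies:
    print(json.dumps(body))
```

Each printed body differs from the others in the `persona` field only.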
The persona slips.
Around turn fifteen the character forgets who they are. A nice voice doesn't fix this.
You glue the pipeline.
LLM, prompt, emotion tag, TTS, streaming. Every project, from scratch, badly.
Same identity, different room — comfort, banter, challenge — without losing who they are.
A .persona.yaml you commit to git. A POST /v1/generate that returns audio.
voice.persona is Apache-2.0. Pin it like a dependency. Fork it like one.
A character who stays themselves.
Persona slug. Scene. Input.
That's the whole API surface for a character that stays in character.
POST /v1/generate
Authorization: Bearer himaia_live_***
Content-Type: application/json
{
  "mode": "voiced",
  "persona": "himaia/warm_confidant",
  "scene": "comfort",
  "input": "You don't have to fix it tonight."
}
→ 200 OK audio/wav
x-himaia-seconds: 3.8
x-himaia-charge-cents: 1
Thirty voices. Named, pre-tuned.
Same persona, any voice. Same voice, any persona.
Make your character cards talk without sounding like every other TTS demo. Drop in the SillyTavern extension; pick a persona; press play.
Ship voice without standing up an LLM-prompt-emotion-tag-TTS pipeline that breaks every release. Persona slug, scene, input — that's the whole call.
Same NPC across every scene. One persona file, one plugin per surface. Coming next: a Foundry VTT extension and a Unity package.
Enough to ship a demo and tell a friend.
Indie builders and one-person shops.
Teams shipping character apps.
Retail rates for high-volume consumers.
BYO-backend, hand-tuned, SOC 2.
1 credit = 1 Cinematic min = 6 Voiced min = 15 Basic min. Overage auto-bills at retail with an 80% threshold warning. Pricing pending first-usage calibration.
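The conversion above is simple arithmetic, sketched here for anyone budgeting usage. The function names and the plan size used in the example are hypothetical; only the 1 / 6 / 15 ratios and the 80% warning mark come from the line above.

```python
from fractions import Fraction

# Minutes of each tier that one credit buys, per the conversion line above.
MINUTES_PER_CREDIT = {"cinematic": 1, "voiced": 6, "basic": 15}

def credits_used(minutes_by_tier):
    """Return total credits consumed for a mapping of tier -> minutes."""
    total = Fraction(0)
    for tier, minutes in minutes_by_tier.items():
        total += Fraction(minutes) / MINUTES_PER_CREDIT[tier]
    return total

def past_warning_threshold(used, plan_credits, threshold=Fraction(4, 5)):
    """True once usage reaches the 80% warning mark of the plan's credits."""
    return used >= plan_credits * threshold

# 12 Voiced minutes (2 credits) plus 15 Basic minutes (1 credit).
used = credits_used({"voiced": 12, "basic": 15})
print(used)  # 3
# Against a hypothetical 10-credit plan, 3 credits is below the 80% mark.
print(past_warning_threshold(used, plan_credits=10))  # False
```

By the same ratios, 20 Voiced minutes works out to 20 / 6, or about 3.3 credits.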
No. Persona is the unit, voice is the timbre. The runtime composes identity, scene, and idiolect at call time, then hands the line to TTS as the last hop. Most voice APIs stop at the timbre step.
voice.persona is Apache-2.0 on GitHub (fuselinkapp/himaia-voice-persona). You commit personas to git, fork the spec freely, and migrate runtimes when you want. The runtime is closed; the abstraction is open.
We don't pin a vendor on the wire. The spec is backend-opaque so we can swap or add providers without breaking your personas, and any upstream failure surfaces on the response — your retry logic stays the same.
No. himaia-sdk is just two convenience methods. curl works. Anything that can POST JSON and read a WAV stream works.
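As an illustration of "anything that can POST JSON", here is a sketch using only the Python standard library. The base URL is a placeholder (the page does not publish a host), and the function names are hypothetical; the path, body fields, bearer header, and response headers mirror the request example on this page.

```python
import json
import urllib.request

def build_generate_request(base_url, api_key, persona, scene, text, mode="voiced"):
    """Assemble the POST /v1/generate request without sending it."""
    body = {"mode": mode, "persona": persona, "scene": scene, "input": text}
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def generate(base_url, api_key, persona, scene, text, mode="voiced", timeout=30):
    """Send the request; return (wav_bytes, seconds, charge_cents) as received."""
    req = build_generate_request(base_url, api_key, persona, scene, text, mode)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        audio = resp.read()  # WAV bytes from the response body
        seconds = resp.headers.get("x-himaia-seconds")
        cents = resp.headers.get("x-himaia-charge-cents")
    return audio, seconds, cents

# Placeholder host and key: substitute real values before calling generate().
req = build_generate_request(
    "https://api.example.invalid",
    "himaia_live_PLACEHOLDER",
    "himaia/warm_confidant",
    "comfort",
    "You don't have to fix it tonight.",
)
print(req.get_method(), req.full_url)
```

Calling `generate(...)` with a real host and key would return the audio bytes, which can be written straight to a `.wav` file. Error handling is left out on purpose, since the page does not document failure responses.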
Not on the public roster. Thirty pre-tuned named voices ship today; custom voices land later on the higher tiers alongside SLA and a dedicated deploy.
20 Voiced minutes a month, no card. New accounts get the credits on signup; the same allotment refreshes every month if you stay on the free tier.
20 Voiced minutes a month.
Free. No card. No setup.