Hosted voice setup is now part of the product workflow.

NPCFoundry remains the system of record for world data, dialogue, and per-player memories. Speech synthesis and playback run on the customer's own server in one of three modes: text-only, ElevenLabs, or Cartesia.

Included now

Downloadable voice setup package for each world
Provider-specific env template and voice slot scaffold
World snapshot JSON for runtime use
Live dialogue API with per-player memory persistence
Structured actions, interrupt flags, and conversation close signals
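
As an illustration of how a runtime might consume the structured actions, interrupt flags, and conversation close signals listed above, here is a minimal sketch. The response shape is an assumption: the field names `text`, `actions`, `interrupted`, and `conversation_closed` are hypothetical and are not taken from the NPCFoundry API, so check the real payload before relying on them.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DialogueTurn:
    """One NPC reply. All field names here are hypothetical placeholders."""
    text: str
    actions: List[Dict[str, Any]] = field(default_factory=list)
    interrupted: bool = False
    conversation_closed: bool = False


def parse_turn(raw: str) -> DialogueTurn:
    """Parse a JSON reply, defaulting any missing signal to 'off'."""
    data = json.loads(raw)
    return DialogueTurn(
        text=str(data.get("text", "")),
        actions=list(data.get("actions") or []),
        interrupted=bool(data.get("interrupted", False)),
        conversation_closed=bool(data.get("conversation_closed", False)),
    )


def next_step(turn: DialogueTurn) -> str:
    """Decide what the runtime does after a reply (assumed priority order)."""
    if turn.interrupted:
        return "stop_playback"
    if turn.conversation_closed:
        return "end_conversation"
    return "continue"
```

A runtime could call `parse_turn` on each reply body and branch on `next_step` before deciding whether to synthesize speech for the line.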

Recommended runtime loop

1. Generate or open a world in NPCFoundry.
2. Choose text-only, ElevenLabs, or Cartesia on the world page.
3. Download the provider-specific voice setup package or export package.
4. Send dialogue turns through NPCFoundry with live spatial context from the runtime.
5. Call the chosen provider from your own server and play the returned audio.
6. If the player interrupts, stop playback and send interruption context immediately.
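
The dialogue, playback, and interruption steps of the loop above can be sketched as two small functions. This is a sketch under stated assumptions, not NPCFoundry's client: `send_turn`, `synthesize`, `play`, and `stop_playback` are hypothetical callables that you would supply, and the payload keys (`player_text`, `spatial_context`, `interrupted`, `spoken_so_far`) are placeholders for whatever the live dialogue API actually expects.

```python
from typing import Any, Callable, Dict

Payload = Dict[str, Any]


def run_turn(
    player_text: str,
    spatial_context: Payload,
    provider: str,
    send_turn: Callable[[Payload], Payload],
    synthesize: Callable[[str, str], bytes],
    play: Callable[[bytes], None],
) -> Payload:
    """Send one dialogue turn, then voice the reply on your own server."""
    # Dialogue and memory stay with NPCFoundry (hypothetical payload keys).
    reply = send_turn({"player_text": player_text,
                       "spatial_context": spatial_context})
    # Text-only mode skips synthesis and playback entirely.
    if provider != "text-only":
        audio = synthesize(provider, reply["text"])
        play(audio)
    return reply


def handle_interrupt(
    spoken_so_far: str,
    stop_playback: Callable[[], None],
    send_turn: Callable[[Payload], Payload],
) -> Payload:
    """Stop audio first, then report the interruption right away."""
    stop_playback()
    return send_turn({"interrupted": True, "spoken_so_far": spoken_so_far})
```

Passing the network and audio functions in as arguments keeps the provider call on the customer's side and makes the loop easy to exercise with stand-in functions before wiring in a real provider.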

Intentional boundary

NPCFoundry does not act as a TTS relay and does not manage local model installs. NPCFoundry owns dialogue logic and memory updates; the customer's own server owns provider keys, speech synthesis, and playback.
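
One way to keep provider keys on the customer's side of this boundary is to read them from the server's environment at startup. The sketch below assumes variable names `VOICE_PROVIDER`, `ELEVENLABS_API_KEY`, and `CARTESIA_API_KEY`; these are illustrative guesses, and the names in the downloaded env template may differ.

```python
from typing import Mapping, Optional, Tuple

# Hypothetical variable names; match them to the downloaded env template.
KEY_VARS = {
    "elevenlabs": "ELEVENLABS_API_KEY",
    "cartesia": "CARTESIA_API_KEY",
}


def load_voice_config(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (provider, api_key); text-only mode needs no key."""
    provider = env.get("VOICE_PROVIDER", "text-only").strip().lower()
    if provider == "text-only":
        return provider, None
    if provider not in KEY_VARS:
        raise ValueError(f"unknown voice provider: {provider!r}")
    key = env.get(KEY_VARS[provider], "")
    if not key:
        # Fail at startup rather than on the first spoken line.
        raise RuntimeError(f"{KEY_VARS[provider]} is not set on this server")
    return provider, key
```

On a real server this would typically be called once as `load_voice_config(os.environ)`, so the key never leaves the customer's process.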