Voice over LiveKit (in-process orchestrator)
What shipped
A new LiveKit-based voice transport for VoiceX. Voice calls, both phone and web widget, now run in-process against the same agent orchestrator that powers chat, with Deepgram handling speech-to-text and text-to-speech. Because voice reuses the chat pipeline rather than a separate voice-only code path, a voice turn produces the same rich responses as chat (text, cards, quick replies) on the channels that can render them.
The voice path also now acts on call transfer and disconnect, and adds server-side call recording (LiveKit Egress), per-bot STT/TTS selection, and incremental streaming of widget display chunks during a turn.
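As a rough illustration of how one orchestrator turn can serve both phone and widget, the sketch below filters a turn's response parts by what each channel can render. The channel names, part kinds, and the `parts_for_channel` helper are all hypothetical; the actual VoiceX types are not described in this note.

```python
from dataclasses import dataclass

# Hypothetical capability table: which response kinds each channel can render.
# The real channel names and flags are assumptions, not taken from VoiceX.
CHANNEL_CAPABILITIES = {
    "phone": {"text"},                            # audio only: text is spoken via TTS
    "widget": {"text", "card", "quick_replies"},  # can display rich parts
}


@dataclass(frozen=True)
class ResponsePart:
    kind: str      # "text", "card", or "quick_replies"
    payload: str   # opaque content for this illustration


def parts_for_channel(parts, channel):
    """Keep only the parts the channel can render, preserving order.

    Unknown channels fall back to text only.
    """
    supported = CHANNEL_CAPABILITIES.get(channel, {"text"})
    return [p for p in parts if p.kind in supported]


# One turn from the shared orchestrator, delivered to two channels.
turn = [
    ResponsePart("text", "Here are your options."),
    ResponsePart("card", "plan-summary"),
    ResponsePart("quick_replies", "Yes|No"),
]
phone_parts = parts_for_channel(turn, "phone")
widget_parts = parts_for_channel(turn, "widget")
```

The point of the sketch is that the turn is produced once and only the delivery step differs per channel, which is what removes the need for a voice-only code path.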
Who it's for
Primarily platform and voice developers building and testing VoiceX agents. The new transport is gated to development builds, and deployed environments continue to use the existing voice bridge and UI unchanged, so this is not yet a user-facing change in production.
How to use
This is currently a developer/testing capability:
- Run a development build of the widget (the LiveKit voice path is enabled only when the build is in dev mode).
- Open a bot that has voice enabled (in dev, the voice icon can be forced on for any bot for testing).
- Start a voice call from the widget, or use the Web Call (LiveKit) tab in Studio → AI Agent → Voice to talk to the agent directly from the browser.
- Calls are dispatched to the LiveKit voice worker (voice-agent-prod), which runs the orchestrator and streams audio + rich responses back.
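The dev-mode gating described in the steps above can be sketched as follows. This is a minimal illustration, not the widget's actual code: the function names, the `"livekit"` and `"legacy_bridge"` labels, and the `force_in_dev` flag are hypothetical stand-ins for whatever the build really uses.

```python
def select_voice_transport(is_dev_build: bool) -> str:
    """Pick the voice transport for a call.

    Assumption: only development builds use the LiveKit path; deployed
    builds keep the existing voice bridge.
    """
    return "livekit" if is_dev_build else "legacy_bridge"


def voice_icon_visible(bot_voice_enabled: bool,
                       is_dev_build: bool,
                       force_in_dev: bool = False) -> bool:
    """Decide whether the widget shows the voice icon.

    The icon follows the bot's voice setting, except that a dev build
    may force it on for any bot for testing.
    """
    return bot_voice_enabled or (is_dev_build and force_in_dev)


# A dev build testing a bot that does not have voice enabled.
transport = select_voice_transport(is_dev_build=True)
show_icon = voice_icon_visible(False, is_dev_build=True, force_in_dev=True)
```

Keeping the force flag inert outside dev builds means a deployed environment cannot expose the voice icon on a bot that has voice disabled.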
Production rollout (a public token endpoint and a per-bot transport switch) is tracked separately and gated behind the production-readiness plan.
Related
- PR: #2616