Voice over LiveKit (in-process orchestrator)

What shipped

A new LiveKit-based voice transport for VoiceX. Voice calls, from both the phone and the web widget, now run in-process against the same agent orchestrator that powers chat, with Deepgram for speech-to-text and text-to-speech. Because it reuses the chat pipeline rather than a separate voice-only code path, a voice turn produces the same rich responses as chat (text, cards, quick replies) on the channels that can render them.
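
As a rough illustration of "the same rich responses on the channels that can render them", the sketch below filters one orchestrator turn by channel capability. The capability table, the `ResponsePart` type, and `parts_for_channel` are hypothetical names invented for this example. They are not taken from the VoiceX codebase, and the real channel model may differ.

```python
from dataclasses import dataclass

# Hypothetical capability table (an assumption for illustration only).
CHANNEL_CAPABILITIES = {
    "phone": {"text"},                             # audio only: text is spoken via TTS
    "widget": {"text", "card", "quick_replies"},  # can render rich UI next to audio
}


@dataclass
class ResponsePart:
    kind: str        # "text", "card", or "quick_replies"
    payload: object  # content of the part; shape is assumed, not specified here


def parts_for_channel(parts, channel):
    """Keep only the parts of one orchestrator turn that the channel can render."""
    # Unknown channels fall back to plain text in this sketch.
    supported = CHANNEL_CAPABILITIES.get(channel, {"text"})
    return [p for p in parts if p.kind in supported]


# One turn as the shared orchestrator might produce it for chat or voice.
turn = [
    ResponsePart("text", "Here are your options."),
    ResponsePart("card", {"title": "Plan A"}),
    ResponsePart("quick_replies", ["Yes", "No"]),
]

phone_parts = parts_for_channel(turn, "phone")
widget_parts = parts_for_channel(turn, "widget")
```

In this sketch the orchestrator output is identical for both channels; only the final filtering step differs, which is the point of sharing the pipeline.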

The voice path also gains the ability to act on call transfer and disconnect, server-side call recording (LiveKit Egress), per-bot STT/TTS selection, and incremental streaming of widget display chunks during a turn.
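
Per-bot STT/TTS selection can be pictured as bot-level overrides layered on platform defaults. The sketch below is a minimal illustration under assumptions: the `speech` settings key, the `resolve_speech_config` helper, and the `example-tts` provider name are all hypothetical, and only the Deepgram default comes from this note.

```python
# Defaults follow this note (Deepgram for both directions).
DEFAULT_SPEECH = {"stt": "deepgram", "tts": "deepgram"}


def resolve_speech_config(bot_settings):
    """Merge a bot's speech overrides over the defaults.

    `bot_settings` is a hypothetical dict such as {"speech": {"tts": "example-tts"}}.
    Missing or empty values fall back to the default provider.
    """
    overrides = (bot_settings or {}).get("speech") or {}
    return {
        key: overrides.get(key) or default
        for key, default in DEFAULT_SPEECH.items()
    }


# A bot that overrides only TTS keeps the default STT provider.
custom = resolve_speech_config({"speech": {"tts": "example-tts"}})
plain = resolve_speech_config({})
```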

Who it's for

Primarily platform and voice developers building and testing VoiceX agents. The new transport is gated to development builds, and deployed environments continue to use the existing voice bridge and UI unchanged, so it is not yet a user-facing change in production.

How to use

This is currently a developer/testing capability:

  1. Run a development build of the widget (the LiveKit voice path is enabled only when the build is in dev mode).
  2. Open a bot that has voice enabled (in dev, the voice icon can be forced on for any bot for testing).
  3. Start a voice call from the widget, or use the Web Call (LiveKit) tab in Studio → AI Agent → Voice to talk to the agent directly from the browser.
  4. Calls are dispatched to the LiveKit voice worker (voice-agent-prod), which runs the orchestrator and streams audio and rich responses back.
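
The gating described in steps 1 and 2 can be sketched as two small checks. This is an illustration under assumptions, not the widget's actual code: the function names, the `force_in_dev` flag, and the transport labels `"livekit"` and `"legacy_bridge"` are hypothetical.

```python
def select_voice_transport(is_dev_build: bool) -> str:
    """Pick the voice transport for a build.

    Only development builds take the LiveKit path; deployed builds keep
    the existing voice bridge. The labels are placeholders.
    """
    return "livekit" if is_dev_build else "legacy_bridge"


def voice_icon_visible(bot_voice_enabled: bool,
                       is_dev_build: bool,
                       force_in_dev: bool = False) -> bool:
    """Decide whether the widget shows the voice icon for a bot.

    A bot with voice enabled always shows it. In a dev build the icon
    can additionally be forced on for testing; the force flag is
    ignored outside dev builds.
    """
    if bot_voice_enabled:
        return True
    return is_dev_build and force_in_dev
```

Keeping the force flag conditional on the dev build means a stray test setting cannot expose the voice icon in a deployed environment.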

Production rollout (a public token endpoint and a per-bot transport switch) is tracked separately and gated behind the production-readiness plan.

  • PR: #2616