Prepare a real-time voice conversation with an AI Agent.
Returns a WebSocket URL and a ready-made start message. Open a WebSocket
connection to the returned url, send start_message as the first frame,
then stream audio back and forth.
You can also skip this endpoint and connect directly:
wss://api.datagrid.com/ws/voice?token=YOUR_API_KEY
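As a minimal sketch of the direct-connect option, the URL can be assembled with the standard library so that the key is percent-encoded. The helper name direct_voice_url is illustrative and not part of the API.

```python
from urllib.parse import urlencode

# Base URL as given above; the API key travels in the "token" query parameter.
VOICE_WS_BASE = "wss://api.datagrid.com/ws/voice"


def direct_voice_url(api_key: str) -> str:
    """Return the direct WebSocket URL with the key as the token parameter."""
    return f"{VOICE_WS_BASE}?{urlencode({'token': api_key})}"


print(direct_voice_url("YOUR_API_KEY"))
```

Because the key ends up in the URL, it is worth keeping such URLs out of logs and error reports.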
WebSocket Protocol:
Once connected, send a JSON message with type: "start" and the session parameters as the payload.
The server responds with type: "started" containing the session and conversation IDs,
followed by type: "ready" when the agent is ready to receive audio.
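The handshake above can be sketched as a small state tracker that is independent of any WebSocket library. This is an illustration under assumptions: the "payload" key name is inferred from the wording above, the Handshake class and build_start_frame helper are hypothetical, and the field names inside the "started" message are not documented here, so the message is stored whole rather than unpacked.

```python
import json


def build_start_frame(session_params: dict) -> str:
    """Serialize the first frame: type "start" with the session parameters.

    Assumption: the parameters sit under a "payload" key.
    """
    return json.dumps({"type": "start", "payload": session_params})


class Handshake:
    """Tracks the "started" then "ready" sequence sent by the server."""

    def __init__(self) -> None:
        self.state = "awaiting_started"
        self.started_message = None  # raw "started" message, kept as-is

    def on_message(self, raw: str) -> str:
        """Feed one server text frame and return the resulting state."""
        msg = json.loads(raw)
        kind = msg.get("type")
        if kind == "error":
            self.state = "failed"
        elif self.state == "awaiting_started" and kind == "started":
            self.started_message = msg
            self.state = "awaiting_ready"
        elif self.state == "awaiting_ready" and kind == "ready":
            self.state = "ready"
        # Any other message leaves the state unchanged.
        return self.state
```

A caller would send build_start_frame(...) as the first frame, pass each incoming text frame to on_message, and begin streaming audio only once the state is "ready".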
Audio Format:
Message Types:
Client to server: start, audio, stop, interrupt, text
Server to client: started, ready, audio, tool_call, interrupted, error, transcript, citation, ended
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
The ID of the agent to use for the voice conversation. If not provided, the default agent is used.
The ID of an existing conversation to continue. If not provided, a new conversation will be created.
Override the agent config for this voice session. Only prompt overrides are supported: voice sessions always use Gemini Live, so LLM model, agent model, planning prompt, and tool settings are not applicable.
Array of file IDs to attach to the voice conversation.
Array of secret IDs to include in the context.
Array of knowledge IDs to make accessible to the agent.
Array of page IDs to make accessible to the agent. The page and all knowledge under it will be accessible.
Override user information for this voice session.
Optional context text for the voice session. When provided, the AI will start by briefly explaining this content before listening for user input. Maximum length: 2000.
Optional initial user message. When provided, the system greeting is skipped and the AI responds directly to this text (e.g. a suggested prompt). Takes precedence over initial_context. Maximum length: 2000.
When true, the session is ephemeral and will not save messages to conversation history.
Voice session configuration options.
Voice session prepared
Object type discriminator. Value: voice.session.
WebSocket URL to connect to. Includes the authentication token as a query parameter.
The resolved agent ID. If no agent was specified in the request, this is the default agent.
Ready-made JSON message to send as the first WebSocket frame after connecting. Contains type: "start" and a payload with all session parameters.
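As a rough sketch of how a client might consume this response, the helper below pulls out the url and start_message fields and produces the text to send as the first WebSocket frame. The function name first_frame and the sample values are hypothetical. Since the docs do not say whether start_message arrives as a JSON object or as a pre-serialized string, both cases are handled.

```python
import json
from typing import Tuple


def first_frame(prepared: dict) -> Tuple[str, str]:
    """Return (url, frame_text) from a prepared voice session response."""
    url = prepared["url"]
    start_message = prepared["start_message"]
    # Assumption: start_message may be an object or an already-serialized string.
    if isinstance(start_message, str):
        frame_text = start_message
    else:
        frame_text = json.dumps(start_message)
    return url, frame_text


# Illustrative response; the values are made up for this example.
sample = {
    "url": "wss://api.datagrid.com/ws/voice?token=EXAMPLE",
    "start_message": {"type": "start", "payload": {}},
}
url, frame_text = first_frame(sample)
print(url)
print(frame_text)
```

The returned frame_text is what would be sent immediately after the connection to url opens, before any audio.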