AstrBot Live API Protocol
This document describes the current WebSocket protocol for AstrBot Live API.
Endpoint
- Legacy JWT endpoint: /api/live_chat/ws
- Legacy unified JWT endpoint: /api/unified_chat/ws
- Open API endpoint: /api/v1/live/ws
Authentication
Legacy dashboard endpoints
Pass a dashboard JWT in the token query parameter.
Example:
ws://localhost:6185/api/live_chat/ws?token=<dashboard_jwt>
Open API endpoint
Use an API key and provide username in the query string.
Examples:
ws://localhost:6185/api/v1/live/ws?api_key=<api_key>&username=alice
ws://localhost:6185/api/v1/live/ws?api_key=<api_key>&username=alice&ct=chat
ct values:
- live: voice conversation mode
- chat: unified chat mode over the same WebSocket transport
The Open API endpoint reuses the chat API key scope.
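As a minimal sketch, the Open API URL can be assembled with Python's standard library. The helper name build_live_url and the base address are illustrative assumptions, not part of AstrBot; only the path and the api_key, username, and ct query parameters come from this document.

```python
from urllib.parse import urlencode


def build_live_url(base: str, api_key: str, username: str, ct: str = "live") -> str:
    """Build an Open API WebSocket URL (hypothetical helper, not an AstrBot API)."""
    if ct not in ("live", "chat"):
        raise ValueError("ct must be 'live' or 'chat'")
    # urlencode percent-escapes values so unusual usernames stay valid in the query.
    query = urlencode({"api_key": api_key, "username": username, "ct": ct})
    return f"{base.rstrip('/')}/api/v1/live/ws?{query}"


# "my-key" is a placeholder value, not a real key.
url = build_live_url("ws://localhost:6185", "my-key", "alice", ct="chat")
```

This sketch always sends ct explicitly, so the client does not depend on whatever the server does when the parameter is omitted.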
Transport
- Protocol: WebSocket
- Payload format: UTF-8 JSON text frames
- Audio upload format in live mode:
  - client sends raw PCM frames encoded as Base64
  - sample rate: 16000
  - channels: 1
  - sample width: 16-bit
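The audio parameters above fix the byte rate at 16000 × 1 × 2 = 32000 bytes per second. The sketch below splits a raw PCM buffer into frames and Base64-encodes each one. The 20 ms frame duration and the function name are assumptions; this document does not specify a frame size.

```python
import base64

SAMPLE_RATE = 16000  # Hz
CHANNELS = 1
SAMPLE_WIDTH = 2     # bytes per sample (16-bit)


def pcm_to_base64_frames(pcm: bytes, frame_ms: int = 20) -> list[str]:
    """Split raw PCM into fixed-duration frames and Base64-encode each one.

    frame_ms is an assumed value; the protocol text does not state a frame size.
    """
    frame_bytes = SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH * frame_ms // 1000
    return [
        base64.b64encode(pcm[i:i + frame_bytes]).decode("ascii")
        for i in range(0, len(pcm), frame_bytes)
    ]


# One second of silence: 32000 zero bytes, split into 640-byte frames.
one_second_of_silence = bytes(SAMPLE_RATE * CHANNELS * SAMPLE_WIDTH)
frames = pcm_to_base64_frames(one_second_of_silence)
```

Each returned string can be placed in the data field of a speaking_part message. The last frame may be shorter than the others when the buffer length is not a multiple of the frame size.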
Top-Level Envelope
Client to server
{
"t": "message_type",
"...": "message specific fields"
}
When using the unified socket, the client can also include:
{
"ct": "live|chat",
"t": "message_type"
}
Server to client
Legacy live mode uses:
{
"t": "message_type",
"data": {}
}
Unified chat mode uses:
{
"ct": "chat",
"type": "message_type",
"data": {}
}
Some forwarded chat frames may also contain t, streaming, chain_type, message_id, or session_id.
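Because live frames carry the message name in t while chat frames usually carry it in type, a client may want to normalize both shapes before dispatching. The following is a sketch only: normalize_frame and the returned keys are hypothetical, and treating frames without ct as live mode is an assumption.

```python
import json
from typing import Any


def normalize_frame(raw: str) -> dict[str, Any]:
    """Map both server envelope shapes onto one dict (hypothetical helper).

    Expects the frame to be a JSON object.
    """
    frame = json.loads(raw)
    return {
        # Assumption: a frame without "ct" belongs to live mode.
        "ct": frame.get("ct", "live"),
        # Chat frames usually carry "type"; live frames and chat errors carry "t".
        "kind": frame.get("type") or frame.get("t"),
        "data": frame.get("data"),
        "raw": frame,
    }


live = normalize_frame('{"t": "end"}')
chat = normalize_frame('{"ct": "chat", "type": "plain", "data": "Hello"}')
```

Keeping the original frame under raw leaves fields such as streaming, message_id, and session_id available to the caller.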
Live Mode
Client messages
start_speaking
Start a voice capture segment.
{
"t": "start_speaking",
"stamp": "seg_001"
}
speaking_part
Send one audio frame.
{
"t": "speaking_part",
"data": "<base64_pcm_bytes>"
}
end_speaking
Finish the current voice capture segment.
{
"t": "end_speaking",
"stamp": "seg_001"
}
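Taken together, the three messages above form one voice segment. The sketch below yields the JSON text frames in send order. The function name is hypothetical, and reusing the same stamp for start_speaking and end_speaking is inferred from the examples rather than stated as a rule.

```python
import base64
import json
from typing import Iterable, Iterator


def voice_segment_messages(stamp: str, pcm_frames: Iterable[bytes]) -> Iterator[str]:
    """Yield the JSON text frames for one voice capture segment, in send order."""
    yield json.dumps({"t": "start_speaking", "stamp": stamp})
    for frame in pcm_frames:
        payload = base64.b64encode(frame).decode("ascii")
        yield json.dumps({"t": "speaking_part", "data": payload})
    # Assumption: the closing message repeats the opening stamp, as in the examples.
    yield json.dumps({"t": "end_speaking", "stamp": stamp})


# Two 640-byte frames of dummy PCM data.
messages = list(voice_segment_messages("seg_001", [b"\x00\x01" * 320, b"\x00\x01" * 320]))
```

Each yielded string is meant to be sent as one WebSocket text frame by whichever client library is in use.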
text_input
Send plain text directly while using ct=live. The server still routes the message through Live mode, so TTS output and interrupt handling still apply.
{
"t": "text_input",
"text": "Hello, what is the weather today?"
}
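A small sketch of serializing this message follows. The helper name is hypothetical, and rejecting blank text is a client-side choice made here, not a documented server rule.

```python
import json


def text_input_message(text: str) -> str:
    """Serialize a text_input frame for ct=live (hypothetical helper)."""
    cleaned = text.strip()
    if not cleaned:
        # Assumption: blank input is not worth sending; the server rule is unknown.
        raise ValueError("text must not be empty")
    # ensure_ascii=False keeps non-ASCII text readable in the UTF-8 JSON frame.
    return json.dumps({"t": "text_input", "text": cleaned}, ensure_ascii=False)


frame = text_input_message("Hello, what is the weather today?")
```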
interrupt
Interrupt the current model or TTS response.
{
"t": "interrupt"
}
Server messages
metrics
Performance and provider metadata.
Example:
{
"t": "metrics",
"data": {
"wav_assemble_time": 0.12,
"stt": "whisper_api",
"llm_ttft": 0.84,
"tts_total_time": 1.72
}
}
user_msg
STT result from the uploaded audio.
{
"t": "user_msg",
"data": {
"text": "Hello there",
"ts": 1710000000000
}
}
bot_delta_chunk
Raw model text delta. This is the token or chunk level stream and is not sentence segmented.
{
"t": "bot_delta_chunk",
"data": {
"text": "Hel"
}
}
Notes:
- This event is generated directly from the model streaming path.
- It is independent from TTS chunking.
- Consumers should append data.text to a local buffer.
bot_text_chunk
Text associated with the current TTS chunk. This is usually sentence or phrase segmented.
{
"t": "bot_text_chunk",
"data": {
"text": "Hello there."
}
}
Notes:
- This event is aligned to TTS output, not raw token streaming.
- It may be coarser than bot_delta_chunk.
response
One TTS audio chunk, Base64 encoded.
{
"t": "response",
"data": "<base64_audio_bytes>"
}
bot_msg
Final bot text, sent when the response completes without audio streaming.
{
"t": "bot_msg",
"data": {
"text": "Final reply text",
"ts": 1710000001234
}
}
stop_play
Stop client-side audio playback because the response was interrupted.
{
"t": "stop_play"
}
end
Marks the end of the current response turn.
{
"t": "end"
}
error
Recoverable or terminal processing error.
{
"t": "error",
"data": "error message"
}
Unified Chat Mode
Set ct=chat on the Open API endpoint or include "ct": "chat" in each client frame when using /api/unified_chat/ws.
Client messages
bind
Subscribe to an existing webchat session.
{
"ct": "chat",
"t": "bind",
"session_id": "session_001"
}
send
Send a chat request.
{
"ct": "chat",
"t": "send",
"username": "alice",
"session_id": "session_001",
"message_id": "msg_001",
"message": [
{
"type": "plain",
"text": "Please summarize this"
}
],
"selected_provider": "openai_chat_completion",
"selected_model": "gpt-4.1-mini",
"enable_streaming": true
}
message uses the same message-part schema as POST /api/v1/chat.
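The send frame can be assembled as in the sketch below, which mirrors the fields of the example above for a single plain-text part. The function name is hypothetical. Whether any of these fields are optional is not stated in this document, so the sketch always sends all of them.

```python
import json


def chat_send_message(
    username: str,
    session_id: str,
    message_id: str,
    text: str,
    provider: str,
    model: str,
    streaming: bool = True,
) -> str:
    """Serialize a chat "send" frame with one plain-text part (hypothetical helper)."""
    frame = {
        "ct": "chat",
        "t": "send",
        "username": username,
        "session_id": session_id,
        "message_id": message_id,
        # One part of type "plain"; other part types follow POST /api/v1/chat.
        "message": [{"type": "plain", "text": text}],
        "selected_provider": provider,
        "selected_model": model,
        "enable_streaming": streaming,
    }
    return json.dumps(frame, ensure_ascii=False)


payload = chat_send_message(
    "alice", "session_001", "msg_001", "Please summarize this",
    "openai_chat_completion", "gpt-4.1-mini",
)
```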
interrupt
Interrupt the current chat response.
{
"ct": "chat",
"t": "interrupt"
}
Server messages
session_bound
Acknowledges a successful bind.
{
"ct": "chat",
"type": "session_bound",
"session_id": "session_001",
"message_id": "ws_sub_xxx"
}
Forwarded streaming events
The server forwards the normal webchat queue payloads. Common examples:
{
"ct": "chat",
"type": "plain",
"data": "Hello",
"streaming": true,
"chain_type": null,
"message_id": "msg_001"
}
{
"ct": "chat",
"type": "image",
"data": "[IMAGE]file.jpg",
"streaming": false,
"message_id": "msg_001"
}
{
"ct": "chat",
"type": "agent_stats",
"data": {
"time_to_first_token": 0.8
}
}
{
"ct": "chat",
"type": "message_saved",
"data": {
"id": 123,
"created_at": "2026-03-16T10:00:00Z"
}
}
{
"ct": "chat",
"type": "end",
"data": "",
"streaming": false,
"message_id": "msg_001"
}
Chat errors
{
"ct": "chat",
"t": "error",
"code": "INVALID_MESSAGE_FORMAT",
"data": "message must be list"
}
Recommended Client Strategy
For live mode:
- Append every bot_delta_chunk.data.text into a raw transcript buffer.
- Use bot_text_chunk only when you need text aligned with audio playback.
- Decode and play each response audio chunk in arrival order.
- Reset per-turn buffers after end.
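The live-mode steps above can be sketched as a small per-turn state object. LiveTurn and apply_live_frame are hypothetical names, audio is only collected rather than played, and message types outside this list (metrics, user_msg, bot_msg, stop_play, error) are ignored here for brevity.

```python
import base64
import json
from dataclasses import dataclass, field


@dataclass
class LiveTurn:
    transcript: str = ""                                     # raw text from bot_delta_chunk
    spoken_text: list[str] = field(default_factory=list)     # text aligned with TTS chunks
    audio_chunks: list[bytes] = field(default_factory=list)  # decoded audio, arrival order
    finished: bool = False


def apply_live_frame(turn: LiveTurn, raw: str) -> None:
    """Update per-turn buffers from one server frame (hypothetical helper)."""
    frame = json.loads(raw)
    kind, data = frame.get("t"), frame.get("data")
    if kind == "bot_delta_chunk":
        turn.transcript += data["text"]
    elif kind == "bot_text_chunk":
        turn.spoken_text.append(data["text"])
    elif kind == "response":
        turn.audio_chunks.append(base64.b64decode(data))
    elif kind == "end":
        turn.finished = True


turn = LiveTurn()
for raw in [
    json.dumps({"t": "bot_delta_chunk", "data": {"text": "Hel"}}),
    json.dumps({"t": "bot_delta_chunk", "data": {"text": "lo"}}),
    json.dumps({"t": "bot_text_chunk", "data": {"text": "Hello."}}),
    json.dumps({"t": "response", "data": base64.b64encode(b"\x01\x02").decode("ascii")}),
    json.dumps({"t": "end"}),
]:
    apply_live_frame(turn, raw)
```

Once finished is set, the caller can hand the buffers to the UI and start the next turn with a fresh LiveTurn(), which covers the reset step.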
For chat mode:
- Treat plain + streaming=true as incremental text.
- Treat complete or end as the end of a response turn.
- Persist message_saved metadata if you need server-side history IDs.
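The chat-mode steps above can be sketched the same way. ChatTurn and apply_chat_frame are hypothetical names. The handling of plain frames with streaming=false is not described in this document, so the sketch leaves them untouched.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ChatTurn:
    text: str = ""                          # incremental text from streaming plain frames
    saved: Optional[dict[str, Any]] = None  # message_saved metadata, if received
    finished: bool = False


def apply_chat_frame(turn: ChatTurn, raw: str) -> None:
    """Update chat turn state from one forwarded frame (hypothetical helper)."""
    frame = json.loads(raw)
    kind = frame.get("type")
    if kind == "plain" and frame.get("streaming"):
        turn.text += frame["data"]
    elif kind == "message_saved":
        turn.saved = frame["data"]
    elif kind in ("complete", "end"):
        turn.finished = True


chat_turn = ChatTurn()
for raw in [
    json.dumps({"ct": "chat", "type": "plain", "data": "Hel", "streaming": True}),
    json.dumps({"ct": "chat", "type": "plain", "data": "lo", "streaming": True}),
    json.dumps({"ct": "chat", "type": "message_saved",
                "data": {"id": 123, "created_at": "2026-03-16T10:00:00Z"}}),
    json.dumps({"ct": "chat", "type": "end", "data": "", "streaming": False}),
]:
    apply_chat_frame(chat_turn, raw)
```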
Compatibility Notes
- bot_text_chunk remains sentence or phrase segmented for TTS compatibility.
- bot_delta_chunk is the new delta-level text event for real-time rendering.
- The legacy JWT endpoints and the new Open API endpoint share the same runtime behavior after authentication.