Create chat completion
/v1/chat/completions
Request body
Name of the Polpo agent to handle the request. Must match an agent deployed to your project.
Array of message objects forming the conversation. Each message has a
role and content field.When
true, the response is delivered as a Server-Sent Events stream. When false, the full response is returned as a single JSON object.Sampling temperature between 0 and 2. Lower values produce more deterministic output. Passed through to the underlying model.
Maximum number of tokens to generate. Passed through to the underlying model.
Headers
Pass a session ID to continue an existing conversation. Without this header, a new session is always created. Set to
"new" to force a new session even when a header is present.x-session-id header with the session ID. Store it on the client to maintain conversation continuity.
Message format
Each message in themessages array:
| Field | Type | Description |
|---|---|---|
role | string | One of system, user, assistant, or tool |
content | string | ContentPart[] | Plain text or an array of content parts |
tool_call_id | string | Required for role: "tool" — the ID of the tool call this message responds to |
name | string | Tool name (for role: "tool" messages) |
content field accepts either a plain string or an array of content parts for multimodal messages:
| Content part type | Fields | Description |
|---|---|---|
text | text: string | Plain text content |
image_url | image_url: { url, detail? } | Image (data URL or HTTPS URL) |
file | file_id: string | Reference to a file by workspace path |
Response format
The response matches the OpenAI chat completion object:Streaming format
Whenstream: true, the endpoint returns Server-Sent Events. Each event contains a delta:
Examples
Non-streaming request
Streaming request
Because the response format is identical to OpenAI’s, you can switch between Polpo and OpenAI by changing only the
baseURL and apiKey. The model field maps to your agent name in Polpo.How does agent routing work?
How does agent routing work?
The
agent field (or model field when using the OpenAI SDK) tells Polpo which agent should handle the request. The agent’s configured model, system prompt, tools, and reasoning depth are all applied automatically. You do not need to specify the underlying LLM model — that is part of the agent’s configuration.Can I use function calling / tools?
Can I use function calling / tools?
Most tools are executed server-side by the agent automatically. However, some tools are client-side — the agent calls them, but the server returns the tool call to your client for handling.The built-in client-side tool is
ask_user_question: the agent asks the user clarifying questions, your UI shows them, and you send back the answer.When the agent calls a client-side tool, the response has finish_reason: "tool_calls" with the tool call in delta.tool_calls. Your client handles it and sends the result back as a role: "tool" message in the next request.What about multi-turn conversations?
What about multi-turn conversations?
Pass the full conversation history in the
messages array, just like with the OpenAI API. Polpo does not maintain conversation state between requests — you are responsible for accumulating messages on the client side.