The chat completions endpoint is the primary way to interact with your Polpo agents. It follows the OpenAI API format exactly, so any OpenAI-compatible client library works out of the box.

Create chat completion

POST /v1/chat/completions
Send a conversation to a Polpo agent and receive a response.

Request body

agent
string
required
Name of the Polpo agent to handle the request. Must match an agent deployed to your project.
messages
object[]
required
Array of message objects forming the conversation. Each message has a role and content field.
stream
boolean
default: false
When true, the response is delivered as a Server-Sent Events stream. When false, the full response is returned as a single JSON object.
temperature
number
Sampling temperature between 0 and 2. Lower values produce more deterministic output. Passed through to the underlying model.
max_tokens
number
Maximum number of tokens to generate. Passed through to the underlying model.
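
The parameters above can be assembled into a request body as in the following minimal sketch. The function name `build_request_body` and its client-side checks are illustrative assumptions, not part of the Polpo API; the server remains the authority on validation.

```python
from typing import Any, Optional


def build_request_body(
    agent: str,
    messages: list[dict[str, Any]],
    stream: bool = False,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble the JSON body for /v1/chat/completions (hypothetical helper)."""
    # Client-side sanity checks; these mirror the documented constraints
    # but are an assumption about what is worth rejecting early.
    if not agent:
        raise ValueError("agent is required")
    if not messages:
        raise ValueError("messages is required")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body: dict[str, Any] = {"agent": agent, "messages": messages, "stream": stream}
    # Optional fields are left out entirely when unset.
    if temperature is not None:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    return body
```

Serializing the returned dictionary with `json.dumps` gives the payload to send.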

Headers

x-session-id
string
Pass a session ID to continue an existing conversation. If the header is omitted, a new session is created. Set the value to "new" to force a new session while still sending the header.
The response always includes an x-session-id header with the session ID. Store it on the client to maintain conversation continuity.
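
As a rough sketch of the session handling described above, the helpers below attach `x-session-id` to outgoing requests and read it back from response headers. Both function names are hypothetical, and the header mapping is assumed to be a plain dictionary as exposed by most HTTP clients.

```python
from typing import Mapping, Optional


def request_headers(api_key: str, session_id: Optional[str] = None) -> dict[str, str]:
    """Build request headers, adding x-session-id only when one is known."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if session_id is not None:
        headers["x-session-id"] = session_id
    return headers


def session_id_from_response(response_headers: Mapping[str, str]) -> Optional[str]:
    """Return the x-session-id response header, or None if it is absent."""
    # HTTP header names are case-insensitive, so compare in lowercase.
    for name, value in response_headers.items():
        if name.lower() == "x-session-id":
            return value
    return None
```

A client would typically store the value returned by `session_id_from_response` and pass it to `request_headers` on the next call.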

Message format

Each message in the messages array:
| Field | Type | Description |
| --- | --- | --- |
| role | string | One of system, user, assistant, or tool |
| content | string or ContentPart[] | Plain text or an array of content parts |
| tool_call_id | string | Required for role: "tool". The ID of the tool call this message responds to |
| name | string | Tool name (for role: "tool" messages) |
The content field accepts either a plain string or an array of content parts for multimodal messages:
| Content part type | Fields | Description |
| --- | --- | --- |
| text | text: string | Plain text content |
| image_url | image_url: { url, detail? } | Image (data URL or HTTPS URL) |
| file | file_id: string | Reference to a file by workspace path |
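
To illustrate the content part shapes, here is a small sketch that builds a multimodal user message. It assumes each part carries a `type` discriminator key, as in the OpenAI format; that key and the helper names are assumptions not confirmed by the table above.

```python
from typing import Any, Optional


def text_part(text: str) -> dict[str, Any]:
    return {"type": "text", "text": text}


def image_part(url: str, detail: Optional[str] = None) -> dict[str, Any]:
    # url may be a data URL or an HTTPS URL; detail is optional.
    image: dict[str, Any] = {"url": url}
    if detail is not None:
        image["detail"] = detail
    return {"type": "image_url", "image_url": image}


def file_part(workspace_path: str) -> dict[str, Any]:
    # file_id holds a workspace path, per the table above.
    return {"type": "file", "file_id": workspace_path}


def user_message(*parts: dict[str, Any]) -> dict[str, Any]:
    """Wrap content parts in a user message."""
    return {"role": "user", "content": list(parts)}
```

For example, `user_message(text_part("Describe this image."), image_part("https://example.com/cat.png"))` yields a message whose content is a two-element array.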

Response format

The response matches the OpenAI chat completion object:
{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "backend-dev",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
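
A minimal sketch of reading the fields most clients need from this object follows. `extract_reply` is a hypothetical helper, and `SAMPLE_RESPONSE` is a trimmed copy of the sample above.

```python
from typing import Any, Optional

# Trimmed copy of the sample response shown above.
SAMPLE_RESPONSE: dict[str, Any] = {
    "id": "chatcmpl_abc123",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21},
}


def extract_reply(response: dict[str, Any]) -> tuple[str, str, Optional[int]]:
    """Return (assistant text, finish_reason, total token count)."""
    choice = response["choices"][0]
    # Assumption: content may be null when the turn carries no text.
    text = choice["message"].get("content") or ""
    total = response.get("usage", {}).get("total_tokens")
    return text, choice["finish_reason"], total
```

Calling `extract_reply(SAMPLE_RESPONSE)` returns the greeting text, `"stop"`, and `21`.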

Streaming format

When stream: true, the endpoint returns Server-Sent Events. Each event contains a delta:
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
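
The stream above can be consumed with a sketch like the following, which concatenates the `delta.content` fragments until `[DONE]`. It assumes each event is a single `data:` line, as in the example; the function names are illustrative.

```python
import json
from typing import Any, Iterable, Iterator, Optional


def iter_chunks(lines: Iterable[str]) -> Iterator[dict[str, Any]]:
    """Yield decoded chunk objects from raw SSE lines, stopping at [DONE]."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return
        yield json.loads(payload)


def collect_text(lines: Iterable[str]) -> tuple[str, Optional[str]]:
    """Join all delta.content fragments and report the final finish_reason."""
    parts: list[str] = []
    finish_reason: Optional[str] = None
    for chunk in iter_chunks(lines):
        choice = chunk["choices"][0]
        text = choice["delta"].get("content")
        if text:
            parts.append(text)
        if choice.get("finish_reason"):
            finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason
```

Fed the four events shown above, `collect_text` returns `("Hello!", "stop")`. In a UI, each fragment would usually be rendered as it arrives instead of being collected first.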

Examples

Non-streaming request

curl https://{project}.polpo.cloud/v1/chat/completions \
  -H "Authorization: Bearer sk_live_abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "backend-dev",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a function to reverse a string in TypeScript."}
    ],
    "stream": false
  }'
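
For clients that do not shell out to curl, the same request can be sketched with Python's standard library. `build_request` is a hypothetical helper; it only constructs the request object and does not send it.

```python
import json
import urllib.request
from typing import Any


def build_request(project: str, api_key: str, body: dict[str, Any]) -> urllib.request.Request:
    """Construct (but do not send) the POST request from the curl example."""
    url = f"https://{project}.polpo.cloud/v1/chat/completions"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends it, and the response body can then be decoded with `json.load`.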

Streaming request

curl https://{project}.polpo.cloud/v1/chat/completions \
  -H "Authorization: Bearer sk_live_abc123" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "agent": "backend-dev",
    "messages": [
      {"role": "user", "content": "Explain dependency injection in 3 sentences."}
    ],
    "stream": true
  }'
Because the response format is identical to OpenAI’s, you can switch between Polpo and OpenAI by changing only the baseURL and apiKey. The model field maps to your agent name in Polpo.
The agent field (or model field when using the OpenAI SDK) tells Polpo which agent should handle the request. The agent’s configured model, system prompt, tools, and reasoning depth are all applied automatically. You do not need to specify the underlying LLM model — that is part of the agent’s configuration.
Most tools are executed server-side by the agent automatically. However, some tools are client-side: the agent calls them, but the server returns the tool call to your client for handling.
The built-in client-side tool is ask_user_question: the agent asks the user clarifying questions, your UI shows them, and you send back the answer.
When the agent calls a client-side tool, the response has finish_reason: "tool_calls" with the tool call in delta.tool_calls. Your client handles it and sends the result back as a role: "tool" message in the next request.
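
The round trip for a client-side tool might look like the sketch below. It assumes tool calls use the OpenAI shape (`id`, plus `function.name` and a JSON-encoded `function.arguments` string) and that the assistant turn carrying the call is resent before the tool result; neither detail is confirmed here, and the helper names are hypothetical.

```python
import json
from typing import Any


def tool_arguments(tool_call: dict[str, Any]) -> dict[str, Any]:
    """Decode the JSON-encoded arguments string (assumed OpenAI shape)."""
    return json.loads(tool_call["function"]["arguments"] or "{}")


def tool_result_message(tool_call: dict[str, Any], result: str) -> dict[str, Any]:
    """Build the role: "tool" message that answers a tool call."""
    return {
        "role": "tool",
        "tool_call_id": tool_call["id"],
        "name": tool_call["function"]["name"],
        "content": result,
    }


def messages_with_tool_result(
    history: list[dict[str, Any]], tool_call: dict[str, Any], result: str
) -> list[dict[str, Any]]:
    """Return the messages array for the follow-up request."""
    # Assumption (OpenAI convention): the assistant turn that issued the
    # call precedes the tool result in the next request.
    assistant_turn = {"role": "assistant", "content": None, "tool_calls": [tool_call]}
    return [*history, assistant_turn, tool_result_message(tool_call, result)]
```

For ask_user_question, a UI would show the decoded arguments to the user, then pass the user's answer as `result`.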
Pass the full conversation history in the messages array, just like with the OpenAI API. Polpo does not maintain conversation state between requests — you are responsible for accumulating messages on the client side.
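
One way to accumulate messages on the client is sketched below. The `Conversation` class is a hypothetical helper, and it assumes non-streaming responses in the format documented above.

```python
from typing import Any, Optional


class Conversation:
    """Client-side message history, resent in full with every request."""

    def __init__(self, system_prompt: Optional[str] = None) -> None:
        self.messages: list[dict[str, Any]] = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def add_user(self, text: str) -> list[dict[str, Any]]:
        """Record a user turn and return a copy of the messages array to send."""
        self.messages.append({"role": "user", "content": text})
        return list(self.messages)

    def add_response(self, response: dict[str, Any]) -> None:
        """Record the assistant message from a non-streaming response."""
        self.messages.append(response["choices"][0]["message"])
```

Each call to `add_user` returns the full history so far, which becomes the `messages` field of the next request.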