Chat Completions

The chat completions endpoint is the primary way to interact with your Polpo agents. It follows the OpenAI Chat Completions API format, so any OpenAI-compatible client library works out of the box.

Create chat completion

POST /v1/chat/completions

Send a conversation to a Polpo agent and receive a response.

Request body

agent

string

required

Name of the Polpo agent to handle the request. Must match an agent deployed to your project.

messages

object[]

required

Array of message objects forming the conversation. Each message has a role and content field.

stream

boolean

default:"false"

When true, the response is delivered as a Server-Sent Events stream. When false, the full response is returned as a single JSON object.

temperature

number

Sampling temperature between 0 and 2. Lower values produce more deterministic output. Passed through to the underlying model.

max_tokens

number

Maximum number of tokens to generate. Passed through to the underlying model.

Headers

x-session-id

string

Pass a session ID to continue an existing conversation. If the header is omitted, a new session is created. Set the value to "new" to force a new session even when your client always sends the header.

The response always includes an x-session-id header with the session ID. Store it on the client to maintain conversation continuity.
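The header round trip described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not an official client: `build_request` and `session_id_from` are hypothetical helper names, and the base URL and API key are placeholders you would supply yourself. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_request(base_url, api_key, agent, messages, session_id=None):
    """Build (but do not send) a chat completion request.

    Hypothetical helper: pass session_id to continue a conversation,
    or leave it as None so the server creates a new session.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if session_id is not None:
        headers["x-session-id"] = session_id
    body = json.dumps({"agent": agent, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers=headers,
        method="POST",
    )


def session_id_from(response_headers):
    """Read x-session-id from a mapping of response headers (case-insensitive)."""
    for name, value in response_headers.items():
        if name.lower() == "x-session-id":
            return value
    return None
```

A client would typically call `session_id_from` on each response and pass the stored value as `session_id` on the next request.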

Message format

Each message in the messages array:

Field	Type	Description
`role`	string	One of `system`, `user`, `assistant`, or `tool`
`content`	string \| ContentPart[]	Plain text or an array of content parts
`tool_call_id`	string	Required for `role: "tool"` — the ID of the tool call this message responds to
`name`	string	Tool name (for `role: "tool"` messages)

The content field accepts either a plain string or an array of content parts for multimodal messages:

Content part type	Fields	Description
`text`	`text: string`	Plain text content
`image_url`	`image_url: { url, detail? }`	Image (data URL or HTTPS URL)
`file`	`file_id: string`	Reference to a file by workspace path
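As a rough illustration of the two tables above, the helpers below build a user message from content parts. The function names are hypothetical, and the exact nesting of each part (in particular `file_id` sitting directly on the `file` part) is an assumption read off the table rather than a confirmed schema.

```python
def text_part(text):
    """Content part carrying plain text."""
    return {"type": "text", "text": text}


def image_part(url, detail=None):
    """Content part for an image given as a data URL or HTTPS URL."""
    image = {"url": url}
    if detail is not None:
        image["detail"] = detail  # optional, per the table's `detail?`
    return {"type": "image_url", "image_url": image}


def file_part(file_id):
    """Content part referencing a file (assumed shape: top-level file_id)."""
    return {"type": "file", "file_id": file_id}


def user_message(*parts):
    """Build a user message: a lone string stays a string, otherwise a part list."""
    if len(parts) == 1 and isinstance(parts[0], str):
        return {"role": "user", "content": parts[0]}
    return {"role": "user", "content": list(parts)}
```

For example, `user_message(text_part("What is in this image?"), image_part("https://example.com/cat.png"))` yields a multimodal message, while `user_message("Hello")` yields the plain-string form.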

Response format

The response matches the OpenAI chat completion object:

{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "backend-dev",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
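Reading the fields of the object above might look like this. It is a small sketch with a hypothetical helper name, and it assumes a non-streaming response that has at least one choice and a `usage` block, as in the sample.

```python
import json


def summarize_completion(raw_json):
    """Extract the assistant text, finish reason, and token total from a response body."""
    data = json.loads(raw_json)
    choice = data["choices"][0]  # the sample response carries a single choice
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }
```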

Streaming format

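When stream: true, the endpoint returns Server-Sent Events. Each event is a chat.completion.chunk whose delta carries the next piece of the message. The final chunk sets finish_reason, and the stream ends with data: [DONE]: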

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1710000000,"model":"backend-dev","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
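One way to reassemble the streamed text is to concatenate `delta.content` across chunks until `[DONE]` arrives. The sketch below assumes the lines have already been read from the HTTP response and decoded to strings. `collect_stream` is a hypothetical name, and the parser handles only the `data:` lines shown above, not the full SSE grammar.

```python
import json


def collect_stream(lines):
    """Join delta.content from an iterable of SSE lines.

    Returns (full_text, finish_reason). Blank lines and non-data lines are skipped.
    """
    pieces = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        choice = json.loads(payload)["choices"][0]
        pieces.append(choice["delta"].get("content") or "")
        if choice["finish_reason"] is not None:
            finish_reason = choice["finish_reason"]
    return "".join(pieces), finish_reason
```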

Examples

Non-streaming request

curl https://{project}.polpo.cloud/v1/chat/completions \
  -H "Authorization: Bearer sk_live_abc123" \
  -H "Content-Type: application/json" \
  -d '{
    "agent": "backend-dev",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Write a function to reverse a string in TypeScript."}
    ],
    "stream": false
  }'

Streaming request

curl https://{project}.polpo.cloud/v1/chat/completions \
  -H "Authorization: Bearer sk_live_abc123" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "agent": "backend-dev",
    "messages": [
      {"role": "user", "content": "Explain dependency injection in 3 sentences."}
    ],
    "stream": true
  }'

Because the response format is identical to OpenAI’s, you can switch between Polpo and OpenAI by changing only the baseURL and apiKey. The model field maps to your agent name in Polpo.

How does agent routing work?

The agent field (or model field when using the OpenAI SDK) tells Polpo which agent should handle the request. The agent’s configured model, system prompt, tools, and reasoning depth are all applied automatically. You do not need to specify the underlying LLM model — that is part of the agent’s configuration.

Can I use function calling / tools?

Most tools are executed server-side by the agent automatically. However, some tools are client-side: the agent calls them, but the server returns the tool call to your client for handling.

The built-in client-side tool is ask_user_question: the agent asks the user clarifying questions, your UI shows them, and you send back the answer.

When the agent calls a client-side tool, the response has finish_reason: "tool_calls" with the tool call in delta.tool_calls. Your client handles it and sends the result back as a role: "tool" message in the next request.
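The client-side tool flow could be handled along these lines. This is a sketch under stated assumptions: it presumes each tool call has the OpenAI-style shape (`id`, `function.name`, and `function.arguments` as a JSON string) and that any streamed fragments have already been assembled into complete tool calls. `tool_result_messages` and the `answer_tool` callback are hypothetical names.

```python
import json


def tool_result_messages(choice, answer_tool):
    """Turn the tool calls in a choice into role: "tool" reply messages.

    answer_tool(name, args) is a caller-supplied callback that returns the
    result as a string, for example the user's answer to ask_user_question.
    Returns [] when the choice did not finish with tool calls.
    """
    if choice.get("finish_reason") != "tool_calls":
        return []
    # Look in delta first, then fall back to message if delta is absent or empty.
    container = choice.get("delta") or choice.get("message") or {}
    replies = []
    for call in container.get("tool_calls", []):
        name = call["function"]["name"]
        args = json.loads(call["function"]["arguments"] or "{}")
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "name": name,
            "content": answer_tool(name, args),
        })
    return replies
```

The returned messages are meant to be appended to the `messages` array of the next request.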

What about multi-turn conversations?

Pass the full conversation history in the messages array, just like with the OpenAI API. Polpo does not maintain conversation state between requests — you are responsible for accumulating messages on the client side.
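Accumulating history on the client can be as simple as the sketch below. The `Conversation` class is hypothetical, and it assumes replies arrive in the non-streaming response shape shown earlier on this page.

```python
class Conversation:
    """Keeps the running message list and produces request bodies from it."""

    def __init__(self, agent, system_prompt=None):
        self.agent = agent
        self.messages = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def next_request(self, user_text):
        """Append the user's turn and return the request body to send."""
        self.messages.append({"role": "user", "content": user_text})
        return {"agent": self.agent, "messages": list(self.messages)}

    def record_reply(self, completion):
        """Append the assistant message from a parsed completion object."""
        message = completion["choices"][0]["message"]
        self.messages.append({"role": message["role"], "content": message["content"]})
```

Each call to `next_request` returns the full history so far, so the second request already contains the first user turn and the assistant's reply.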
