Streaming Guide

Streaming lets you receive partial responses as they are generated, which reduces perceived latency. The API uses Server-Sent Events (SSE), a standard protocol for streaming events over HTTP.

How it works

  1. Send a request with "stream": true
  2. The server responds with Content-Type: text/event-stream
  3. Each event is a JSON object prefixed with data:
  4. The delta field contains the incremental content
  5. The stream terminates with data: [DONE], a literal sentinel that is not JSON and should not be parsed as such
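
The steps above can be sketched as a small client-side parser. This is a minimal illustration, not the API's official client: the function name iter_stream_events is hypothetical, and the flat {"delta": ...} event shape is an assumption made to keep the sample short (the real payload may nest the delta field differently). It reads lines from any iterable of strings, so it can be fed from an HTTP library's line iterator or from a list in a test.

```python
import json
from typing import Iterable, Iterator


def iter_stream_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each data: line until [DONE]."""
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # The terminator is a literal sentinel, not JSON, so check it first.
        if payload == "[DONE]":
            return
        yield json.loads(payload)


# Hypothetical sample stream; the event shape is assumed for illustration.
sample_lines = [
    'data: {"delta": "Hel"}',
    "",
    'data: {"delta": "lo"}',
    "",
    "data: [DONE]",
]

# Concatenate the incremental content to rebuild the full response.
text = "".join(event["delta"] for event in iter_stream_events(sample_lines))
print(text)
```

Checking for [DONE] before calling json.loads matters: the sentinel is not valid JSON, so decoding it would raise an error.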

Usage statistics

To receive token usage in the final chunk, include "stream_options": { "include_usage": true } in your request.
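
As a rough sketch of how this option might be used, the snippet below builds a request body with usage reporting enabled and picks the usage object out of the decoded chunks. The helper names, the "model" value, the "usage" key, and the "total_tokens" field are all assumptions for illustration; only "stream" and "stream_options" come from this guide.

```python
import json
from typing import Iterable, Optional


def build_stream_request(base: dict) -> dict:
    """Return a copy of base with streaming and usage reporting enabled."""
    body = dict(base)
    body["stream"] = True
    body["stream_options"] = {"include_usage": True}
    return body


def last_usage(chunks: Iterable[dict]) -> Optional[dict]:
    """Return the usage object from the last chunk that carries one.

    The "usage" key is an assumed field name, not confirmed by this guide.
    """
    usage = None
    for chunk in chunks:
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
    return usage


# Hypothetical request fields and decoded chunks, for illustration only.
request_body = build_stream_request({"model": "example-model"})
print(json.dumps(request_body))

chunks = [
    {"delta": "Hi"},
    {"delta": "", "usage": {"total_tokens": 12}},
]
print(last_usage(chunks))
```

Returning None when no chunk carries usage lets the caller distinguish a stream that omitted statistics from one that reported zero tokens.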