Streaming Guide
Streaming allows you to receive partial responses as they are generated, reducing perceived latency. The API uses Server-Sent Events (SSE) — a standard HTTP streaming protocol.
How it works
- Send a request with
"stream": true - The server responds with
Content-Type: text/event-stream - Each event is a JSON object prefixed with
data: - The
deltafield contains the incremental content - The stream terminates with
data: [DONE]
Usage statistics
To receive token usage in the final chunk, include "stream_options": { "include_usage": true } in your request.