Models & Pricing
All prices are in USD per 1M tokens.
| Model | Description | Context | Max Output | Input Price | Output Price | Cache Read |
|---|---|---|---|---|---|---|
| glm-5 | Most capable reasoning model | 203K | 131K | $1.00 | $3.20 | $0.20 |
| glm-4.7 | Flagship reasoning model | 200K | 131K | $0.40 | $1.75 | $0.08 |
| glm-4.7-flash | Fast, cost-efficient variant | 200K | 131K | $0.06 | $0.40 | $0.01 |
| glm-4.6 | Previous-gen reasoning model | 200K | 128K | $0.43 | $1.74 | $0.08 |
| glm-4.5 | General-purpose model | 131K | 96K | $0.60 | $2.20 | $0.11 |
| glm-4.5-air | Lightweight, budget-friendly | 131K | 96K | $0.13 | $0.85 | $0.025 |
| minimax-m2.5 | 196.6K context, efficient reasoning | 196.6K | 196.6K | $0.30 | $1.10 | $0.03 |
Pricing is subject to change. All models support text input/output, tool calling, JSON mode, and streaming. Reasoning is supported on GLM-5, GLM-4.7, GLM-4.6, and MiniMax-M2.5.
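
As a rough illustration of how the per-1M-token rates translate into a per-request cost, here is a minimal Python sketch. The `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced for this example; they are not part of any SDK. The sketch assumes that cached input tokens are billed at the Cache Read rate and the remaining input tokens at the Input Price. Check your actual invoice or usage report to confirm how cached tokens are counted.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the table above.
# The dictionary layout is an assumption made for this example.
PRICES = {
    "glm-5":         {"input": Decimal("1.00"), "output": Decimal("3.20"), "cache_read": Decimal("0.20")},
    "glm-4.7":       {"input": Decimal("0.40"), "output": Decimal("1.75"), "cache_read": Decimal("0.08")},
    "glm-4.7-flash": {"input": Decimal("0.06"), "output": Decimal("0.40"), "cache_read": Decimal("0.01")},
    "glm-4.6":       {"input": Decimal("0.43"), "output": Decimal("1.74"), "cache_read": Decimal("0.08")},
    "glm-4.5":       {"input": Decimal("0.60"), "output": Decimal("2.20"), "cache_read": Decimal("0.11")},
    "glm-4.5-air":   {"input": Decimal("0.13"), "output": Decimal("0.85"), "cache_read": Decimal("0.025")},
    "minimax-m2.5":  {"input": Decimal("0.30"), "output": Decimal("1.10"), "cache_read": Decimal("0.03")},
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # all prices are quoted per 1M tokens


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cached_tokens is the part of input_tokens served from cache
    and billed at the cache-read rate instead of the input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    price = PRICES[model]
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * price["input"]
        + cached_tokens * price["cache_read"]
        + output_tokens * price["output"]
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 100K input tokens (20K of them cached) and 5K output tokens on glm-4.7-flash.
    cost = estimate_cost("glm-4.7-flash", 100_000, 5_000, cached_tokens=20_000)
    print(f"Estimated cost: ${cost}")
```

Under these assumptions, the example request costs $0.007: 80K uncached input tokens at $0.06, 20K cached tokens at $0.01, and 5K output tokens at $0.40, each per 1M tokens. `Decimal` is used instead of `float` so that small per-request amounts do not pick up binary rounding noise when summed over many requests.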