Pricing

Simple per-token pricing. No minimums, no commitments. All prices in USD per 1M tokens.

| Model | Description | Context | Max Output | Input Price (per 1M tokens) | Output Price (per 1M tokens) | Cache Read (per 1M tokens) |
| --- | --- | --- | --- | --- | --- | --- |
| minimax-m2.5 | 196.6K context, efficient reasoning | 196.6K | 196.6K | $0.200 | $1.20 | $0.040 |
| glm-5 | Most capable reasoning model | 203K | 131K | $0.720 | $2.30 | $0.144 |
| glm-4.7 | Flagship reasoning model | 200K | 131K | $0.380 | $1.98 | $0.190 |
| glm-4.7-flash | Fast, cost-efficient variant | 200K | 131K | $0.060 | $0.400 | $0.010 |
| glm-4.5 | General-purpose model | 131K | 96K | $0.600 | $2.20 | $0.110 |
| glm-4.5-air | Lightweight, budget-friendly | 131K | 96K | $0.130 | $0.850 | $0.025 |
| kimi-k2.5 | 262K context, MoE architecture | 262K | 262K | $0.405 | $1.98 | $0.225 |
| qwen3-coder-next | 262K context, fast code generation | 262K | 262K | $0.108 | $0.675 | $0.060 |

Pricing is subject to change. All models support text input and output, tool calling, JSON mode, and streaming. Reasoning is supported on GLM-5, GLM-4.7, and MiniMax-M2.5.
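
To make the per-token arithmetic concrete, here is a minimal Python sketch that estimates the cost of a single request from the rates listed above. The names `Price`, `PRICES`, and `estimate_cost` are hypothetical helpers, not part of any official SDK. The sketch also assumes that cached input tokens are billed at the Cache Read rate instead of the Input Price; this page does not spell out that rule, so treat it as an assumption and check your actual invoice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Rates in USD per 1M tokens, copied from the pricing table."""
    input: float
    output: float
    cache_read: float


# A subset of the table; add other models the same way.
PRICES = {
    "glm-5": Price(input=0.720, output=2.30, cache_read=0.144),
    "glm-4.7-flash": Price(input=0.060, output=0.400, cache_read=0.010),
    "qwen3-coder-next": Price(input=0.108, output=0.675, cache_read=0.060),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption (not stated on the pricing page): cached_tokens are the part
    of input_tokens served from cache, billed at the cache-read rate
    instead of the normal input rate.
    """
    if min(input_tokens, output_tokens, cached_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")

    price = PRICES[model]
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * price.input
        + cached_tokens * price.cache_read
        + output_tokens * price.output
    )
    # Rates are quoted per 1M tokens, so scale down at the end.
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 50K input tokens (20K of them cached) and 2K output tokens.
    cost = estimate_cost("glm-5", 50_000, 2_000, cached_tokens=20_000)
    print(f"Estimated cost: ${cost:.6f}")
```

Because output tokens cost several times more than input tokens on every model listed, long completions tend to dominate the bill; the same function can be used to compare models for a given traffic profile.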