Models & Pricing

All prices are in USD per 1M tokens.

Model          Description                          Context  Max Output  Input Price  Output Price  Cache Read
glm-5          Most capable reasoning model         203K     131K        $1.00        $3.20         $0.20
glm-4.7        Flagship reasoning model             200K     131K        $0.40        $1.75         $0.08
glm-4.7-flash  Fast, cost-efficient variant         200K     131K        $0.06        $0.40         $0.01
glm-4.6        Previous-gen reasoning model         200K     128K        $0.43        $1.74         $0.08
glm-4.5        General-purpose model                131K     96K         $0.60        $2.20         $0.11
glm-4.5-air    Lightweight, budget-friendly         131K     96K         $0.13        $0.85         $0.025
minimax-m2.5   196.6K context, efficient reasoning  196.6K   196.6K      $0.30        $1.10         $0.03

Pricing is subject to change. All models support text input/output, tool calling, JSON mode, and streaming. Reasoning is supported on GLM-5, GLM-4.7, GLM-4.6, and MiniMax-M2.5.
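
As a rough illustration of how the per-1M-token rates translate into the cost of a single request, here is a minimal sketch. The prices are copied from the table above. The function name `estimate_cost` and its parameters are hypothetical, and the billing rule is an assumption: cached tokens are charged at the Cache Read rate and counted separately from uncached input tokens. Check your actual invoice or usage data before relying on it.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table: (input, output, cache read).
PRICES = {
    "glm-5": ("1.00", "3.20", "0.20"),
    "glm-4.7": ("0.40", "1.75", "0.08"),
    "glm-4.7-flash": ("0.06", "0.40", "0.01"),
    "glm-4.6": ("0.43", "1.74", "0.08"),
    "glm-4.5": ("0.60", "2.20", "0.11"),
    "glm-4.5-air": ("0.13", "0.85", "0.025"),
    "minimax-m2.5": ("0.30", "1.10", "0.03"),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: input_tokens counts only uncached input, and cached_tokens
    are billed at the Cache Read rate instead of the input rate.
    """
    input_price, output_price, cache_price = (Decimal(p) for p in PRICES[model])
    total = (
        input_tokens * input_price
        + output_tokens * output_price
        + cached_tokens * cache_price
    )
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 10K uncached input, 2K output, 50K cached tokens on glm-5.
    print(estimate_cost("glm-5", 10_000, 2_000, cached_tokens=50_000))
```

Under these assumptions, the example request costs about $0.0264: $0.0100 for input, $0.0064 for output, and $0.0100 for cache reads. `Decimal` is used rather than floats so that small per-request amounts add up without rounding drift.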