# Pricing
Simple per-token pricing. No minimums, no commitments. All prices in USD per 1M tokens.
| Model | Description | Context | Max Output | Input Price | Output Price | Cache Read |
|---|---|---|---|---|---|---|
| minimax-m2.5 | Efficient reasoning | 196.6K | 196.6K | $0.200 | $1.20 | $0.040 |
| glm-5 | Most capable reasoning model | 203K | 131K | $0.720 | $2.30 | $0.144 |
| glm-4.7 | Flagship reasoning model | 200K | 131K | $0.380 | $1.98 | $0.190 |
| glm-4.7-flash | Fast, cost-efficient variant | 200K | 131K | $0.060 | $0.400 | $0.010 |
| glm-4.5 | General-purpose model | 131K | 96K | $0.600 | $2.20 | $0.110 |
| glm-4.5-air | Lightweight, budget-friendly | 131K | 96K | $0.130 | $0.850 | $0.025 |
| kimi-k2.5 | MoE architecture | 262K | 262K | $0.405 | $1.98 | $0.225 |
| qwen3-coder-next | Fast code generation | 262K | 262K | $0.108 | $0.675 | $0.060 |
Pricing is subject to change. All models support text input/output, tool calling, JSON mode, and streaming. Reasoning is supported on glm-5, glm-4.7, and minimax-m2.5.
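As a rough sketch, per-request cost under the rates above is simply tokens divided by 1M times the per-1M price for each category. The assumption that cached tokens are billed at the cache-read rate *instead of* the full input rate is common but not stated here; check the billing documentation before relying on it.

```python
# Cost estimator sketch using the rates from the pricing table above.
# USD per 1M tokens: (input, output, cache read)
PRICES = {
    "minimax-m2.5": (0.200, 1.20, 0.040),
    "glm-5": (0.720, 2.30, 0.144),
    "glm-4.7": (0.380, 1.98, 0.190),
    "glm-4.7-flash": (0.060, 0.400, 0.010),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate request cost in USD.

    Assumes cached tokens are billed at the cache-read rate in place of
    the full input rate (an assumption, not confirmed by this page).
    """
    inp, out, cache = PRICES[model]
    fresh = input_tokens - cached_tokens  # input tokens not served from cache
    return (fresh * inp + cached_tokens * cache + output_tokens * out) / 1_000_000

# Example: 100K input tokens (50K cache hits) + 20K output on glm-4.7
print(f"${estimate_cost('glm-4.7', 100_000, 20_000, 50_000):.4f}")
```

At these rates, a 100K-input / 20K-output glm-4.7 request costs about eight cents without caching; cache hits on half the input shave roughly a cent off.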