Tool Calling
EmberCloud models can call tools you define — enabling your model to take actions, query APIs, or fetch real-time data. The API is fully compatible with the OpenAI tool calling format.
OpenAI SDK compatible: If you already use the OpenAI SDK for tool calling, just change `base_url` and `api_key` (`baseURL`/`apiKey` in the JS SDK). No other changes are required.
The flow has two steps:
- Send a request with a list of tools. The model decides when to call one.
- Execute the tool in your code, send the result back, and get the final reply.
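Laid out as messages, one complete round trip looks like the sketch below. All values are illustrative, not real API output:

```python
import json

# The conversation after one full tool-calling round trip (illustrative values):
conversation = [
    # Step 1: the user asks; the model replies with a tool call instead of text
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {"role": "assistant", "content": None,
     "tool_calls": [{"id": "call_abc123", "type": "function",
                     "function": {"name": "get_weather",
                                  "arguments": json.dumps({"city": "Tokyo"})}}]},
    # Step 2: your code runs get_weather and reports the result back
    {"role": "tool", "tool_call_id": "call_abc123",
     "content": json.dumps({"temperature": 18, "unit": "celsius"})},
    # The model then produces the final user-facing reply
    {"role": "assistant", "content": "It's 18°C in Tokyo."},
]

print([m["role"] for m in conversation])
# → ['user', 'assistant', 'tool', 'assistant']
```

Note that the `tool` message echoes the `id` from the assistant's `tool_calls` entry; that is how the API matches results to calls.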
1. Define Tools & Get a Tool Call
Pass a tools array to the request. Each tool has a JSON Schema parameters object so the model knows what arguments to generate.
```python
from openai import OpenAI
import json

client = OpenAI(
    base_url="https://api.embercloud.ai/v1",
    api_key="your-api-key",
)

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "City name, e.g. 'San Francisco'",
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit",
                    },
                },
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
)

# If the model wants to call a tool, finish_reason is "tool_calls"
choice = response.choices[0]
if choice.finish_reason == "tool_calls":
    tool_call = choice.message.tool_calls[0]
    print(f"Tool: {tool_call.function.name}")
    print(f"Args: {tool_call.function.arguments}")
    # → Tool: get_weather
    # → Args: {"city": "Tokyo", "unit": "celsius"}
```

Example response when the model wants to call a tool:
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"city\": \"Tokyo\", \"unit\": \"celsius\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 18,
    "total_tokens": 100
  }
}
```

2. Execute the Tool & Return the Result
When finish_reason is "tool_calls", run the function in your code and send the result back as a tool message. The model then generates the final user-facing reply.
```python
import json

# --- After receiving a tool_calls response ---

# 1. Extract the tool call
tool_call = response.choices[0].message.tool_calls[0]
args = json.loads(tool_call.function.arguments)

# 2. Execute your function
def get_weather(city, unit="celsius"):
    # Your real implementation here
    return {"temperature": 18, "unit": unit, "condition": "Partly cloudy"}

result = get_weather(**args)

# 3. Build the follow-up messages
messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"},
    response.choices[0].message,  # assistant's tool_calls message
    {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps(result),
    },
]

# 4. Send back to get the final answer
final = client.chat.completions.create(
    model="glm-4.7",
    messages=messages,
    tools=tools,
)
print(final.choices[0].message.content)
# → "The weather in Tokyo is 18°C and partly cloudy."
```

Parallel Tool Calls
The model may return multiple tool calls in a single response when it needs several pieces of information. Execute every call and send all of the results back together, one tool message per call.
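The loop below runs the calls one at a time; since independent tool calls are typically I/O-bound, they can also be dispatched concurrently. A minimal sketch using a thread pool, where the dict-shaped `tool_calls` entries stand in for the SDK objects and `get_weather` is a stub for your real implementation:

```python
import json
from concurrent.futures import ThreadPoolExecutor

def get_weather(city, unit="celsius"):
    # Stand-in for a real (possibly slow) lookup
    return {"city": city, "temperature": 18, "unit": unit}

# Hypothetical tool calls, shaped like the entries in message.tool_calls
tool_calls = [
    {"id": "call_1", "function": {"name": "get_weather",
                                  "arguments": json.dumps({"city": "Tokyo"})}},
    {"id": "call_2", "function": {"name": "get_weather",
                                  "arguments": json.dumps({"city": "London"})}},
]

def run(tc):
    args = json.loads(tc["function"]["arguments"])
    return {"role": "tool", "tool_call_id": tc["id"],
            "content": json.dumps(get_weather(**args))}

# Execute all calls concurrently; pool.map preserves the input order
with ThreadPoolExecutor() as pool:
    tool_results = list(pool.map(run, tool_calls))

print([r["tool_call_id"] for r in tool_results])
# → ['call_1', 'call_2']
```

The resulting `tool_results` list drops into the follow-up `messages` exactly as in the sequential version.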
```python
import json

# The model might call get_weather for multiple cities at once
response = client.chat.completions.create(
    model="glm-4.7",
    messages=[{"role": "user", "content": "Compare weather in Tokyo and London."}],
    tools=tools,
)
tool_calls = response.choices[0].message.tool_calls
# tool_calls may have 2 entries: one for Tokyo, one for London

# Run all calls and collect results
tool_results = []
for tc in tool_calls:
    args = json.loads(tc.function.arguments)
    result = get_weather(**args)  # your function
    tool_results.append({
        "role": "tool",
        "tool_call_id": tc.id,
        "content": json.dumps(result),
    })

messages = [
    {"role": "user", "content": "Compare weather in Tokyo and London."},
    response.choices[0].message,
    *tool_results,
]
final = client.chat.completions.create(
    model="glm-4.7",
    messages=messages,
    tools=tools,
)
print(final.choices[0].message.content)
```

Controlling Tool Use
Use tool_choice to control whether the model can, must, or cannot call tools.
| Value | Type | Description |
|---|---|---|
| `"auto"` | string (default) | The model decides whether to call a tool or respond directly. |
| `"required"` | string | The model must call at least one tool. Useful when you always want structured output. |
| `"none"` | string | The model will not call any tools. Use to force a plain-text response even when tools are defined. |
| `{"type": "function", "function": {"name": "..."}}` | object | Force the model to call the named function. |
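As a sketch, the four forms are plain values passed alongside `tools`; the variable names below are just local bindings, and the actual request (commented out) is identical to the earlier examples:

```python
# The four accepted forms of tool_choice:
choice_auto = "auto"          # model decides (default)
choice_required = "required"  # model must call at least one tool
choice_none = "none"          # model must answer in plain text
choice_forced = {"type": "function", "function": {"name": "get_weather"}}

# Passed like any other request parameter, e.g.:
# client.chat.completions.create(model="glm-4.7", messages=messages,
#                                tools=tools, tool_choice=choice_forced)
print(choice_forced["function"]["name"])
# → get_weather
```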
JSON / Structured Output
For simple structured data, use `response_format={"type": "json_object"}` instead of tool calling. The model is then constrained to return a valid JSON object.
```python
import json

response = client.chat.completions.create(
    model="glm-4.7",
    messages=[
        {
            "role": "system",
            "content": "You extract information and return it as JSON.",
        },
        {
            "role": "user",
            "content": "Extract: name, age, city from 'Alice is 30 and lives in Paris'",
        },
    ],
    response_format={"type": "json_object"},
)

data = json.loads(response.choices[0].message.content)
print(data)
# → {"name": "Alice", "age": 30, "city": "Paris"}
```
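Even in JSON mode it is worth guarding the parse (for example, output can be truncated by a low `max_tokens` limit). A defensive sketch; the helper and its fallback behaviour are an assumption, not part of the API:

```python
import json

def parse_json_reply(content):
    """Parse a model reply that should be a JSON object, failing loudly."""
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        # e.g. truncated output — retry the request or surface the error
        raise ValueError(f"Model did not return valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level")
    return data

# With a real response: parse_json_reply(response.choices[0].message.content)
print(parse_json_reply('{"name": "Alice", "age": 30, "city": "Paris"}'))
# → {'name': 'Alice', 'age': 30, 'city': 'Paris'}
```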