Tool use with Claude: the complete implementation guide
In brief
Defining tools, parsing responses, handling multi-turn tool calls, parallel tool use, and the failure modes that will bite you in production.
Tool use is how you give Claude the ability to take actions: call an API, query a database, run a function, search the web. The model decides when to use a tool and what arguments to pass. You execute the tool and return the result. Claude continues from there.
The concept is simple. The implementation has enough edge cases to warrant a full walkthrough.
Defining tools
Tools are described as JSON Schema objects. Claude uses the description and parameter schema to decide when and how to call them. The quality of these descriptions directly affects the quality of tool calls: vague descriptions produce vague calls.
```python
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a specific location. Returns temperature in Celsius, conditions (sunny/cloudy/rainy/etc), and humidity percentage.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name and country code, e.g. 'London, GB' or 'Tokyo, JP'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units. Defaults to celsius."
                }
            },
            "required": ["location"]
        }
    },
    {
        "name": "search_database",
        "description": "Search internal product database for items matching a query. Returns list of matching products with name, SKU, price, and stock status.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query. Supports partial matches."
                },
                "max_results": {
                    "type": "integer",
                    "description": "Maximum number of results to return. Default 10, max 50.",
                    "default": 10
                }
            },
            "required": ["query"]
        }
    }
]
```
Description writing rules that matter:
- Say what the tool returns, not just what it does. "Returns temperature in Celsius, conditions, and humidity" is more useful to the model than "Gets weather."
- Describe parameter formats explicitly. "City name and country code, e.g. 'London, GB'" prevents the model from passing "London" when you need "London, GB".
- Note defaults and limits. The model uses these to decide when an optional parameter can be omitted.
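These rules are mechanically checkable. A minimal lint sketch (the `lint_tool` helper and its heuristics are my own invention, not part of the Anthropic SDK):

```python
def lint_tool(tool: dict) -> list:
    """Flag common weaknesses in a tool definition. Heuristics only."""
    warnings = []
    desc = tool.get("description", "")
    # Rule 1: the description should say what the tool returns.
    if "return" not in desc.lower():
        warnings.append(f"{tool['name']}: description does not say what the tool returns")
    # Rule 2: every parameter should describe its expected format.
    props = tool.get("input_schema", {}).get("properties", {})
    for pname, pschema in props.items():
        if "description" not in pschema:
            warnings.append(f"{tool['name']}.{pname}: parameter has no description")
    return warnings

vague = {
    "name": "get_weather",
    "description": "Gets weather.",
    "input_schema": {"type": "object", "properties": {"location": {"type": "string"}}},
}
print(lint_tool(vague))  # flags both the description and the bare parameter
```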
Making a request with tools
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather like in Tokyo right now?"}
    ]
)

print(response.stop_reason)  # "tool_use" or "end_turn"
print(response.content)      # list of content blocks
```
When Claude wants to use a tool, stop_reason is "tool_use" and content contains one or more tool_use blocks alongside any text:
```python
# Inspect the response
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"ID: {block.id}")
        print(f"Args: {block.input}")
    elif block.type == "text":
        print(f"Text: {block.text}")
```
Executing the tool and continuing
This is where most implementations get the message structure wrong. After a tool call, you must:
- Append Claude's full response (including the tool_use block) to messages as an assistant turn
- Append a user turn containing a tool_result block with the result
```python
import json

def execute_tool(name: str, args: dict) -> str:
    """Your actual tool execution logic."""
    if name == "get_weather":
        # Call your weather API
        return json.dumps({"temp": 18, "conditions": "partly cloudy", "humidity": 72})
    elif name == "search_database":
        # Query your DB
        return json.dumps({"results": [{"name": "Widget A", "sku": "W001", "price": 29.99, "in_stock": True}]})
    return json.dumps({"error": "Unknown tool"})

def run_with_tools(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )

        # Append assistant response to history
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "end_turn":
            # Extract final text response
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return ""

        if response.stop_reason == "tool_use":
            # Execute all tool calls in this response
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
            # Append tool results as user turn
            messages.append({"role": "user", "content": tool_results})
            # Loop — Claude will continue from here
        else:
            # stop_reason is "max_tokens" or something unexpected
            break

    return ""
```
Parallel tool use
Claude can call multiple tools in a single response when the calls are independent. Your loop already handles this — process all tool_use blocks in a response before continuing.
```python
# Claude might return this in a single response:
#   - tool_use: get_weather(location="Tokyo, JP")
#   - tool_use: get_weather(location="London, GB")
#   - tool_use: search_database(query="umbrellas")
# Execute all three, return all three results, then continue
```
Parallel tool calls only reduce latency if you actually execute them concurrently. If your loop runs each tool one at a time, the parallelism buys you nothing; use asyncio or threading if latency matters.
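A sketch of running all tool_use blocks from one response concurrently with asyncio.to_thread. The `execute_all` helper and the plain-dict blocks are illustrative (real SDK content blocks use attribute access like `block.id`), and `execute_tool` here is a stand-in for the synchronous executor shown earlier:

```python
import asyncio
import json

def execute_tool(name: str, args: dict) -> str:
    """Stand-in for the synchronous tool executor defined earlier."""
    if name == "get_weather":
        return json.dumps({"temp": 18, "location": args["location"]})
    return json.dumps({"error": "Unknown tool"})

async def execute_all(tool_blocks: list) -> list:
    # Run each blocking tool call in a worker thread, all at once.
    results = await asyncio.gather(
        *(asyncio.to_thread(execute_tool, b["name"], b["input"]) for b in tool_blocks)
    )
    # Pair every result with its originating tool_use id.
    return [
        {"type": "tool_result", "tool_use_id": b["id"], "content": r}
        for b, r in zip(tool_blocks, results)
    ]

blocks = [
    {"id": "t1", "name": "get_weather", "input": {"location": "Tokyo, JP"}},
    {"id": "t2", "name": "get_weather", "input": {"location": "London, GB"}},
]
tool_results = asyncio.run(execute_all(blocks))
```

asyncio.gather preserves input order, so zipping results back onto their blocks keeps every tool_result matched to the right tool_use_id.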
Forcing or preventing tool use
tool_choice controls whether Claude must use a tool, can choose, or cannot use any:
```python
# Force Claude to call a specific tool
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "search_database"},
    messages=[...]
)

# Require some tool call, but let Claude pick which
tool_choice={"type": "any"}

# Let Claude decide (default)
tool_choice={"type": "auto"}

# Prevent any tool use — Claude answers from context only
tool_choice={"type": "none"}
```
Force-calling a tool is useful when you want structured extraction rather than a conversational response. Ask Claude to call a tool whose schema matches the shape of data you want back — cleaner than asking for JSON and parsing it.
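A sketch of that extraction pattern. The `extract_contact` tool, the `first_tool_input` helper, and the fake response content are hypothetical; the point is that with a forced tool call the structured data arrives as the tool_use block's `input`, already schema-shaped, with no JSON parsing of free text:

```python
# Hypothetical extraction tool: its schema defines the output shape you want back.
extraction_tool = {
    "name": "extract_contact",
    "description": "Record a contact's details extracted from the given text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string", "description": "Full name of the contact"},
            "email": {"type": "string", "description": "Email address, if present"},
        },
        "required": ["name"],
    },
}

def first_tool_input(content_blocks: list) -> dict:
    """Pull the structured payload out of the first tool_use block."""
    for block in content_blocks:
        if block.get("type") == "tool_use":
            return block["input"]
    raise ValueError("no tool_use block in response")

# Stand-in for response.content after a call forced with
# tool_choice={"type": "tool", "name": "extract_contact"}.
fake_content = [{
    "type": "tool_use", "id": "t1", "name": "extract_contact",
    "input": {"name": "Ada Lovelace", "email": "ada@example.com"},
}]
data = first_tool_input(fake_content)
```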
Error handling
Tools fail. Return errors in the tool_result content so Claude can respond appropriately rather than hallucinating a result:
```python
try:
    result = call_external_api(args)
    tool_result_content = json.dumps(result)
    is_error = False
except Exception as e:
    tool_result_content = f"Error: {str(e)}"
    is_error = True

tool_results.append({
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": tool_result_content,
    "is_error": is_error,  # tells Claude the tool failed
})
```
With is_error: true, Claude knows the call failed and can tell the user, retry with different parameters, or try a different approach — rather than continuing as if it got a valid result.
The failure modes
Infinite tool loops. If Claude calls a tool, gets a result, calls the same tool again, and so on — you have a loop. Cap iterations:
```python
MAX_ITERATIONS = 10

iteration = 0
while iteration < MAX_ITERATIONS:
    # ... your loop
    iteration += 1
```
Tool descriptions that are too vague. If Claude calls the wrong tool or passes wrong arguments, the description is almost always the cause. Read the tool call in your logs and ask: given this description, would a developer know to call it this way? If not, fix the description.
Not passing the full content array. When you append the assistant turn after a tool call, you must pass response.content (the full list), not just the text blocks. Omitting the tool_use blocks causes an API error on the next request.
Assuming tool calls are atomic. Claude may need several rounds of tool calls before producing a final answer. Your loop must keep going until stop_reason is end_turn; do not return after handling the first batch of tool results.
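The message-structure mistakes above can be caught before sending the next request. A debugging sketch (the `unanswered_tool_calls` helper is hypothetical, and it assumes a plain-dict message history; real SDK content blocks use attribute access like `block.type` instead):

```python
def unanswered_tool_calls(messages: list) -> list:
    """Return tool_use ids with no matching tool_result in the next user turn."""
    missing = []
    for i, msg in enumerate(messages):
        if msg["role"] != "assistant" or not isinstance(msg["content"], list):
            continue
        # Ids of every tool call Claude made in this assistant turn.
        ids = {b["id"] for b in msg["content"]
               if isinstance(b, dict) and b.get("type") == "tool_use"}
        if not ids:
            continue
        # Ids answered by tool_result blocks in the following user turn.
        answered = set()
        nxt = messages[i + 1] if i + 1 < len(messages) else None
        if nxt and nxt["role"] == "user" and isinstance(nxt["content"], list):
            answered = {b.get("tool_use_id") for b in nxt["content"]
                        if isinstance(b, dict) and b.get("type") == "tool_result"}
        missing.extend(sorted(ids - answered))
    return missing

history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "t1", "name": "get_weather", "input": {}}
    ]},
]
print(unanswered_tool_calls(history))  # ["t1"] — the call has no tool_result yet
```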
Further reading
- Tool use documentation — the full API reference for Claude tool use
- Claude can now use tools — the original tool use GA announcement
- Writing effective tools for agents — with agents — Anthropic's engineering guide to tool design