Tool use with Claude: the complete implementation guide
In brief
Defining tools, parsing responses, handling multi-turn tool calls, parallel tool use, and the failure modes that will bite you in production.
Tool use is how you give Claude the ability to take actions: call an API, query a database, run a function, search the web. The model decides when to use a tool and what arguments to pass. You execute the tool and return the result. Claude continues from there.
The concept is simple. The implementation has enough edge cases to warrant a full walkthrough.
Defining tools
Tools are described as JSON Schema objects. Claude uses the description and parameter schema to decide when and how to call them. The quality of these descriptions directly affects the quality of tool calls: vague descriptions produce vague calls.
```python
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a specific location. Returns temperature in Celsius, conditions (sunny/cloudy/rainy/etc), and humidity percentage.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name and country code, e.g. 'London, GB' or 'Tokyo, JP'"
                },
                "units": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature units. Defaults to celsius."
                }
            },
            "required": ["location"]
        }
    },
    {
        "name": "search_database",
        "description": "Search internal product database for items matching a query. Returns list of matching products with name, SKU, price, and stock status.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "Search query. Supports partial matches."
                },
                "max_results": {
                    "type": "integer",
                    "description": "Maximum number of results to return. Default 10, max 50.",
                    "default": 10
                }
            },
            "required": ["query"]
        }
    }
]
```
Description writing rules that matter:
- Say what the tool returns, not just what it does. "Returns temperature in Celsius, conditions, and humidity" is more useful to the model than "Gets weather."
- Describe parameter formats explicitly. "City name and country code, e.g. 'London, GB'" prevents the model from passing "London" when you need "London, GB".
- Note defaults and limits. The model uses these to decide when an optional parameter can be omitted.
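These rules are mechanically checkable. A minimal lint sketch (the `lint_tool` helper and its heuristics are my own invention, not part of the Anthropic SDK):

```python
def lint_tool(tool: dict) -> list:
    """Flag common weaknesses in a tool definition. Heuristics only."""
    warnings = []
    desc = tool.get("description", "")
    # Rule 1: the description should say what the tool returns.
    if "return" not in desc.lower():
        warnings.append(f"{tool['name']}: description does not say what the tool returns")
    # Rule 2: every parameter should describe its expected format.
    props = tool.get("input_schema", {}).get("properties", {})
    for pname, pschema in props.items():
        if "description" not in pschema:
            warnings.append(f"{tool['name']}.{pname}: parameter has no description")
    return warnings

vague = {
    "name": "get_weather",
    "description": "Gets weather.",
    "input_schema": {"type": "object", "properties": {"location": {"type": "string"}}},
}
print(lint_tool(vague))  # flags both the description and the bare parameter
```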
Making a request with tools
```python
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "What's the weather like in Tokyo right now?"}
    ]
)

print(response.stop_reason)  # "tool_use" or "end_turn"
print(response.content)      # list of content blocks
```
When Claude wants to use a tool, stop_reason is "tool_use" and content contains one or more tool_use blocks alongside any text:
```python
# Inspect the response
for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"ID: {block.id}")
        print(f"Args: {block.input}")
    elif block.type == "text":
        print(f"Text: {block.text}")
```
Executing the tool and continuing
This is where most implementations get the message structure wrong. After a tool call, you must:
- Append Claude's full response (including the tool_use block) to messages as an assistant turn
- Append a user turn containing a tool_result block with the result
```python
import json

def execute_tool(name: str, args: dict) -> str:
    """Your actual tool execution logic."""
    if name == "get_weather":
        # Call your weather API
        return json.dumps({"temp": 18, "conditions": "partly cloudy", "humidity": 72})
    elif name == "search_database":
        # Query your DB
        return json.dumps({"results": [{"name": "Widget A", "sku": "W001", "price": 29.99, "in_stock": True}]})
    return json.dumps({"error": "Unknown tool"})

def run_with_tools(user_message: str) -> str:
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )

        # Append assistant response to history
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "end_turn":
            # Extract final text response
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return ""

        if response.stop_reason == "tool_use":
            # Execute all tool calls in this response
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
            # Append tool results as user turn
            messages.append({"role": "user", "content": tool_results})
            # Loop — Claude will continue from here
        else:
            # stop_reason is "max_tokens" or something unexpected
            break

    return ""
```
Parallel tool use
Claude can call multiple tools in a single response when the calls are independent. Your loop already handles this — process all tool_use blocks in a response before continuing.
```python
# Claude might return this in a single response:
#   - tool_use: get_weather(location="Tokyo, JP")
#   - tool_use: get_weather(location="London, GB")
#   - tool_use: search_database(query="umbrellas")
# Execute all three, return all three results, then continue
```
Parallel tool calls only reduce latency if you actually execute them concurrently. If your loop runs each tool one at a time, the parallelism buys you nothing; use asyncio or threading if latency matters.
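A sketch of running all tool_use blocks from one response concurrently with asyncio.to_thread. The `execute_all` helper and the plain-dict blocks are illustrative (real SDK content blocks use attribute access like `block.id`), and `execute_tool` here is a stand-in for the synchronous executor shown earlier:

```python
import asyncio
import json

def execute_tool(name: str, args: dict) -> str:
    """Stand-in for the synchronous tool executor defined earlier."""
    if name == "get_weather":
        return json.dumps({"temp": 18, "location": args["location"]})
    return json.dumps({"error": "Unknown tool"})

async def execute_all(tool_blocks: list) -> list:
    # Run each blocking tool call in a worker thread, all at once.
    results = await asyncio.gather(
        *(asyncio.to_thread(execute_tool, b["name"], b["input"]) for b in tool_blocks)
    )
    # Pair every result with its originating tool_use id.
    return [
        {"type": "tool_result", "tool_use_id": b["id"], "content": r}
        for b, r in zip(tool_blocks, results)
    ]

blocks = [
    {"id": "t1", "name": "get_weather", "input": {"location": "Tokyo, JP"}},
    {"id": "t2", "name": "get_weather", "input": {"location": "London, GB"}},
]
tool_results = asyncio.run(execute_all(blocks))
```

asyncio.gather preserves input order, so zipping results back onto their blocks keeps every tool_result matched to the right tool_use_id.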
Forcing or preventing tool use
tool_choice controls whether Claude must use a tool, can choose, or cannot use any:
```python
# Force Claude to call a specific tool
response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "search_database"},
    messages=[...]
)

# Require some tool call, but let Claude pick which
tool_choice={"type": "any"}

# Let Claude decide (default)
tool_choice={"type": "auto"}

# Prevent any tool use — Claude answers from context only
tool_choice={"type": "none"}
```
Force-calling a tool is useful when you want structured extraction rather than a conversational response. Ask Claude to call a tool whose schema matches the shape of data you want back — cleaner than asking for JSON and parsing it.
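A sketch of that extraction pattern. The `extract_contact` tool, the `first_tool_input` helper, and the fake response content are hypothetical; the point is that with a forced tool call the structured data arrives as the tool_use block's `input`, already schema-shaped, with no JSON parsing of free text:

```python
# Hypothetical extraction tool: its schema defines the output shape you want back.
extraction_tool = {
    "name": "extract_contact",
    "description": "Record a contact's details extracted from the given text.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string", "description": "Full name of the contact"},
            "email": {"type": "string", "description": "Email address, if present"},
        },
        "required": ["name"],
    },
}

def first_tool_input(content_blocks: list) -> dict:
    """Pull the structured payload out of the first tool_use block."""
    for block in content_blocks:
        if block.get("type") == "tool_use":
            return block["input"]
    raise ValueError("no tool_use block in response")

# Stand-in for response.content after a call forced with
# tool_choice={"type": "tool", "name": "extract_contact"}.
fake_content = [{
    "type": "tool_use", "id": "t1", "name": "extract_contact",
    "input": {"name": "Ada Lovelace", "email": "ada@example.com"},
}]
data = first_tool_input(fake_content)
```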
Error handling
Tools fail. Return errors in the tool_result content so Claude can respond appropriately rather than hallucinating a result:
```python
try:
    result = call_external_api(args)
    tool_result_content = json.dumps(result)
    is_error = False
except Exception as e:
    tool_result_content = f"Error: {str(e)}"
    is_error = True

tool_results.append({
    "type": "tool_result",
    "tool_use_id": block.id,
    "content": tool_result_content,
    "is_error": is_error,  # tells Claude the tool failed
})
```
With is_error: true, Claude knows the call failed and can tell the user, retry with different parameters, or try a different approach — rather than continuing as if it got a valid result.
The failure modes
Infinite tool loops. If Claude calls a tool, gets a result, calls the same tool again, and so on — you have a loop. Cap iterations:
```python
MAX_ITERATIONS = 10

iteration = 0
while iteration < MAX_ITERATIONS:
    # ... your loop
    iteration += 1
```
Tool descriptions that are too vague. If Claude calls the wrong tool or passes wrong arguments, the description is almost always the cause. Read the tool call in your logs and ask: given this description, would a developer know to call it this way? If not, fix the description.
Not passing the full content array. When you append the assistant turn after a tool call, you must pass response.content (the full list), not just the text blocks. Omitting the tool_use blocks causes an API error on the next request.
Assuming tool calls are atomic. Claude may need several rounds of tool calls before producing a final answer. Your loop must keep going until stop_reason is end_turn; do not return after handling the first batch of tool results.
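The message-structure mistakes above can be caught before sending the next request. A debugging sketch (the `unanswered_tool_calls` helper is hypothetical, and it assumes a plain-dict message history; real SDK content blocks use attribute access like `block.type` instead):

```python
def unanswered_tool_calls(messages: list) -> list:
    """Return tool_use ids with no matching tool_result in the next user turn."""
    missing = []
    for i, msg in enumerate(messages):
        if msg["role"] != "assistant" or not isinstance(msg["content"], list):
            continue
        # Ids of every tool call Claude made in this assistant turn.
        ids = {b["id"] for b in msg["content"]
               if isinstance(b, dict) and b.get("type") == "tool_use"}
        if not ids:
            continue
        # Ids answered by tool_result blocks in the following user turn.
        answered = set()
        nxt = messages[i + 1] if i + 1 < len(messages) else None
        if nxt and nxt["role"] == "user" and isinstance(nxt["content"], list):
            answered = {b.get("tool_use_id") for b in nxt["content"]
                        if isinstance(b, dict) and b.get("type") == "tool_result"}
        missing.extend(sorted(ids - answered))
    return missing

history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "t1", "name": "get_weather", "input": {}}
    ]},
]
print(unanswered_tool_calls(history))  # ["t1"] — the call has no tool_result yet
```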
Further reading
- Tool use documentation — the full API reference for Claude tool use
- Claude can now use tools — the original tool use GA announcement
- Writing effective tools for agents — with agents — Anthropic's engineering guide to tool design