Module 8 — Agentic AI

Tools & function calling

Full tool lifecycle, JSON schemas, validation and auth, error formatting, parallel vs sequential calls, and a minimal Python agent loop.

~95 min read + exercises

Before we begin

An LLM outputs text. By itself it cannot call OpenWeatherMap or query Postgres. Tool calling (also called function calling) is the bridge: the model emits a structured request (“run get_weather with city=Paris”), and your application executes it and returns the result to the model.

The model proposes. Your code disposes. Never let the model hold API keys or run SQL without validation.

Figure: Tool call lifecycle

LLM (tool JSON) → App (validate) → API (weather) → LLM (answer)
Model proposes → app validates & runs → result back to model → repeat or answer.

What you will learn

  • Trace the full tool-calling lifecycle from registration to final answer.
  • Write JSON schemas models can use reliably.
  • Implement validation, auth, timeouts, and error messages in application code.
  • Format tool results so the model continues correctly.
  • Compare parallel vs sequential tool calls and provider differences.


The lifecycle (step by step)

Step 1 — Register tools

You send the model a list of tools, each with:

  • name — stable identifier (get_weather, not weatherTool2)
  • description — when to use it (models read this heavily)
  • parameters — JSON Schema for arguments

The model does not execute anything yet — it only knows what could be called.
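
As a rough sketch of registration, assuming the OpenAI Chat Completions format, each tool definition is wrapped in a {"type": "function", "function": {...}} envelope before it is sent. The helper name to_openai_tool is hypothetical, and other providers use different envelopes.

```python
def to_openai_tool(name: str, description: str, parameters: dict) -> dict:
    """Wrap one tool definition in the envelope Chat Completions expects."""
    return {
        "type": "function",
        "function": {
            "name": name,                # stable identifier
            "description": description,  # when to use the tool
            "parameters": parameters,    # JSON Schema for the arguments
        },
    }

# The list you would pass as tools=... on each request.
tools = [
    to_openai_tool(
        "get_weather",
        "Get current weather for a city.",
        {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    )
]
```

Nothing runs at this point; the list only tells the model what it may request.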

Step 2 — User message arrives

Example: “What should I pack for Paris tomorrow?”

Step 3 — Model returns a tool call (not final text)

Instead of answering directly, the API response includes something like:

json
{
  "name": "get_weather",
  "arguments": "{\"city\": \"Paris\", \"units\": \"metric\"}"
}

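Providers wrap this differently: OpenAI returns a tool_calls list whose arguments field is a JSON-encoded string (as above), while Anthropic returns a tool_use content block whose input field is already a JSON object. The idea is identical.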
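
One way to keep the rest of your application provider-agnostic is to normalize each call into a single shape as soon as it arrives. The sketch below works on plain dicts that mirror the two wire formats. The function name normalize_tool_call and the sample ids are made up for illustration.

```python
import json

def normalize_tool_call(provider: str, raw: dict) -> dict:
    """Return {"id", "name", "args"} regardless of provider wrapping."""
    if provider == "openai":
        # OpenAI: arguments arrive as a JSON-encoded string.
        fn = raw["function"]
        return {"id": raw["id"], "name": fn["name"], "args": json.loads(fn["arguments"])}
    if provider == "anthropic":
        # Anthropic: a tool_use content block whose input is already a dict.
        return {"id": raw["id"], "name": raw["name"], "args": raw["input"]}
    raise ValueError(f"unknown provider: {provider}")

# Sample payloads with invented ids.
openai_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
}
anthropic_call = {
    "type": "tool_use",
    "id": "toolu_1",
    "name": "get_weather",
    "input": {"city": "Paris"},
}
```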

Step 4 — Your app validates and executes

Never pass arguments straight to SQL or shell. Validate first:

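python
import re

def run_get_weather(args: dict) -> dict:
    """Validate model-supplied arguments, then call the weather API."""
    city = args.get("city")
    if not isinstance(city, str):
        raise ValueError("city must be a string")
    city = city.strip()
    if not city or len(city) > 80:
        raise ValueError("invalid city")
    # Allow word characters, whitespace, hyphens, apostrophes, and periods only.
    if not re.match(r"^[\w\s\-'.]+$", city):
        raise ValueError("city contains disallowed characters")
    units = args.get("units", "metric")
    if units not in ("metric", "imperial"):
        units = "metric"  # fall back to the default rather than fail
    # fetch_weather_api is your own HTTP wrapper; it reads the API key
    # from os.environ, never from the prompt.
    return fetch_weather_api(city, units)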

Step 5 — Append tool result to conversation

You add a message the API understands as “tool output”:

text
Tool get_weather → {"temp_c": 18, "condition": "cloudy", "precip_mm": 2}
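
The exact shape of that message depends on the provider. As a sketch, Chat Completions expects a message with role "tool" tied to the call id, while the Anthropic Messages API expects a user turn containing a tool_result block. The two builder names below are hypothetical.

```python
import json

def openai_tool_message(tool_call_id: str, result: dict) -> dict:
    """Chat Completions: a message with role "tool" tied to the call id."""
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps(result),
    }

def anthropic_tool_message(tool_use_id: str, result: dict) -> dict:
    """Anthropic Messages: a user turn holding a tool_result content block."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_id,
                "content": json.dumps(result),
            }
        ],
    }
```

In both cases the id links the result back to the specific call the model made, which matters once several calls are in flight.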

Step 6 — Model continues

It may call another tool (get_forecast) or produce the final packing advice using observed temperatures.


Example schema (weather)

json
{
  "name": "get_weather",
  "description": "Get current weather for a city. Use when user asks about temperature, rain, or what to wear.",
  "parameters": {
    "type": "object",
    "properties": {
      "city": {
        "type": "string",
        "description": "City name in English, e.g. Paris, Tokyo"
      },
      "units": {
        "type": "string",
        "enum": ["metric", "imperial"],
        "description": "Temperature units"
      }
    },
    "required": ["city"]
  }
}
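
The same schema can drive a first pass of server-side checks. The sketch below hand-rolls only the parts this schema uses (required, string type, enum) and rejects unknown fields, which is a design choice rather than something JSON Schema mandates. The name validate_args is hypothetical; a full validator such as the jsonschema package covers the rest of the specification.

```python
def validate_args(schema: dict, args: dict) -> list[str]:
    """Check args against a small subset of JSON Schema; return error strings."""
    errors = []
    props = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    for key, value in args.items():
        if key not in props:
            errors.append(f"unknown field: {key}")
            continue
        spec = props[key]
        if spec.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{key} must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key} must be one of {spec['enum']}")
    return errors

# The "parameters" object from the weather schema.
WEATHER_PARAMS = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "units": {"type": "string", "enum": ["metric", "imperial"]},
    },
    "required": ["city"],
}
```

An empty list means the arguments passed; anything else can be returned to the model as an error string.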

Description tips:

  • Say when to use the tool (“Use when user asks about…”).
  • Say when not to (“Do not use for historical climate essays”).
  • Match parameter names to what your code expects.

Poor descriptions cause wrong tool selection — the most common agent bug after bad prompts.


A second tool — geocoding

Agents often chain tools. Add:

json
{
  "name": "geocode_city",
  "description": "Convert city name to latitude/longitude. Use before map or distance tools if coords unknown.",
  "parameters": {
    "type": "object",
    "properties": {
      "city": { "type": "string" }
    },
    "required": ["city"]
  }
}

Chain example: user says “distance from Eiffel Tower to Louvre” → geocode × 2 → get_route → answer.
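
To make the dependency in that chain visible, here is an offline sketch. The geocoder is a stand-in that reads a small table of approximate, hardcoded coordinates, and a straight-line haversine distance stands in for get_route. None of these functions is a real API.

```python
import math

# Approximate coordinates, hardcoded so the sketch runs offline.
_PLACES = {
    "Eiffel Tower": (48.8584, 2.2945),
    "Louvre": (48.8606, 2.3376),
}

def geocode(place: str) -> tuple[float, float]:
    """Stand-in geocoder: look the place up in a fixed table."""
    if place not in _PLACES:
        raise ValueError(f"unknown place: {place}")
    return _PLACES[place]

def straight_line_km(a, b) -> float:
    """Haversine great-circle distance between two (lat, lon) points in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def distance_between(origin: str, destination: str) -> float:
    # Steps 1 and 2: geocode both places (independent of each other).
    start, end = geocode(origin), geocode(destination)
    # Step 3: the distance step needs both results, so it must run last.
    return straight_line_km(start, end)
```

The two geocode calls do not depend on each other, but the final step cannot start until both have returned.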


Your responsibilities (not the model’s)

Responsibility | Why
Validate arguments | Models typo city names, invent enum values
Auth & secrets | Keys in server env / vault only
Timeouts | httpx.get(..., timeout=10); hung tools block the whole agent
Retries with backoff | 429/503 from upstream: retry 2×, then return error to model
Sanitize outputs | Strip HTML, truncate huge JSON before re-prompting
Confirm destructive actions | delete_account, charge_card → human or explicit UI confirm
Idempotent tools | Same action twice must not duplicate, e.g. create_ticket with an idempotency key (a unique ID so retries are safe)

The model is a planner, not a trusted runtime.
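
The retry rule can be sketched as a small wrapper. UpstreamError is a hypothetical exception your HTTP layer would raise with the upstream status code, and the delay values are illustrative. The sleep function is injectable so the behavior can be tested without waiting.

```python
import time

class UpstreamError(Exception):
    """Hypothetical error carrying the HTTP status of a failed upstream call."""
    def __init__(self, status: int):
        super().__init__(f"upstream returned HTTP {status}")
        self.status = status

RETRYABLE = {429, 503}

def call_with_retries(fn, retries: int = 2, base_delay: float = 0.5, sleep=time.sleep):
    """Call fn(); on a retryable status, wait base_delay * 2**attempt and retry."""
    for attempt in range(retries + 1):
        try:
            return fn()
        except UpstreamError as e:
            if e.status not in RETRYABLE or attempt == retries:
                raise  # give up: the caller turns this into an error for the model
            sleep(base_delay * 2 ** attempt)
```

Non-retryable statuses such as 404 are raised immediately, since repeating the same request would not change the outcome.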


Formatting tool results for the model

Do:

text
get_weather(city="Paris") succeeded:
{"temp_c": 18, "condition": "cloudy", "precip_mm": 2}

Do not dump raw HTTP headers, 50 KB JSON, or stack traces into context.

On error:

text
get_weather(city="Pariss") failed: HTTP 404 — city not found. Try corrected spelling.

Clear errors help the model self-correct on the next iteration.
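
A small formatter keeps both shapes consistent across tools. This is a sketch: the function name format_tool_result and the 2000-character cap are assumptions to adapt to your own context budget.

```python
import json
from typing import Optional

MAX_RESULT_CHARS = 2000  # illustrative cap, tune to your context budget

def format_tool_result(name: str, args: dict, result=None,
                       error: Optional[str] = None) -> str:
    """Render one tool outcome as a short, model-readable string."""
    arg_text = ", ".join(f"{k}={json.dumps(v)}" for k, v in args.items())
    call = f"{name}({arg_text})"
    if error is not None:
        return f"{call} failed: {error}"
    body = json.dumps(result)
    if len(body) > MAX_RESULT_CHARS:
        # Cut oversized payloads and say so, so the model knows data is missing.
        body = body[:MAX_RESULT_CHARS] + "... [truncated]"
    return f"{call} succeeded:\n{body}"
```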


Minimal Python loop (conceptual)

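python
import json

MAX_STEPS = 8  # example cap so a confused model cannot loop forever

def run_agent(client, user_query: str) -> str:
    """Run the tool loop until the model answers or MAX_STEPS is hit.

    client is an OpenAI SDK client. TOOL_REGISTRY maps tool names to
    Python callables; WEATHER_SCHEMA and GEOCODE_SCHEMA are tool
    definitions in the provider's format.
    """
    messages = [{"role": "user", "content": user_query}]
    tools = [WEATHER_SCHEMA, GEOCODE_SCHEMA]

    for step in range(MAX_STEPS):
        response = client.chat.completions.create(
            model="gpt-4.1-mini",
            messages=messages,
            tools=tools,
        )
        msg = response.choices[0].message
        if not msg.tool_calls:
            return msg.content  # final answer

        messages.append(msg)  # keep the assistant turn that requested the tools
        for call in msg.tool_calls:
            name = call.function.name
            try:
                # Parsing sits inside the try so malformed JSON or an unknown
                # tool name is reported to the model instead of crashing.
                args = json.loads(call.function.arguments)
                result = TOOL_REGISTRY[name](args)
                status = "succeeded"
            except Exception as e:
                result = {"error": str(e)}
                status = "failed"
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": f"{name} {status}: {json.dumps(result)}",
            })
    return "Max steps reached. Please try a simpler question."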

This is the control loop from Lesson 1, made concrete.
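
The loop relies on a registry that maps tool names to Python callables. A minimal sketch of one, with stub tools and a hypothetical dispatch helper that turns unknown names and malformed arguments into error results instead of exceptions:

```python
import json

def run_get_weather(args: dict) -> dict:
    # Stub standing in for the validated weather tool; values are placeholders.
    return {"temp_c": 18, "condition": "cloudy"}

def run_geocode_city(args: dict) -> dict:
    # Stub returning approximate coordinates for Paris.
    return {"lat": 48.86, "lon": 2.35}

TOOL_REGISTRY = {
    "get_weather": run_get_weather,
    "geocode_city": run_geocode_city,
}

def dispatch(name: str, raw_arguments: str) -> dict:
    """Look up and run a tool; unknown names and bad JSON become error dicts."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}. Available: {sorted(TOOL_REGISTRY)}"}
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return {"error": "arguments were not valid JSON"}
    return handler(args)
```

Listing the available names in the error gives the model something concrete to correct toward on its next step.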


Parallel vs sequential tool calls

Some APIs let the model request multiple tools in one turn:

  • get_weather(Paris) and get_weather(Lyon) in parallel — good for comparison queries.
  • Your app can run independent calls concurrently with asyncio.gather to save latency.

Dependencies must stay sequential: geocode before route if route needs coordinates.
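
A minimal sketch of the parallel case, using a stub async tool in place of a real HTTP call. The short sleep simulates network latency and the returned temperature is a placeholder.

```python
import asyncio

async def get_weather(city: str) -> dict:
    """Stub async tool; the sleep stands in for network latency."""
    await asyncio.sleep(0.05)
    return {"city": city, "temp_c": 18}  # placeholder value

async def run_parallel(cities: list[str]) -> list[dict]:
    # Independent calls: start them together and wait for all of them.
    # gather returns results in the same order as its arguments.
    return await asyncio.gather(*(get_weather(c) for c in cities))

results = asyncio.run(run_parallel(["Paris", "Lyon"]))
```

Because gather preserves argument order, each result can be matched back to the tool call id that requested it.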


Tool design checklist

Before shipping a tool:

  • Description explains when and when not
  • Required fields marked in schema
  • Server-side validation for every parameter
  • Timeout and rate limit on external API
  • Errors return actionable strings
  • Logged with trace_id (unique request ID for debugging), user_id, latency
  • Destructive ops behind confirmation
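
The logging item could be sketched as a thin wrapper around every tool call. Here the log is a plain list for simplicity and run_logged is a hypothetical name; in practice the record would go to your logging or tracing system.

```python
import time
import uuid

def run_logged(name: str, fn, args: dict, user_id: str, log: list) -> dict:
    """Run a tool and append one structured log record, success or failure."""
    trace_id = uuid.uuid4().hex  # unique id for this call
    start = time.perf_counter()
    status = "ok"
    try:
        return fn(args)
    except Exception:
        status = "error"
        raise
    finally:
        # Runs on both paths, so failed calls are logged too.
        log.append({
            "trace_id": trace_id,
            "user_id": user_id,
            "tool": name,
            "status": status,
            "latency_ms": round((time.perf_counter() - start) * 1000, 2),
        })
```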

Common mistakes

Mistake | Consequence
20 tools in one agent | Wrong tool picked; split agents (Lesson 6)
Vague description | Model calls calculator for weather
Returning secrets in tool output | Leak into logs and next prompts
No max steps | Infinite loop on confused model
Trusting model-generated SQL | Injection; use parameterized queries only
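
For the SQL row, a short illustration with the standard-library sqlite3 module and an in-memory table. The table contents and the search_places implementation are invented for the example. The point is that the SQL text is fixed and the model-supplied value is only ever bound as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?)",
    [("Louvre", "Paris"), ("Eiffel Tower", "Paris"), ("Colosseum", "Rome")],
)

def search_places(city: str) -> list[str]:
    """The SQL text is fixed; the model-supplied value is bound as a parameter."""
    rows = conn.execute(
        "SELECT name FROM places WHERE city = ? ORDER BY name", (city,)
    ).fetchall()
    return [name for (name,) in rows]
```

An injection attempt such as Paris' OR '1'='1 is treated as a literal city name and matches no rows.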

Connect to the travel project

Your project registers tools like get_weather, geocode, search_places. Each follows this lesson’s lifecycle. The executor agent (Lesson 3) focuses on getting arguments right; this lesson is the mechanics underneath.


What's next

Lesson 3 — Planning vs execution — ReAct loops and splitting planner vs executor roles.