What is an AI agent?

Before we begin

Module 7 taught you to build a RAG chatbot: one user question → retrieve docs → one LLM answer. That pattern is powerful — but many real tasks are not one shot.

“Plan a 4-day trip to Rome with kid-friendly activities under $2,000.”

A chatbot might guess an itinerary from training data. An agent will look up weather, check flight APIs, search events, notice rain on day 2, and revise the plan — across many LLM calls and tool executions.

An AI agent = an LLM inside a control loop that can act on the world (via tools), observe results, and continue until a goal is met or limits are hit.

Figure

The agent loop

Not one shot — iterate: think → act → observe → repeat.

What you will learn

Define agent vs chatbot vs workflow vs plain LLM.
Name every part of the observe–think–act loop.
Walk through a worked example (travel planning) step by step.
Decide when agents are worth the complexity — and when they are not.
List failure modes you must design for in production.

Before this lesson

Module 8 welcome — includes a key terms table (DAG, ReAct, MCP, etc.)
Module 7 — LLM basics
Module 7 — RAG chatbot project (helpful context)

Four ways to use an LLM (do not confuse them)

Pattern	What happens	Example
Plain LLM	Prompt in → text out	“Explain what RAG means.”
RAG chatbot	Retrieve docs → prompt with context → answer	“What is our refund policy?”
Workflow	You fixed the steps; LLM may fill one node	Ingest PDF → chunk → embed → query (fixed pipeline — see DAG in welcome)
Agent	Model chooses next step within tool rules	“Book the cheapest dry-weather weekend for my family.”

Key distinction: in a workflow, the engineer encoded the path. In an agent, the model discovers the path within constraints you define (available tools, max steps, safety rules).

Many production systems are hybrids: a workflow skeleton with agent nodes only where flexibility is needed.

Anatomy of an agent

Every agent system has these pieces:

Component	Role
LLM	Reasons about the goal, picks tools, synthesizes final answer
Tools	Functions your code runs — APIs, DB queries, calculators, file reads
Control loop	Your app: call LLM → if tool request → execute → feed result back → repeat
Memory (optional)	Session history, user prefs, scratchpad state (Lesson 4)
Guardrails	Max iterations, timeouts, allowlist (only approved tools/paths), human approval (Lesson 8)

Without the control loop, you only have a chatbot that talks about calling APIs. The loop is what makes it an agent.

The observe–think–act loop (detailed)

Each iteration of the loop:

Observe — What does the model see? User message, prior tool results, memory snippets, system rules.
Think — Model decides: answer now, or call a tool? (May emit “thought” text — see ReAct in Lesson 3.)
Act — Your app runs the tool with validated arguments.
Observe again — Tool result is appended to context. Loop continues.

Stop conditions:

Model returns a final answer (no more tool calls).
Max steps reached (e.g. 10 iterations).
Timeout or token budget exhausted.
Human approves or cancels (high-risk flows).

Worked example — “Weekend in Rome with kids”

User: “Plan Sat–Sun in Rome, kid-friendly, avoid rain if possible.”

Step	Type	Content
1	Think	“I need weather for Rome this weekend.”
2	Act	`get_weather(city="Rome", date="2025-06-28")`
3	Observe	`{"sat": "rain", "sun": "clear"}`
4	Think	“Rain Saturday — indoor options; Sunday outdoor.”
5	Act	`search_events(city="Rome", tags=["family"], day="sunday")`
6	Observe	`[{"name": "Explora Museum", ...}]`
7	Think	“Enough data — draft itinerary.”
8	Final	Structured 2-day plan citing weather + events

8 LLM-related steps vs 1 for a simple RAG question. That is the trade-off: flexibility and freshness vs latency and cost.

Agent vs workflow (exam style)

DAG = Directed Acyclic Graph — a fixed flowchart of steps (A → B → C) with no loops. A workflow is usually implemented as a DAG.

	Workflow	Agent
Who picks the next step?	Engineer (fixed DAG)	Model (within tool set)
Predictability	High	Lower
Handles surprises	Poor — branch per edge case	Better — replan on failure
Latency	Often lower (fixed path)	Higher (variable steps)
Debugging	Easier — same path every time	Harder — traces vary
Best for	Compliance pipelines (e.g. verify identity step-by-step), data import jobs	Open-ended research, planning, coding

Interview answer: “We use a workflow when steps are known and audited; we add an agent when the user goal is underspecified or depends on live tool results.”

When agents help

Use an agent when all or most of these are true:

Task needs live external data (weather, inventory, prices, calendar).
Intermediate results change the next step (rain → indoor activities).
User goal is underspecified (“something fun near me”).
Steps are hard to enumerate in advance (debugging code, research).

When not to use an agent

Situation	Better approach
Single doc Q&A with good retrieval	Module 7 RAG — one retrieval + one LLM call
Strict regulatory sequence (every step audited)	Deterministic workflow + human sign-off
Sub-200ms latency requirement	Precomputed answers or tiny model, not multi-step agent
Tiny tool set, fixed order always	Workflow with one LLM summarization node
Cost-sensitive at huge scale	Cache tool results; agent only when cache misses

Rule of thumb: if you can draw a fixed flowchart that covers 95% of cases, start with a workflow. Add agentic loops only where the chart would have dozens of “if API fails” branches.

How agents differ from “plain LLMs”

A plain LLM (Module 7) predicts text from context. It has no guaranteed access to:

Today’s weather
Your private database
Whether a file exists on disk

It may hallucinate those facts. Agents reduce that risk by grounding decisions in tool observations — though the model can still misinterpret tool output.

	Plain LLM	Agent
Facts	From training cutoff + prompt	From tool results when used correctly
Actions	Text only	Can trigger real side effects
Multi-step	Simulated in one reply	Actually executed step by step
Risk	Wrong facts	Wrong facts and wrong API calls

Failure modes (design for these on day one)

Runaway loops

Model keeps calling tools without finishing — burns tokens and money.

Mitigation: max_iterations=10, cost cap per session, detect repeated identical tool calls.

Wrong tool arguments

get_weather(city="Pariss") → 404. Model may retry or give up.

Mitigation: validate args in your code before HTTP; return clear error strings to the model (Lesson 2).

Silent tool failures

API timeout returns empty; model invents data.

Mitigation: always return structured errors; never swallow exceptions without logging.

Over-trust

User assumes the agent “checked” flights when it only guessed.

Mitigation: show reasoning trace and which tools ran; cite tool timestamps in UI (your travel project).

Prompt injection via tools

Malicious webpage content in a fetch_url tool becomes part of context.

Mitigation: sanitize tool outputs; separate system rules from untrusted content.

Real-world agent examples

Domain	Goal	Typical tools
Travel	Itinerary	Weather, maps, events, hotels
Coding	Fix bug	Read file, run tests, edit, git
Support	Resolve ticket	CRM lookup, refund API, knowledge base
Analytics	Answer data question	SQL, chart render, export CSV

Your Module 8 project is travel — the same loop pattern applies everywhere.

Check yourself

Before Lesson 2, you should be able to answer:

Name the four components of an agent (LLM, tools, loop, …).
Why is RAG alone not enough for “plan my trip”?
What stops an agent from looping forever?
Give one case where a workflow beats an agent.

What's next

Lesson 2 — Tools & function calling — how the model actually requests actions and how your app runs them safely.