← Back to curriculum

Module 8 — Agentic AI

What is an AI agent?

Agents vs chatbots vs workflows vs plain LLMs, the observe–think–act loop with a worked travel example, when to use agents, and failure modes.

~90 min read + exercises

What is an AI agent?

Before we begin

Module 7 taught you to build a RAG chatbot: one user question → retrieve docs → one LLM answer. That pattern is powerful — but many real tasks are not one shot.

“Plan a 4-day trip to Rome with kid-friendly activities under $2,000.”

A chatbot might guess an itinerary from training data. An agent will look up weather, check flight APIs, search events, notice rain on day 2, and revise the plan — across many LLM calls and tool executions.

An AI agent = an LLM inside a control loop that can act on the world (via tools), observe results, and continue until a goal is met or limits are hit.

Figure

The agent loop

Agent loop: goal → think → tool → observe → repeatGoalThinkAct (tool)Observe
Not one shot — iterate: think → act → observe → repeat.

What you will learn

  • Define agent vs chatbot vs workflow vs plain LLM.
  • Name every part of the observe–think–act loop.
  • Walk through a worked example (travel planning) step by step.
  • Decide when agents are worth the complexity — and when they are not.
  • List failure modes you must design for in production.

Before this lesson


Four ways to use an LLM (do not confuse them)

PatternWhat happensExample
Plain LLMPrompt in → text out“Explain what RAG means.”
RAG chatbotRetrieve docs → prompt with context → answer“What is our refund policy?”
WorkflowYou fixed the steps; LLM may fill one nodeIngest PDF → chunk → embed → query (fixed pipeline — see DAG in welcome)
AgentModel chooses next step within tool rules“Book the cheapest dry-weather weekend for my family.”

Key distinction: in a workflow, the engineer encoded the path. In an agent, the model discovers the path within constraints you define (available tools, max steps, safety rules).

Many production systems are hybrids: a workflow skeleton with agent nodes only where flexibility is needed.


Anatomy of an agent

Every agent system has these pieces:

ComponentRole
LLMReasons about the goal, picks tools, synthesizes final answer
ToolsFunctions your code runs — APIs, DB queries, calculators, file reads
Control loopYour app: call LLM → if tool request → execute → feed result back → repeat
Memory (optional)Session history, user prefs, scratchpad state (Lesson 4)
GuardrailsMax iterations, timeouts, allowlist (only approved tools/paths), human approval (Lesson 8)

Without the control loop, you only have a chatbot that talks about calling APIs. The loop is what makes it an agent.


The observe–think–act loop (detailed)

Each iteration of the loop:

  1. Observe — What does the model see? User message, prior tool results, memory snippets, system rules.
  2. Think — Model decides: answer now, or call a tool? (May emit “thought” text — see ReAct in Lesson 3.)
  3. Act — Your app runs the tool with validated arguments.
  4. Observe again — Tool result is appended to context. Loop continues.

Stop conditions:

  • Model returns a final answer (no more tool calls).
  • Max steps reached (e.g. 10 iterations).
  • Timeout or token budget exhausted.
  • Human approves or cancels (high-risk flows).

Worked example — “Weekend in Rome with kids”

User: “Plan Sat–Sun in Rome, kid-friendly, avoid rain if possible.”

StepTypeContent
1Think“I need weather for Rome this weekend.”
2Actget_weather(city="Rome", date="2025-06-28")
3Observe{"sat": "rain", "sun": "clear"}
4Think“Rain Saturday — indoor options; Sunday outdoor.”
5Actsearch_events(city="Rome", tags=["family"], day="sunday")
6Observe[{"name": "Explora Museum", ...}]
7Think“Enough data — draft itinerary.”
8FinalStructured 2-day plan citing weather + events

8 LLM-related steps vs 1 for a simple RAG question. That is the trade-off: flexibility and freshness vs latency and cost.


Agent vs workflow (exam style)

DAG = Directed Acyclic Graph — a fixed flowchart of steps (A → B → C) with no loops. A workflow is usually implemented as a DAG.

WorkflowAgent
Who picks the next step?Engineer (fixed DAG)Model (within tool set)
PredictabilityHighLower
Handles surprisesPoor — branch per edge caseBetter — replan on failure
LatencyOften lower (fixed path)Higher (variable steps)
DebuggingEasier — same path every timeHarder — traces vary
Best forCompliance pipelines (e.g. verify identity step-by-step), data import jobsOpen-ended research, planning, coding

Interview answer: “We use a workflow when steps are known and audited; we add an agent when the user goal is underspecified or depends on live tool results.”


When agents help

Use an agent when all or most of these are true:

  • Task needs live external data (weather, inventory, prices, calendar).
  • Intermediate results change the next step (rain → indoor activities).
  • User goal is underspecified (“something fun near me”).
  • Steps are hard to enumerate in advance (debugging code, research).

When not to use an agent

SituationBetter approach
Single doc Q&A with good retrievalModule 7 RAG — one retrieval + one LLM call
Strict regulatory sequence (every step audited)Deterministic workflow + human sign-off
Sub-200ms latency requirementPrecomputed answers or tiny model, not multi-step agent
Tiny tool set, fixed order alwaysWorkflow with one LLM summarization node
Cost-sensitive at huge scaleCache tool results; agent only when cache misses

Rule of thumb: if you can draw a fixed flowchart that covers 95% of cases, start with a workflow. Add agentic loops only where the chart would have dozens of “if API fails” branches.


How agents differ from “plain LLMs”

A plain LLM (Module 7) predicts text from context. It has no guaranteed access to:

  • Today’s weather
  • Your private database
  • Whether a file exists on disk

It may hallucinate those facts. Agents reduce that risk by grounding decisions in tool observations — though the model can still misinterpret tool output.

Plain LLMAgent
FactsFrom training cutoff + promptFrom tool results when used correctly
ActionsText onlyCan trigger real side effects
Multi-stepSimulated in one replyActually executed step by step
RiskWrong factsWrong facts and wrong API calls

Failure modes (design for these on day one)

Runaway loops

Model keeps calling tools without finishing — burns tokens and money.

Mitigation: max_iterations=10, cost cap per session, detect repeated identical tool calls.

Wrong tool arguments

get_weather(city="Pariss") → 404. Model may retry or give up.

Mitigation: validate args in your code before HTTP; return clear error strings to the model (Lesson 2).

Silent tool failures

API timeout returns empty; model invents data.

Mitigation: always return structured errors; never swallow exceptions without logging.

Over-trust

User assumes the agent “checked” flights when it only guessed.

Mitigation: show reasoning trace and which tools ran; cite tool timestamps in UI (your travel project).

Prompt injection via tools

Malicious webpage content in a fetch_url tool becomes part of context.

Mitigation: sanitize tool outputs; separate system rules from untrusted content.


Real-world agent examples

DomainGoalTypical tools
TravelItineraryWeather, maps, events, hotels
CodingFix bugRead file, run tests, edit, git
SupportResolve ticketCRM lookup, refund API, knowledge base
AnalyticsAnswer data questionSQL, chart render, export CSV

Your Module 8 project is travel — the same loop pattern applies everywhere.


Check yourself

Before Lesson 2, you should be able to answer:

  1. Name the four components of an agent (LLM, tools, loop, …).
  2. Why is RAG alone not enough for “plan my trip”?
  3. What stops an agent from looping forever?
  4. Give one case where a workflow beats an agent.

What's next

Lesson 2 — Tools & function calling — how the model actually requests actions and how your app runs them safely.