Welcome to Module 8 — agentic AI
Before we begin
Module 7 gave you RAG chatbots — one model, retrieved context, one answer. Agentic AI adds loops: the model decides which tools to call, observes results, and continues until a goal is met.
Interview focus: agents, tool calling, MCP, ReAct, evals, multi-agent orchestration, production design.
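To make the loop idea concrete before the lessons start, here is a minimal sketch of "decide, call a tool, observe, continue". It is not a real agent framework. `fake_model`, `get_weather`, and `run_agent` are hypothetical names invented for illustration, and the "model" is a hard-coded function so the example runs without an API key.

```python
from typing import Callable


def get_weather(city: str) -> str:
    """Hypothetical tool: returns a canned answer instead of calling a real API."""
    return f"Sunny in {city}"


# Tools the stand-in model is allowed to request, keyed by name.
TOOLS: dict[str, Callable[..., str]] = {"get_weather": get_weather}


def fake_model(goal: str, observations: list[str]) -> dict:
    """Stand-in for an LLM call: pick the next action from what has been observed."""
    if not observations:
        # Nothing observed yet, so ask for a tool call.
        return {"action": "call_tool", "tool": "get_weather", "args": {"city": "Paris"}}
    # A tool result is available, so the goal counts as met.
    return {"action": "finish", "answer": f"{goal} -> {observations[-1]}"}


def run_agent(goal: str, max_steps: int = 5) -> str:
    observations: list[str] = []
    for _ in range(max_steps):  # hard cap so the loop always terminates
        decision = fake_model(goal, observations)
        if decision["action"] == "finish":
            return decision["answer"]
        tool = TOOLS[decision["tool"]]
        observations.append(tool(**decision["args"]))  # observe the tool result
    return "Stopped: step limit reached"


print(run_agent("Weather in Paris?"))
```

The step cap is the design choice worth noticing: a plain RAG call always ends after one answer, while a loop needs an explicit stopping rule in case the model never decides it is finished.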
Figure: Module 8 at a glance
What Module 8 covers
| Topic | What you will understand |
|---|---|
| Agents | Loops, goals, vs plain LLMs and fixed workflows |
| Tools | Function calling, schemas, execution |
| Planning vs execution | ReAct, specialized roles |
| MCP & context | Tool servers, prompt assembly, memory |
| Multi-agent | LangChain, LangGraph orchestration |
| Evals | LLM-as-judge, SME sets, regression gates |
| System design | Deploy, scale, observability, guardrails |
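As a preview of the Tools row above, the sketch below shows the three pieces named there: a schema that describes a function, the function itself, and the code that executes a call the model requested. The schema follows the JSON Schema style that many function-calling APIs accept, but the exact wire format differs between providers. `WEATHER_TOOL`, `REGISTRY`, and `execute_tool_call` are hypothetical names, and the weather data is canned.

```python
import json

# Hypothetical tool description in JSON Schema style.
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}  # canned data, not a real API call


# Maps a tool name to its schema and its Python implementation.
REGISTRY = {"get_weather": (WEATHER_TOOL, get_weather)}


def execute_tool_call(name: str, arguments_json: str) -> dict:
    """Check and run one tool call; return an error dict instead of raising."""
    if name not in REGISTRY:
        return {"error": f"unknown tool: {name}"}
    schema, func = REGISTRY[name]
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError:
        return {"error": "arguments are not valid JSON"}
    if not isinstance(args, dict):
        return {"error": "arguments must be a JSON object"}
    params = schema["parameters"]
    missing = [key for key in params["required"] if key not in args]
    unknown = [key for key in args if key not in params["properties"]]
    if missing or unknown:
        return {"error": f"missing={missing} unknown={unknown}"}
    return func(**args)


print(execute_tool_call("get_weather", '{"city": "Oslo"}'))
```

Returning an error dict rather than raising lets the agent loop feed the problem back to the model as an observation, so it can correct the call on the next step.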
Key terms (read once)
Lessons use industry shorthand. If a word is new, check here first — each lesson also explains it again in context.
| Term | Plain English |
|---|---|
| DAG | Directed Acyclic Graph — a fixed flowchart of steps (A → B → C) with no loops back; the engineer decides every branch in advance |
| Workflow | Code that runs steps in a predetermined order (often a DAG) |
| ReAct | Reason + Act — the model alternates thinking in text and calling tools (Lesson 3) |
| MCP | Model Context Protocol — a standard way to plug files, APIs, and databases into agents (Lesson 5) |
| Handoff | One agent passing a structured summary to another — not the entire chat history |
| Orchestration | Coordinating multiple LLM calls, tools, and shared state in the right order |
| Eval | A repeatable test case + scoring rule (like a unit test for your AI feature) |
| SME | Subject-matter expert — someone who knows the domain and writes gold reference answers |
| LLM-as-judge | A second model scores the first model’s output against a rubric |
| CI | Continuous integration — automated tests that run when you push code (e.g. GitHub Actions) |
| JSONL | JSON Lines — a text file with one JSON object per line (easy to append eval cases) |
| Fixture | Canned fake API response used in tests so results are predictable |
| Idempotent | Safe to retry: running the same action twice does not duplicate charges or records |
| Stateless (server) | The app server stores no session in RAM; session data lives in Redis or a database |
| Allowlist | Only listed tools, paths, or actions are allowed; everything else is blocked |
| PII | Personally identifiable information — email, passport number, home address, etc. |
| TTL | Time to live — how long data is kept before it expires (cache, session, memory) |
| p95 latency | The response time that 95% of requests come in under; only the slowest 5% take longer. A common way to measure the “typical slow case” |
| Circuit breaker | Auto-pause a feature when cost or errors cross a threshold (like tripping a fuse) |
| Regression | A change that breaks something that used to work — evals catch regressions before users do |
| SSE | Server-sent events — the server pushes progress updates to the browser over HTTP |
| stdio | Standard input/output (stdin/stdout): the agent starts a local process and talks to it through these pipes (common for dev MCP servers) |
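Several of the terms above (eval, JSONL, fixture, regression) fit together in one small test harness. The sketch below is an illustration under simple assumptions: the eval cases, the canned answers, the substring scoring rule, and the 0.9 threshold are all invented for this example, and `system_under_test` stands in for a real model call.

```python
import json

# Hypothetical eval cases in JSONL form: one JSON object per line.
EVAL_JSONL = """\
{"input": "capital of France", "expected": "Paris"}
{"input": "capital of Japan", "expected": "Tokyo"}
"""

# Fixture: canned answers so the test result is predictable.
FIXTURE_ANSWERS = {
    "capital of France": "The capital is Paris.",
    "capital of Japan": "Tokyo is the capital.",
}


def system_under_test(question: str) -> str:
    """Stand-in for the AI feature being evaluated."""
    return FIXTURE_ANSWERS.get(question, "")


def run_evals(jsonl_text: str) -> float:
    """Return the fraction of cases whose expected text appears in the answer."""
    cases = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    passed = sum(
        case["expected"].lower() in system_under_test(case["input"]).lower()
        for case in cases
    )
    return passed / len(cases)


def regression_gate(pass_rate: float, threshold: float = 0.9) -> bool:
    """True when the pass rate is high enough to let a change through CI."""
    return pass_rate >= threshold


rate = run_evals(EVAL_JSONL)
print(rate, regression_gate(rate))
```

Substring matching is the crudest possible scoring rule. It is used here only to keep the example short, and a rubric scored by an LLM-as-judge would replace it where answers are free-form.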
Already from earlier modules: RAG (retrieve docs then answer), LLM (large language model), token (a chunk of text the model reads), embedding (a list of numbers representing meaning).
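Of the key terms, "idempotent" matters most once an agent can retry a failed tool call. One common way to get that property is an idempotency key, sketched below. `BookingService` and its fields are hypothetical, and a real service would keep the keys in a database rather than in memory.

```python
class BookingService:
    """Hypothetical service that deduplicates requests by idempotency key."""

    def __init__(self) -> None:
        self._results: dict[str, dict] = {}  # key -> result of the first call
        self.charges = 0                     # total amount charged so far

    def book(self, idempotency_key: str, amount: int) -> dict:
        # A repeated key returns the stored result without charging again.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges += amount
        result = {"status": "booked", "amount": amount}
        self._results[idempotency_key] = result
        return result


service = BookingService()
first = service.book("trip-42", 100)
retry = service.book("trip-42", 100)  # e.g. the agent retried after a timeout
print(first == retry, service.charges)
```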
Before you start
Required: Module 7 RAG project or comfort with LLM APIs and prompts.
For the project:
- LLM API with tool/function calling support
- Weather API key (e.g. OpenWeatherMap) — free tier OK
- Optional: maps/geocoding API
pip install langgraph langchain-openai (optional but recommended)