Welcome to Module 7 — GenAI & LLMs

Before we begin

Module 6 taught how transformers work. Module 7 teaches how people use them in products: the LLM lifecycle, prompt engineering, fine-tuning, RAG engineering, automated workflows, and trust and safety.

You are entering modern AI engineering — not just training models, but shipping assistants that answer from your data.

Figure: Module 7 at a glance. LLM lifecycle through RAG engineering, a quiz, then a hands-on RAG chatbot project.

What Module 7 covers

Topic	What you will understand
LLM basics & lifecycle	Pre-training, post-training, inference
Prompt engineering	System prompts, examples, structured outputs
Fine-tuning & quantization	LoRA, INT8/GPTQ for business workloads
AI-automated workflows	Coding agents, skills, subagents
RAG engineering	Chunking, ingestion, reranking, vector DBs
Trust & safety	Hallucinations, guardrails, citations

Key terms (read once)

Term	Plain English
SFT	Supervised fine-tuning — training on (prompt, ideal answer) pairs
RLHF	Reinforcement learning from human feedback — tuning the model to prefer the answers human raters score higher
LoRA	Low-rank adaptation — a cheap fine-tuning method that trains small adapter weights while the base model stays frozen
Quantization	Storing weights in fewer bits (e.g. INT8 = 8-bit integers) for faster/cheaper inference
GPTQ / AWQ	Popular 4-bit quantization methods for running large models on one GPU
RAG	Retrieval-augmented generation — fetch relevant docs, then generate an answer
BM25	Classic keyword search score — good for exact product codes and names
Hybrid search	Combine keyword (BM25) + meaning (embeddings)
Reranking	Re-score top search hits with a slower, more accurate model
Hallucination	Model states a confident “fact” that is not true or not in your sources
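
To make the quantization entry concrete, here is a minimal sketch of symmetric INT8 quantization on a handful of made-up weights. It is an illustration only: the function names and sample values are hypothetical, and production methods such as GPTQ and AWQ are considerably more involved than this per-tensor rounding.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float weights from the stored integers."""
    return [v * scale for v in q]


# Made-up example weights (not taken from any real model).
weights = [0.12, -0.5, 0.031, 0.9, -0.27]

q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 values:", q)
print("scale:", scale)
print("max round-trip error:", max_error)
```

Each weight now needs one byte plus a single shared scale, instead of four bytes as a 32-bit float. The price is the small round-trip error printed at the end.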

Before you start

Required: Module 6 or solid transformer/token intuition.

For the project:

An OpenAI-compatible API key, or a local model served through Ollama
pip install faiss-cpu openai pypdf (adjust to your stack)
A chat UI built with Next.js, which already powers this site

Lessons 1–10 are reading. Lesson 11 is the RAG chatbot project.
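
As a preview of the retrieve-then-generate loop behind RAG, the sketch below scores three made-up documents with BM25 and assembles a prompt that cites them. It uses only the standard library. The function names, sample documents, and prompt wording are illustrative assumptions, not the project's actual code, and the final step of sending the prompt to a model is left out on purpose.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and trim basic punctuation."""
    return [t.strip(".,?!") for t in text.lower().split()]


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with the BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def build_prompt(question, docs, top_k=2):
    """Retrieve the best-matching docs and wrap them in a citing prompt."""
    scores = bm25_scores(question, docs)
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    hits = [i for i in ranked[:top_k] if scores[i] > 0]
    context = "\n".join(f"[{i + 1}] {docs[i]}" for i in hits)
    prompt = (
        "Answer using only the numbered sources below and cite them.\n"
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return prompt, hits


# Made-up documents standing in for an ingested knowledge base.
docs = [
    "Order SKU-4411 ships within two business days.",
    "Refunds are issued to the original payment method.",
    "Our office is closed on public holidays.",
]

prompt, hits = build_prompt("When does SKU-4411 ship?", docs)
print(prompt)
```

In a real pipeline the printed prompt would go to whichever model you set up above, and keyword scoring would typically be combined with embedding search and a reranker. The exact product code in the query shows where BM25 is strong: it matches "SKU-4411" literally.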

Ready?

Lesson 1 — LLM basics