Welcome to Module 7 — GenAI & LLMs
Before we begin
Module 6 taught how transformers work. Module 7 teaches how people use them in products — lifecycle, fine-tuning, RAG engineering, automated workflows, and trust.
You are entering modern AI engineering — not just training models, but shipping assistants that answer from your data.
Figure: Module 7 at a glance
What Module 7 covers
| Topic | What you will understand |
|---|---|
| LLM basics & lifecycle | Pre-training, post-training, inference |
| Prompt engineering | System prompts, examples, structured outputs |
| Fine-tuning & quantization | LoRA, INT8/GPTQ for business workloads |
| AI-automated workflows | Coding agents, skills, subagents |
| RAG engineering | Chunking, ingestion, reranking, vector DBs |
| Trust & safety | Hallucinations, guardrails, citations |
Key terms (read once)
| Term | Plain English |
|---|---|
| SFT | Supervised fine-tuning — training on (prompt, ideal answer) pairs |
| RLHF | Reinforcement learning from human feedback — tuning the model to prefer the answers people rate higher |
| LoRA | Low-rank adaptation — cheap fine-tune that adds small adapter weights |
| Quantization | Storing weights in fewer bits (e.g. INT8 = 8-bit integers) for faster/cheaper inference |
| GPTQ / AWQ | Popular 4-bit quantization methods for running large models on one GPU |
| RAG | Retrieval-augmented generation — fetch relevant docs, then generate an answer |
| BM25 | Classic keyword search score — good for exact product codes and names |
| Hybrid search | Combine keyword (BM25) + meaning (embeddings) |
| Reranking | Re-score top search hits with a slower, more accurate model |
| Hallucination | Model states a confident “fact” that is not true or not in your sources |
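To make the quantization row concrete, here is a minimal sketch of the arithmetic behind "storing weights in fewer bits". It assumes the simplest possible scheme: one shared scale for the whole list and a symmetric integer range of -127 to 127. The weight values and the function names `quantize_int8` and `dequantize` are made up for illustration. Methods such as GPTQ and AWQ are considerably more involved, so treat this as intuition only, not as how those tools work.

```python
# Toy symmetric INT8 quantization of a small weight list (pure Python).
# Assumption: a single scale for all weights; real toolchains are more careful.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer step, then clamp to the INT8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]


weights = [0.52, -1.27, 0.003, 0.9]  # hypothetical example weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", q)
print("restored: ", restored)
print("max error:", max_error)
```

Each weight now fits in one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half a scale step. In this example the tiny weight 0.003 rounds to 0, which is the kind of precision loss the quantization lesson discusses.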
Before you start
Required: Module 6 or solid transformer/token intuition.
For the project:
- OpenAI-compatible API key or local model (Ollama)
- `pip install faiss-cpu openai pypdf` (adjust to your stack)
- Next.js already powers this site for the chat UI
Lessons 1–10 are reading. Lesson 11 is the RAG chatbot project.