Welcome to Module 3: neural network basics
Before we begin
Phases 1 and 2 gave you gradient descent, classification, and honest evaluation. Module 3 is where models stop being one formula and become layers of neurons — the same idea behind modern vision and language systems, at a scale you can still trace by hand.
How do stacked neurons learn digits, words, and patterns in images?
You will learn the forward pass, backpropagation, and classification loss, then train a network on MNIST and build a draw-a-digit UI that predicts in the browser.
Figure: Module 3 at a glance
What Module 3 covers
| Topic | What you will understand |
|---|---|
| Perceptron | One neuron: weights, bias, linear boundaries |
| Activations | ReLU vs sigmoid, non-linearity, vanishing gradients |
| Forward propagation | Input → hidden → output predictions |
| Backpropagation | Chain rule gradients for every weight |
| Loss functions | Softmax + cross-entropy for multi-class digits |
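To make the table concrete, here is a minimal sketch of one forward and backward pass through a tiny network (2 inputs, 2 ReLU hidden units, 3 output classes) in plain Python, small enough to trace by hand. The weights, the input, and every function name here are illustrative assumptions, not the code used in the course project, which relies on PyTorch.

```python
import math

def affine(W, b, x):
    # Each row of W holds one neuron's weights: returns W @ x + b.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def relu(z):
    return [max(0.0, v) for v in z]

def softmax(z):
    # Subtract the max logit for numerical stability.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Negative log-probability assigned to the true class.
    return -math.log(probs[label])

def forward(x, W1, b1, W2, b2):
    z1 = affine(W1, b1, x)        # hidden pre-activations
    h = relu(z1)                  # hidden activations
    logits = affine(W2, b2, h)    # one score per class
    return z1, h, softmax(logits)

def backward(x, label, z1, h, probs, W2):
    # Softmax + cross-entropy gradient w.r.t. the logits: probs - one_hot(label).
    d_logits = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    dW2 = [[d * hj for hj in h] for d in d_logits]
    db2 = d_logits[:]
    # Chain rule back through the output layer, then through ReLU.
    d_h = [sum(W2[k][j] * d_logits[k] for k in range(len(W2))) for j in range(len(h))]
    d_z1 = [dh if z > 0 else 0.0 for dh, z in zip(d_h, z1)]
    dW1 = [[d * xi for xi in x] for d in d_z1]
    db1 = d_z1[:]
    return dW1, db1, dW2, db2

# Illustrative values only (assumed, not from the course).
x, label = [1.0, 2.0], 1
W1, b1 = [[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]], [0.0, 0.0, 0.0]

z1, h, probs = forward(x, W1, b1, W2, b2)
loss = cross_entropy(probs, label)
dW1, db1, dW2, db2 = backward(x, label, z1, h, probs, W2)
print("probs:", probs)
print("loss:", loss)
```

The first hidden unit has a negative pre-activation, so ReLU zeroes it and no gradient reaches its weights, which is the same mechanism that the activations lesson examines.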
Before you start
Required: Module 2 project or equivalent comfort with train/test splits and classification metrics.
Install before the project:
- Python 3.10+: `pip install torch torchvision matplotlib numpy`
- This site’s stack already includes Next.js for the draw UI
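One way to confirm the setup before the project is a quick check like the sketch below. The helper name `check_environment` is hypothetical; it only looks the packages up with the standard library and does not import them.

```python
import sys
from importlib.util import find_spec

# Packages named in the install command above.
REQUIRED = ["torch", "torchvision", "matplotlib", "numpy"]

def check_environment(packages=REQUIRED, min_python=(3, 10)):
    """Return (python_ok, {package: found?}) without importing the packages."""
    python_ok = sys.version_info >= min_python
    status = {name: find_spec(name) is not None for name in packages}
    return python_ok, status

python_ok, status = check_environment()
print("Python 3.10+:", python_ok)
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```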
Lessons 1–6 are reading. Lesson 8 is the coding project.
How this connects to earlier phases
| Earlier idea | Module 3 use |
|---|---|
| Gradient descent | Still the update rule — backprop computes all gradients |
| Logistic regression | One neuron + sigmoid; networks stack many neurons |
| Cross-entropy (intro in Module 2) | Standard loss for digit classes 0–9 |
| Overfitting | Watch train vs validation accuracy on MNIST |
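The first two rows of this table can be sketched in a few lines: logistic regression written as a single sigmoid neuron and trained with plain gradient descent. The OR-gate data, learning rate, and step count below are assumptions chosen for illustration, not values from the course.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # One neuron: weighted sum plus bias, passed through a sigmoid.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def mean_loss(w, b, data):
    # Binary cross-entropy averaged over the dataset.
    total = 0.0
    for x, y in data:
        p = predict(w, b, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)

def train(data, lr=0.5, steps=2000):
    w, b = [0.0, 0.0], 0.0
    n = len(data)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in data:
            err = predict(w, b, x) - y  # dL/dz for sigmoid + cross-entropy
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        # Gradient descent update: step against the mean gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy dataset (assumed): the logical OR of two binary inputs.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

initial_loss = mean_loss([0.0, 0.0], 0.0, data)
w, b = train(data)
final_loss = mean_loss(w, b, data)
print("loss before:", initial_loss, "after:", final_loss)
```

A network keeps this same update rule; the difference is that backpropagation supplies the gradient for every weight in every layer instead of for a single neuron.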