Welcome to Module 3: neural network basics

Before we begin

Modules 1 and 2 gave you gradient descent, classification, and honest evaluation. In Module 3, models stop being a single formula and become layers of neurons: the same idea behind modern vision and language systems, at a scale you can still trace by hand.

How do stacked neurons learn digits, words, and patterns in images?

You will learn the forward pass, backpropagation, and classification loss, then train a network on MNIST and build a draw-a-digit UI that predicts in the browser.

Figure: Module 3 at a glance. Five core lessons, a quiz, then MNIST + live prediction UI.

What Module 3 covers

Topic	What you will understand
Perceptron	One neuron: weights, bias, linear boundaries
Activations	ReLU vs sigmoid, non-linearity, vanishing gradients
Forward propagation	Input → hidden → output predictions
Backpropagation	Chain rule gradients for every weight
Loss functions	Softmax + cross-entropy for multi-class digits
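
To make the topics in this table concrete, here is a minimal sketch in plain Python of one forward pass through a tiny network (2 inputs, 3 ReLU hidden neurons, 3 output classes), followed by softmax, cross-entropy, and the first backpropagation step. The layer sizes, weights, input, and helper names (`dense`, `forward`, `loss_from_logits`) are made up for illustration and are not taken from the lessons. The sketch relies on the standard result that, for softmax with cross-entropy, the gradient of the loss with respect to the logits is the predicted probabilities minus the one-hot target, and it checks that result numerically.

```python
import math

def relu(values):
    # ReLU activation: negative pre-activations become 0
    return [max(0.0, v) for v in values]

def dense(weights, biases, inputs):
    # One layer: each row of `weights` holds the weights of one neuron
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def softmax(logits):
    # Subtract the max logit first for numerical stability
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target):
    # Negative log-probability assigned to the correct class
    return -math.log(probs[target])

def forward(x, params):
    # Input -> hidden (ReLU) -> output logits -> class probabilities
    W1, b1, W2, b2 = params
    hidden = relu(dense(W1, b1, x))
    logits = dense(W2, b2, hidden)
    return hidden, logits, softmax(logits)

# Hypothetical weights, chosen by hand for this example only
W1 = [[0.5, -0.4], [0.1, 0.4], [-0.3, 0.8]]
b1 = [0.0, 0.1, -0.1]
W2 = [[0.3, -0.5, 0.2], [-0.4, 0.6, 0.1], [0.2, 0.1, -0.7]]
b2 = [0.0, 0.0, 0.0]
params = (W1, b1, W2, b2)

x = [1.0, 2.0]   # made-up input
target = 1       # made-up correct class

hidden, logits, probs = forward(x, params)
loss = cross_entropy(probs, target)

# First step of backpropagation: dLoss/dlogits = probs - one_hot(target)
grad_logits = [p - (1.0 if k == target else 0.0) for k, p in enumerate(probs)]

def loss_from_logits(z):
    return cross_entropy(softmax(z), target)

# Central finite differences as an independent check of the analytic gradient
eps = 1e-6
numeric_grad = []
for k in range(len(logits)):
    up = list(logits)
    down = list(logits)
    up[k] += eps
    down[k] -= eps
    numeric_grad.append((loss_from_logits(up) - loss_from_logits(down)) / (2 * eps))

print("probabilities:", probs)
print("loss:", loss)
print("analytic gradient:", grad_logits)
print("numeric gradient: ", numeric_grad)
```

The two printed gradients should agree to several decimal places. Backpropagation continues from `grad_logits` by applying the chain rule backward through `W2`, the ReLU, and `W1`.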

Before you start

Required: Module 2 project or equivalent comfort with train/test splits and classification metrics.

Install before the project:

Python 3.10+ — pip install torch torchvision matplotlib numpy
This site’s stack already includes Next.js for the draw UI
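
If you want to confirm the install before starting the project, a quick check along these lines may help. It is only a sketch: the helper name `missing_packages` is hypothetical, and it only tests whether each package can be found by Python, not whether it works with your hardware.

```python
import importlib.util
import sys

# Package names taken from the pip install line above
REQUIRED = ["torch", "torchvision", "matplotlib", "numpy"]

def missing_packages(names):
    # find_spec returns None when a top-level module cannot be located
    return [name for name in names if importlib.util.find_spec(name) is None]

python_ok = sys.version_info >= (3, 10)
missing = missing_packages(REQUIRED)

print("Python 3.10+:", "yes" if python_ok else "no")
print("Missing packages:", ", ".join(missing) if missing else "none")
```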

Lessons 1–6 are reading. Lesson 8 is the coding project.

How this connects to earlier modules

Earlier idea	Module 3 use
Gradient descent	Still the update rule — backprop computes all gradients
Logistic regression	One neuron + sigmoid; networks stack many neurons
Cross-entropy (intro in Module 2)	Standard loss for digit classes 0–9
Overfitting	Watch train vs validation accuracy on MNIST
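
The first two rows of this table can be sketched in a few lines of plain Python: logistic regression written as a single neuron with a sigmoid, trained with the same gradient descent update rule. The dataset (the logical AND of two inputs), the learning rate, the epoch count, and the function names are assumptions chosen for illustration, not values from the course.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # One neuron: weighted sum of inputs plus bias, passed through a sigmoid
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def mean_loss(w, b, data):
    # Binary cross-entropy averaged over the dataset
    total = 0.0
    for x, y in data:
        p = predict(w, b, x)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(data)

def train(data, lr=0.5, epochs=5000):
    # Full-batch gradient descent starting from zero weights
    w = [0.0, 0.0]
    b = 0.0
    n = len(data)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for x, y in data:
            err = predict(w, b, x) - y   # dLoss/dz for sigmoid + cross-entropy
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        # Same update rule as in earlier lessons: step against the gradient
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Toy dataset: logical AND, which a single neuron can separate with a line
data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]

initial_loss = mean_loss([0.0, 0.0], 0.0, data)
w, b = train(data)
final_loss = mean_loss(w, b, data)
predictions = [round(predict(w, b, x)) for x, _ in data]

print("loss before training:", initial_loss)
print("loss after training: ", final_loss)
print("predicted labels:", predictions)
```

A network stacks many such neurons in layers, which is why backpropagation is needed to compute the gradient for every weight rather than just one neuron's.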

Ready?

Lesson 1 — The perceptron