Project: predict shading on an image patch
Before we begin
This is your first end-to-end training exercise. You will not use PyTorch yet — only Python, NumPy, and Matplotlib. That is intentional: you will see every step that later frameworks hide behind .backward() and .step().
What problem are we solving?
Many photos contain smooth regions where brightness changes gradually — sky, a painted wall, a cheek highlight, a desk surface. If you pick a small patch from such a region, brightness often increases (or decreases) roughly in a straight line as you move left/right or up/down.
We will fit a simple model to that pattern:
predicted brightness = w0 + w1 × x + w2 × y
- x, y — pixel column and row in the patch
- w0 — baseline brightness (like an intercept)
- w1 — how fast brightness changes horizontally
- w2 — how fast brightness changes vertically
This is linear regression — the simplest form of “learning from data.” The same loop (predict → measure error → compute slopes → update weights) powers far larger models.
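To make the formula concrete, here is a minimal sketch that evaluates it at a few pixel positions. The weight values and the helper name `predict_brightness` are made up for illustration; nothing here is learned from data yet.

```python
def predict_brightness(w0, w1, w2, x, y):
    """Predicted brightness at column x, row y of the patch."""
    return w0 + w1 * x + w2 * y


# Hypothetical weights, chosen only for illustration.
w0, w1, w2 = 80.0, 1.2, 0.8

top_left = predict_brightness(w0, w1, w2, x=0, y=0)        # baseline only
ten_right = predict_brightness(w0, w1, w2, x=10, y=0)      # 10 columns to the right
bottom_right = predict_brightness(w0, w1, w2, x=31, y=31)  # far corner of a 32x32 patch

print(top_left, ten_right, bottom_right)
```

Moving 10 columns to the right adds 10 × w1 to the baseline, which is all "slope" means here.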
Figure: What you are building
What you will build
- A dataset of pixels from a real or synthetic patch.
- A training loop with gradient descent.
- A plot of error vs epoch (should trend downward).
- A comparison with NumPy’s built-in solver (sanity check).
- A three-panel figure: original patch | prediction | residual (error map).
Estimated time: 2–3 hours if Python is new; 1–2 hours if comfortable with NumPy.
How this connects to Module 1
| Lesson | Where you use it in code |
|---|---|
| Vectors & matrices | Each row [1, x, y] is a small vector; X is a matrix |
| Dot products | pred = X @ w — one weighted sum per pixel |
| Probability / noise | Synthetic patch adds Gaussian noise; MSE averages over pixels |
| Gradient descent | w = w - lr * grad loop you implement by hand |
Folder layout:

    phase1-linear-regression/
        your_photo.jpg      # optional
        train_patch.py      # load data + training loop + plots
        outputs/
            loss_curve.png
            three_panel.png

Before you start
- Finish Lessons 1–4 and the quiz.
- Install Python 3.10+, then run: pip install numpy matplotlib pillow
- Optional: a .jpg with a smooth sky or wall, or use synthetic data (no photo required).
Choose a photo patch (recommended path)
Why patch choice matters
The model assumes brightness is approximately linear in x and y. That is true for smooth gradients — not for busy texture (grass, hair, sharp edges). A bad patch does not break Python; it produces large residuals — a useful learning moment.
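The difference between a good and a bad patch can be put into numbers. The sketch below fits the plane by least squares and reports the root-mean-square residual for two synthetic patches. `plane_fit_rms` is a hypothetical helper, and the checkerboard is only a stand-in for busy texture; a real photo will fall somewhere in between.

```python
import numpy as np


def plane_fit_rms(patch):
    """RMS residual after fitting brightness = w0 + w1*x + w2*y by least squares."""
    H, W = patch.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    X = np.column_stack([np.ones(H * W), xs.ravel(), ys.ravel()])
    I = patch.ravel().astype(float)
    w, *_ = np.linalg.lstsq(X, I, rcond=None)
    return float(np.sqrt(np.mean((X @ w - I) ** 2)))


ys, xs = np.meshgrid(np.arange(32), np.arange(32), indexing="ij")
smooth = 80 + 1.2 * xs + 0.8 * ys                   # ideal linear ramp
textured = 128 + 60 * (((xs + ys) % 2) * 2 - 1)     # checkerboard: 128 +/- 60

rms_smooth = plane_fit_rms(smooth)
rms_textured = plane_fit_rms(textured)
print("smooth ramp RMS residual:  ", rms_smooth)
print("checkerboard RMS residual: ", rms_textured)
```

The ramp leaves essentially no residual, while the checkerboard leaves almost all of its contrast unexplained, because no plane can follow a pattern that flips at every pixel.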
Steps
- Load an image (Pillow or similar) and convert to grayscale.
- Crop 32×32 (or 24×24) from a smooth region.
- For each pixel, record:
  - Inputs: [1, x, y] (the leading 1 lets w0 act as the baseline)
  - Target: brightness I at that pixel
You will have 32 × 32 = 1024 rows — 1024 tiny training examples from one patch.
Build the input table in code (sketch)
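
    import numpy as np
    from PIL import Image

    # Load the photo and convert it to grayscale ("L" = 8-bit luminance)
    img = np.array(Image.open("your_photo.jpg").convert("L"), dtype=float)

    # Top-left corner of the crop. Placeholder values: pick a smooth region.
    row0, col0 = 0, 0
    patch = img[row0:row0 + 32, col0:col0 + 32]
    H, W = patch.shape

    rows = []
    targets = []
    for y in range(H):
        for x in range(W):
            rows.append([1.0, x, y])       # the leading 1.0 multiplies w0
            targets.append(patch[y, x])    # brightness at row y, column x

    X = np.array(rows)       # shape (H*W, 3), here (1024, 3)
    I = np.array(targets)    # shape (H*W,), here (1024,)

Synthetic patch (no camera needed)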

    import numpy as np

    H, W = 32, 32
    # ys[r, c] = r (row index), xs[r, c] = c (column index)
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")

    # Ideal ramp: baseline 80, slope 1.2 per column, 0.8 per row
    I_grid = 80 + 1.2 * xs + 0.8 * ys
    I_grid += np.random.normal(0, 2, size=I_grid.shape)    # Gaussian noise, std 2
    I_grid = np.clip(I_grid, 0, 255)                       # stay in the 0-255 range

    rows, targets = [], []
    for y in range(H):
        for x in range(W):
            rows.append([1.0, x, y])
            targets.append(I_grid[y, x])

    X = np.array(rows)
    I = np.array(targets)

The true slopes are 1.2 and 0.8 (and the true baseline is 80), so you can check whether training recovers values near them.
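The nested loops are easy to read, but the same table can be built without them. The sketch below builds X and I both ways on a placeholder patch and checks that the results match. `np.column_stack` and `ravel` are standard NumPy calls; the placeholder patch values are arbitrary.

```python
import numpy as np

H, W = 32, 32
patch = np.arange(H * W, dtype=float).reshape(H, W)  # placeholder brightness values

# Loop version, as in the text
rows, targets = [], []
for y in range(H):
    for x in range(W):
        rows.append([1.0, x, y])
        targets.append(patch[y, x])
X_loop = np.array(rows)
I_loop = np.array(targets)

# Vectorized version: one column of ones, one of x, one of y
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
X_vec = np.column_stack([np.ones(H * W), xs.ravel(), ys.ravel()])
I_vec = patch.ravel()   # row-major order matches the loop order (y outer, x inner)

print(np.array_equal(X_loop, X_vec), np.array_equal(I_loop, I_vec))
```

Either version is fine for a 32×32 patch; the vectorized one mainly matters when patches get larger.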
The model in plain English
Each row [1, x, y] produces one prediction:
predicted = w0×1 + w1×x + w2×y
All 1024 predictions at once: NumPy can multiply table X by weight list w in one line (X @ w).
This is Lesson 2’s dot product repeated — same weights, different x and y each time.
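A small check can confirm that X @ w really is one dot product per row. The four rows and the weights below are hypothetical values picked for illustration.

```python
import numpy as np

# Four example rows [1, x, y]
X = np.array([
    [1.0, 0, 0],
    [1.0, 5, 0],
    [1.0, 0, 5],
    [1.0, 31, 31],
])
w = np.array([80.0, 1.2, 0.8])   # hypothetical weights w0, w1, w2

# One dot product per row, written out as a loop
pred_loop = np.array([np.dot(row, w) for row in X])

# The same thing in one line
pred_matrix = X @ w

print(pred_loop)
print(pred_matrix)
```

Both arrays hold the same four predictions; the matrix form just hands the loop to NumPy.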
The loss: mean squared error
For each pixel, error = predicted − true.
Squared error = error² (big mistakes count more).
Mean squared error (MSE) = average of all squared errors across the patch.
You want MSE to go down as weights improve.
Checkpoint: If you multiply every squared error by 10, do the best weights change?
No. The best weights stay the same, but every slope (gradient) also becomes 10 times larger, so you may need a smaller learning rate.
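The checkpoint can be tried on a toy problem. The sketch below uses a single noiseless row of pixels and a handful of candidate values for w1, with w0 held at its true value; these simplifications are assumptions made to keep the example tiny, and `mse` is a helper defined here.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error between predictions and targets."""
    err = pred - target
    return np.mean(err ** 2)


# One row of 8 pixels, brightness = 80 + 1.2 * x, no noise
x = np.arange(8, dtype=float)
target = 80 + 1.2 * x

# Candidate horizontal slopes (w0 is held at 80 for simplicity)
candidates = np.array([0.8, 1.0, 1.2, 1.4, 1.6])
losses = np.array([mse(80 + w1 * x, target) for w1 in candidates])
scaled_losses = 10 * losses

best = candidates[np.argmin(losses)]
best_scaled = candidates[np.argmin(scaled_losses)]
print("best w1:", best, "| best w1 with 10x loss:", best_scaled)

# Differences between neighbouring losses are also 10x larger,
# which is why the gradient (and the step size) scales with the loss.
print(np.diff(losses))
print(np.diff(scaled_losses))
```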
Gradients: what your code must compute
For each weight, ask: “If I increase this weight slightly, does average error go up or down, and how fast?”
That answer is the slope for that weight. With mean squared error, a compact NumPy form is:
gradient = (2 / N) × (X.transpose() @ errors)
where errors = predictions − targets and N is the number of rows.
Beginner path: implement slopes for 10 rows only in a loop, print them, compare to the formula above, then scale to the full patch.
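One possible version of the beginner path is sketched below. It computes the slopes for 10 rows with an explicit loop, compares them with the compact formula, and also nudges each weight slightly to estimate the slope numerically. The synthetic data, the choice of rows, the starting weights, and the nudge size `eps` are all assumptions of this sketch.

```python
import numpy as np

# Synthetic patch data (same ramp as the synthetic example, fixed seed)
rng = np.random.default_rng(0)
H, W = 32, 32
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
X = np.column_stack([np.ones(H * W), xs.ravel(), ys.ravel()])
I = 80 + 1.2 * X[:, 1] + 0.8 * X[:, 2] + rng.normal(0, 2, size=H * W)

# Take 10 rows spread across the patch so x and y both vary
idx = np.arange(0, 1000, 100)
X_small, I_small = X[idx], I[idx]
N = len(I_small)
w = np.array([50.0, 0.5, 0.5])   # arbitrary starting weights

# 1) Slopes with an explicit loop over rows and weights
grad_loop = np.zeros(3)
for row, target in zip(X_small, I_small):
    pred = row[0] * w[0] + row[1] * w[1] + row[2] * w[2]
    err = pred - target
    for j in range(3):
        grad_loop[j] += 2 * err * row[j] / N

# 2) Compact formula from the text
grad_formula = (2 / N) * (X_small.T @ (X_small @ w - I_small))


# 3) Numerical estimate: nudge each weight up and down, watch the MSE change
def mse(weights):
    return np.mean((X_small @ weights - I_small) ** 2)


eps = 1e-4
grad_fd = np.array([
    (mse(w + eps * e) - mse(w - eps * e)) / (2 * eps) for e in np.eye(3)
])

print("loop:       ", grad_loop)
print("formula:    ", grad_formula)
print("finite diff:", grad_fd)
```

If the three printouts agree, the formula is doing what the question "what happens if I increase this weight slightly?" asks for.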
Training loop (full skeleton)

    import numpy as np
    import matplotlib.pyplot as plt

    # Assumes X (N, 3) and I (N,) were built in one of the earlier steps
    N = X.shape[0]
    w = np.zeros(3)           # start from w0 = w1 = w2 = 0
    learning_rate = 1e-4      # tune if needed
    losses = []

    for epoch in range(800):
        pred = X @ w                    # predict
        err = pred - I                  # measure error
        loss = (err @ err) / N          # MSE for the current weights
        losses.append(loss)
        grad = (2 / N) * (X.T @ err)    # compute slopes
        w = w - learning_rate * grad    # update weights

    plt.plot(losses)
    plt.xlabel("epoch")
    plt.ylabel("mean squared error")
    plt.title("Training error should trend downward")
    plt.show()

    print("Learned weights w0, w1, w2:", w)

If training misbehaves
| Symptom | Try |
|---|---|
| Error explodes or becomes NaN | Divide learning_rate by 10 |
| Error barely moves | Multiply learning_rate by 3 (carefully) |
| Hard to tune | Scale x, y to 0–1 and brightness to 0–1 |
Figure: What a healthy error curve looks like
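The "scale x, y to 0–1 and brightness to 0–1" row of the table can be sketched as follows. The learning rate of 0.3 and the conversion back to pixel units are assumptions of this sketch that happen to suit this synthetic patch; they are not prescribed values.

```python
import numpy as np

# Synthetic patch (fixed seed so runs are repeatable)
rng = np.random.default_rng(0)
H, W = 32, 32
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
I_grid = 80 + 1.2 * xs + 0.8 * ys + rng.normal(0, 2, size=(H, W))

# Scale inputs and targets to the 0-1 range
x_s = xs.ravel() / (W - 1)
y_s = ys.ravel() / (H - 1)
I_s = I_grid.ravel() / 255.0
X_s = np.column_stack([np.ones(H * W), x_s, y_s])

# Same gradient descent loop as in the skeleton, on the scaled data
N = X_s.shape[0]
w_s = np.zeros(3)
learning_rate = 0.3   # assumption: much larger steps are stable after scaling
for epoch in range(800):
    err = X_s @ w_s - I_s
    w_s = w_s - learning_rate * (2 / N) * (X_s.T @ err)

# Convert the scaled weights back to "brightness per pixel" units
w_pixels = np.array([
    255 * w_s[0],
    255 * w_s[1] / (W - 1),
    255 * w_s[2] / (H - 1),
])

# Reference: exact least squares on the unscaled table
X_raw = np.column_stack([np.ones(H * W), xs.ravel(), ys.ravel()])
w_exact, *_ = np.linalg.lstsq(X_raw, I_grid.ravel(), rcond=None)

print("Gradient descent (scaled, converted back):", w_pixels)
print("NumPy exact:                              ", w_exact)
```

Scaling does not change what the best fit is. It only puts the three columns of X on similar ranges, so one learning rate suits all three weights.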
Sanity check with NumPy
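
    w_exact, _, _, _ = np.linalg.lstsq(X, I, rcond=None)
    print("Gradient descent:", w)
    print("NumPy exact:     ", w_exact)

They should be close. A large gap usually means too few epochs, a poorly chosen learning rate, or a bug in grad. With raw pixel coordinates, w0 is usually the weight that lags the most; more epochs or the 0–1 scaling from the table above helps.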
Visualize and interpret
- Original patch: the grayscale image you cropped.
- Predicted patch: reshape X @ w back to 32×32.
- Residual: original minus predicted (shows where the linear model fails).
Write 3–5 sentences: What region did you use? Did the ramp fit well? Where are residuals large, and why (texture, shadow, edge)?
A perfect residual map is not the goal — understanding when the model fits and when it breaks is.
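One way to produce the three-panel figure is sketched below on a synthetic patch. The `lstsq` weights are a stand-in for the weights from your training loop. The colormaps, the figure size, and the Agg backend (which writes to a file without opening a window) are choices of this sketch.

```python
import os

import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: render straight to a file, no display needed
import matplotlib.pyplot as plt

# Synthetic patch and its input table
rng = np.random.default_rng(0)
H, W = 32, 32
ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
patch = 80 + 1.2 * xs + 0.8 * ys + rng.normal(0, 2, size=(H, W))
X = np.column_stack([np.ones(H * W), xs.ravel(), ys.ravel()])

# Stand-in for trained weights; replace with w from your own loop
w, *_ = np.linalg.lstsq(X, patch.ravel(), rcond=None)

predicted = (X @ w).reshape(H, W)   # back to image shape
residual = patch - predicted        # original minus predicted

panels = [
    (patch, "Original patch", "gray"),
    (predicted, "Prediction", "gray"),
    (residual, "Residual", "coolwarm"),
]
fig, axes = plt.subplots(1, 3, figsize=(10, 3.5))
for ax, (data, title, cmap) in zip(axes, panels):
    im = ax.imshow(data, cmap=cmap)
    ax.set_title(title)
    ax.axis("off")
    fig.colorbar(im, ax=ax, fraction=0.046)
fig.tight_layout()

os.makedirs("outputs", exist_ok=True)
fig.savefig("outputs/three_panel.png", dpi=150)
```

On a real photo, structure in the residual panel (edges, texture, a shadow boundary) is the part worth describing in your written reflection.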
Deliverables checklist
- Training loop that updates three weights
- Error vs epoch plot (downward trend)
- Comparison with lstsq
- Three-panel visualization
- Short written reflection
What you accomplished
| Idea from Module 1 | Where you used it |
|---|---|
| Lists / tables | Rows [1,x,y], targets I |
| Dot products | Each prediction = weights dot one row |
| Noise / averaging | MSE averages over all pixels |
| Gradient descent | You implemented the update loop |
Module 1 complete. Return to the curriculum. Module 2 will add classification metrics and more projects; a later PyTorch module will automate slopes while you keep control of learning rate and data.