Color spaces & preprocessing
Before we begin
Module 1 Lessons 1–2 treated images as grayscale grids and gradients. Real pipelines use color, and almost every ML model expects preprocessed tensors — not raw JPEG bytes.
This lesson covers the RGB, HSV, and Lab color spaces, white balance, histogram equalization, and normalization, plus common mistakes that break production models.
Learning objectives
By the end you should be able to:
- Explain when to use RGB vs HSV vs Lab.
- Describe white balance and why illuminants tint images.
- Apply histogram equalization and know when it helps or hurts.
- Normalize images for CNN training and match stats at inference.
RGB — device-dependent storage
RGB stores red, green, blue per pixel. Values are often 0–255 (uint8) or 0.0–1.0 (float).
Important: RGB in a JPEG is usually gamma-encoded (sRGB), not linear light. Algorithms that assume linear intensity (HDR merge, physically based shading) must linearize first.
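As a minimal sketch of the linearization step, assuming the image is sRGB-encoded and already scaled to 0.0–1.0, the standard sRGB transfer function can be applied with NumPy. The function names `srgb_to_linear` and `linear_to_srgb` are illustrative, not taken from a specific library.

```python
import numpy as np

def srgb_to_linear(c):
    """Decode sRGB-encoded values in [0, 1] to linear light."""
    c = np.asarray(c, dtype=np.float64)
    # Piecewise curve: a short linear segment near black, a power law elsewhere.
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(lin):
    """Encode linear-light values in [0, 1] back to sRGB."""
    lin = np.asarray(lin, dtype=np.float64)
    return np.where(lin <= 0.0031308, lin * 12.92,
                    1.055 * lin ** (1.0 / 2.4) - 0.055)

# Example: three 8-bit gray levels, scaled to [0, 1] before decoding.
encoded = np.array([0, 128, 255], dtype=np.uint8) / 255.0
linear = srgb_to_linear(encoded)
```

Here the 8-bit value 128 decodes to about 0.216 in linear light, which is why averaging or blending encoded values directly gives darker results than blending in linear space.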
HSV — hue separated from shading
HSV (hue, saturation, value) separates:
- Hue — which color (angle on color wheel)
- Saturation — how vivid vs gray
- Value — brightness
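Useful for color-based segmentation (track a red object) because hue stays fairly stable when brightness changes. It is less reliable when the illuminant's color shifts, and hue becomes noisy at low saturation or low value.
OpenCV: cv2.cvtColor(img, cv2.COLOR_BGR2HSV). Note that OpenCV loads images in BGR order by default, and for 8-bit images it stores hue as 0–179 (degrees halved), so red wraps around both ends of the range.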
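To illustrate why hue helps with segmentation, here is a small sketch using Python's standard `colorsys` module rather than OpenCV, so hue is in degrees (0–360) instead of OpenCV's 8-bit range. The helper names and the thresholds (`hue_tol`, `min_sat`, `min_val`) are hypothetical choices for the example, not recommended values.

```python
import colorsys

def hue_sat_val(r, g, b):
    """Convert 8-bit RGB to (hue in degrees, saturation, value)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0, s, v

def is_red(r, g, b, hue_tol=15.0, min_sat=0.5, min_val=0.2):
    """Rough red test; thresholds are illustrative assumptions."""
    h, s, v = hue_sat_val(r, g, b)
    # Red sits at 0 degrees, so measure distance around the wrap-around point.
    hue_dist = min(h, 360.0 - h)
    return hue_dist <= hue_tol and s >= min_sat and v >= min_val

# The same red surface at full and at half brightness.
bright = hue_sat_val(200, 30, 30)
dim = hue_sat_val(100, 15, 15)
```

Halving the brightness changes only the value component here, while hue and saturation stay the same, so a hue-based test still matches the dimmer pixel.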
Lab — perceptually uniform color
CIE Lab (L*, a*, b*) is designed so Euclidean distance ≈ perceived color difference.
Use Lab when measuring color error or doing color correction — better than RGB distance.
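A minimal sketch of measuring color difference in Lab, assuming 8-bit sRGB input and a D65 white point. It uses the simple CIE76 formula (plain Euclidean distance in Lab); newer formulas such as CIEDE2000 exist and are not shown. Function names are illustrative.

```python
import math

D65_WHITE = (0.95047, 1.0, 1.08883)  # reference white (X, Y, Z), assumed D65

def _srgb_to_linear(c):
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def _f(t):
    # Cube root with a linear segment near zero, as in the CIE Lab definition.
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3 * delta ** 2) + 4.0 / 29.0

def srgb8_to_lab(rgb):
    """Convert an 8-bit sRGB triple to (L*, a*, b*)."""
    r, g, b = (_srgb_to_linear(v / 255.0) for v in rgb)
    # Linear sRGB to CIE XYZ (D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), D65_WHITE))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))

def delta_e76(rgb1, rgb2):
    """CIE76 color difference between two 8-bit sRGB colors."""
    lab1, lab2 = srgb8_to_lab(rgb1), srgb8_to_lab(rgb2)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))
```

For example, `delta_e76((255, 0, 0), (0, 255, 0))` gives the Lab distance between pure red and pure green, and any neutral gray maps to a* and b* close to zero.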
White balance
Different lights (tungsten, daylight, fluorescent) shift RGB ratios. White balance scales channels so neutral gray surfaces map near equal R=G=B.
Auto white balance (AWB) in cameras estimates the illuminant and can fail under mixed lighting (for example, window daylight plus tungsten).
Checkpoint: Why does a photo look orange indoors without correction?
Answer sketch: Low color temperature (warm) light has more red; sensor records high R relative to B — AWB should boost B and cut R.
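One simple way to sketch channel scaling is the gray-world heuristic, which assumes the scene averages out to neutral gray. This is only one of several AWB strategies and is not necessarily what any given camera uses; the function name `gray_world` is illustrative, and the input is assumed to be an H×W×3 uint8 array.

```python
import numpy as np

def gray_world(img):
    """Scale each channel so all channel means match (gray-world assumption).

    Returns the balanced uint8 image and the per-channel gains.
    """
    x = img.astype(np.float64)
    channel_means = x.reshape(-1, 3).mean(axis=0)
    # Guard against division by zero for an all-black channel.
    channel_means = np.maximum(channel_means, 1e-6)
    gains = channel_means.mean() / channel_means
    balanced = np.clip(np.round(x * gains), 0, 255).astype(np.uint8)
    return balanced, gains

# Synthetic example: a gray ramp with a warm cast (strong R, weak B).
ramp = np.linspace(50, 150, 64).reshape(8, 8)
cast = np.array([1.2, 1.0, 0.7])  # order R, G, B
tinted = np.round(ramp[..., None] * cast).astype(np.uint8)
balanced, gains = gray_world(tinted)
```

On this warm-tinted ramp the gain for R comes out below 1 and the gain for B above 1, matching the checkpoint answer. The heuristic breaks down when the scene is dominated by one color, such as a close-up of grass.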
Histogram equalization
Goal: Spread intensity values to use more of the 0–255 range — boosts contrast on flat images.
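Apply on grayscale or per channel. Be careful with RGB: equalizing each channel independently can shift colors, so a common alternative is to equalize only a lightness channel (such as V in HSV or L in Lab). CLAHE (contrast-limited adaptive histogram equalization) works on local tiles and clips the histogram to limit amplification, which reduces noise blow-up.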
Not a substitute for good exposure — can look unnatural on already well-exposed photos.
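A minimal sketch of global histogram equalization for a grayscale uint8 image, written with NumPy so the mechanics are visible: build the histogram, take its cumulative distribution, and use that as a lookup table. The function name `equalize_hist` is illustrative, and the image is assumed to be non-empty.

```python
import numpy as np

def equalize_hist(gray):
    """Global histogram equalization for a 2-D uint8 image."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # CDF at the darkest level actually present
    total = cdf[-1]
    if total == cdf_min:
        # Constant image: nothing to spread out.
        return gray.copy()
    # Map each intensity through the normalized CDF, rescaled to 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]

# Flat, low-contrast input: all values squeezed into 100..120.
flat = np.tile(np.arange(100, 121, dtype=np.uint8), (4, 1))
stretched = equalize_hist(flat)
```

After equalization the narrow 100–120 band spans the full 0–255 range. In OpenCV the corresponding calls are `cv2.equalizeHist` for the global version and `cv2.createCLAHE` for the tiled, clipped version.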
Normalization for machine learning
Common pattern for ImageNet-pretrained models:
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
# x = (x/255.0 - mean) / std per channel
Train and serve must use the same resize, channel order (RGB vs BGR), mean, and std.
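The per-channel formula above can be sketched in NumPy as follows. This assumes an H×W×3 uint8 image already in RGB order and already resized; the final transpose to channels-first layout is an assumption that matches PyTorch-style models and should be dropped for channels-last frameworks. The function name `normalize_imagenet` is illustrative.

```python
import numpy as np

IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

def normalize_imagenet(img_rgb_uint8):
    """Scale to [0, 1], standardize per channel, return a CHW float32 array."""
    x = img_rgb_uint8.astype(np.float32) / 255.0
    # Broadcasting applies mean/std along the last (channel) axis.
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    return np.transpose(x, (2, 0, 1))  # HWC -> CHW (assumed layout)

# Example: a 4x5 all-white image.
white = np.full((4, 5, 3), 255, dtype=np.uint8)
tensor = normalize_imagenet(white)
```

Calling the same function from both the training data loader and the serving code is one straightforward way to keep the statistics and channel order in sync.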
Common preprocessing pitfalls
| Mistake | Symptom |
|---|---|
| BGR vs RGB swap | Model sees wrong colors; accuracy drops with no error raised |
| Different resize at serve | Scale mismatch — boxes/masks drift |
| Skip normalization | Saturated activations, poor convergence |
| Random augmentation left on at inference | Non-deterministic outputs |
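One way to catch several of these mistakes early is to store the preprocessing settings next to the model and compare them when the serving code starts. The sketch below is a hypothetical pattern, not a feature of any particular framework; `PreprocessConfig` and `config_mismatches` are names invented for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple

@dataclass(frozen=True)
class PreprocessConfig:
    """Hypothetical record of the settings that must match at train and serve."""
    size: Tuple[int, int]
    channel_order: str  # "RGB" or "BGR"
    mean: Tuple[float, float, float]
    std: Tuple[float, float, float]

def config_mismatches(train_cfg: PreprocessConfig,
                      serve_cfg: PreprocessConfig) -> List[str]:
    """Return the names of fields that differ between the two configs."""
    return [f.name for f in fields(PreprocessConfig)
            if getattr(train_cfg, f.name) != getattr(serve_cfg, f.name)]

train_cfg = PreprocessConfig((224, 224), "RGB",
                             (0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
# Serving code that forgot OpenCV loads images as BGR.
serve_cfg = PreprocessConfig((224, 224), "BGR",
                             (0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
problems = config_mismatches(train_cfg, serve_cfg)
```

A serving process could refuse to start when the returned list is non-empty, which turns a silent accuracy drop into a visible failure.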
What's next
Module 1 quiz — then the imaging pipeline project.