Color spaces & preprocessing

Before we begin

Module 1 Lessons 1–2 treated images as grayscale grids and gradients. Real pipelines use color, and almost every ML model expects preprocessed tensors — not raw JPEG bytes.

This lesson covers three color spaces (RGB, HSV, and Lab), white balance, histogram equalization, and normalization, plus the preprocessing mistakes that break production models.

Learning objectives

By the end you should be able to:

Explain when to use RGB vs HSV vs Lab.
Describe white balance and why illuminants tint images.
Apply histogram equalization and know when it helps or hurts.
Normalize images for CNN training and match stats at inference.

RGB — device-dependent storage

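RGB stores red, green, and blue intensities per pixel. Values are usually 0–255 (uint8) or 0.0–1.0 (float). The numbers are device-dependent: the same triplet can correspond to different physical colors on different sensors and displays unless a standard such as sRGB pins down its meaning.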

Important: RGB in a JPEG is usually gamma-encoded (sRGB), not linear light. Algorithms that assume linear intensity (HDR merge, physically based shading) must linearize first.
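As a minimal sketch of what "linearize first" means, the functions below implement the standard sRGB transfer curve for a single channel value in [0, 1]. The function names are illustrative, and plain Python is used only to keep the example dependency-free; a real pipeline would vectorize this over whole arrays.

```python
def srgb_to_linear(c: float) -> float:
    """Decode one sRGB-encoded channel value in [0, 1] to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c: float) -> float:
    """Encode one linear-light channel value in [0, 1] back to sRGB."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * (c ** (1 / 2.4)) - 0.055


# Mixing black and white 50/50: averaging the encoded values gives 0.5,
# but averaging the underlying light and re-encoding gives about 0.735.
dark, bright = 0.0, 1.0
naive_mid = (dark + bright) / 2
linear_mid = linear_to_srgb((srgb_to_linear(dark) + srgb_to_linear(bright)) / 2)
print(naive_mid, round(linear_mid, 3))
```

The gap between the two midpoints is why blending, blurring, or merging exposures directly on gamma-encoded values tends to come out too dark.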

HSV — hue separated from shading

HSV (hue, saturation, value) separates:

Hue — which color (angle on color wheel)
Saturation — how vivid vs gray
Value — brightness

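Useful for color-based segmentation (for example, tracking a red object) because hue stays roughly stable when brightness changes, although it still shifts when the color of the light changes. Hue becomes unreliable at low saturation or low value, where small amounts of noise can swing it widely.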

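OpenCV: cv2.cvtColor(img, cv2.COLOR_BGR2HSV). Note that OpenCV loads images in BGR order by default, and for 8-bit images it stores hue as 0–179 (degrees divided by 2), so red sits at both ends of the range.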
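To illustrate why hue helps with segmentation, here is a small sketch using Python's standard-library colorsys module rather than OpenCV, so hue is expressed in degrees (0 to 360) instead of OpenCV's 0 to 179. The is_red helper and its thresholds (tol_deg, min_sat, min_val) are hypothetical values chosen for illustration, not recommended settings.

```python
import colorsys


def hue_degrees(r: float, g: float, b: float) -> float:
    """Hue of an RGB color (channels in [0, 1]) as an angle in degrees."""
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360.0


def is_red(r, g, b, tol_deg=15.0, min_sat=0.4, min_val=0.2) -> bool:
    """Rough red test; thresholds are illustrative assumptions."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    hue = h * 360.0
    # Red wraps around the hue circle, so accept both ends of the range.
    near_red = hue <= tol_deg or hue >= 360.0 - tol_deg
    # Gray or very dark pixels have meaningless hue, so gate on S and V.
    return near_red and s >= min_sat and v >= min_val


lit = (0.8, 0.15, 0.1)                  # a red object in full light
shaded = tuple(0.5 * c for c in lit)    # the same object at half brightness

print(hue_degrees(*lit), hue_degrees(*shaded))  # same hue, different value
print(is_red(*lit), is_red(*shaded), is_red(0.5, 0.5, 0.5))
```

Halving the brightness changes every RGB value but leaves hue and saturation untouched, which is the property a hue threshold relies on. The saturation gate keeps gray pixels, whose hue is reported as 0, from being counted as red.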

Lab — perceptually uniform color

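CIE Lab separates lightness (L*) from two color-opponent axes: a* (green to red) and b* (blue to yellow). It is designed so that Euclidean distance ≈ perceived color difference.

Use Lab when measuring color error or doing color correction: distances in Lab track perceived difference much better than distances in RGB.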
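The sketch below shows one way to measure a color difference in Lab. It assumes 8-bit sRGB input and a D65 white point, and uses the simple CIE76 formula (plain Euclidean distance in Lab); newer formulas such as CIEDE2000 exist but are not covered here. Function names are illustrative.

```python
import math


def _srgb_to_linear(c: float) -> float:
    """Undo sRGB gamma encoding for one channel value in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t: float) -> float:
    """Nonlinearity used by the XYZ -> Lab conversion."""
    delta = 6 / 29
    return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29


def rgb8_to_lab(rgb):
    """Convert an 8-bit sRGB triplet to CIE Lab (D65 white point assumed)."""
    r, g, b = (_srgb_to_linear(v / 255.0) for v in rgb)
    # Linear sRGB -> CIE XYZ
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    # Normalize by the D65 reference white, then apply the Lab formulas.
    fx, fy, fz = _f(x / 0.95047), _f(y / 1.0), _f(z / 1.08883)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def delta_e76(rgb_a, rgb_b) -> float:
    """CIE76 color difference: Euclidean distance between two Lab colors."""
    return math.dist(rgb8_to_lab(rgb_a), rgb8_to_lab(rgb_b))


print(rgb8_to_lab((255, 255, 255)))            # close to (100, 0, 0)
print(delta_e76((255, 0, 0), (250, 0, 0)))     # small: nearly the same red
print(delta_e76((255, 0, 0), (0, 255, 0)))     # large: red vs green
```

Note that the conversion linearizes the sRGB values first; skipping that step is a common source of wrong Lab numbers.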

White balance

Different light sources (tungsten, daylight, fluorescent) shift the RGB ratios a sensor records. White balance scales the channels so that neutral gray surfaces map to roughly equal R, G, and B values.

Auto white balance (AWB) in cameras estimates the illuminant and can fail under mixed lighting (for example, window daylight plus tungsten bulbs).

Checkpoint: Why does a photo look orange indoors without correction?

Answer sketch: Indoor tungsten light has a low color temperature (warm) and emits more red than blue, so the sensor records high R relative to B. White balance should boost B and cut R.
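One simple way to estimate those channel gains is the gray-world heuristic, which assumes the average color of the scene should be neutral. This is a swapped-in illustrative technique, not a description of what any particular camera's AWB does, and the function names are hypothetical. The sketch works on a flat list of (R, G, B) tuples and, for simplicity, ignores gamma; ideally the gains would be applied to linear values.

```python
def gray_world_gains(pixels):
    """Per-channel gains that make the average color of `pixels` neutral."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    gray = sum(means) / 3  # target level shared by all three channels
    return tuple(gray / m if m > 0 else 1.0 for m in means)


def apply_gains(pixels, gains, max_value=255):
    """Scale each channel by its gain, rounding and clipping to max_value."""
    return [
        tuple(min(max_value, round(p[c] * gains[c])) for c in range(3))
        for p in pixels
    ]


# Gray surfaces photographed under warm light: too much R, too little B.
warm = [(200, 150, 100), (100, 75, 50), (160, 120, 80)]
gains = gray_world_gains(warm)
balanced = apply_gains(warm, gains)
print(gains)      # R gain below 1, B gain above 1
print(balanced)   # each pixel now has equal R, G, B
```

Gray-world breaks down when the scene really is dominated by one color (a close-up of grass, say), which is one reason automatic white balance can guess wrong.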

Histogram equalization

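Goal: Spread intensity values across more of the 0–255 range, which boosts contrast in low-contrast images.

Apply it to grayscale images. Equalizing R, G, and B independently changes the channel ratios and can shift colors; for color images, a common alternative is to equalize only a lightness channel (L* in Lab or V in HSV) and convert back. CLAHE (contrast-limited adaptive histogram equalization) works on local tiles and clips each tile's histogram to limit amplification, which reduces noise blow-up.

Equalization is not a substitute for good exposure, and it can look unnatural on photos that are already well exposed.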
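The mechanics of global histogram equalization can be sketched in a few lines. This version assumes a flat list of integer gray values and builds a lookup table from the cumulative histogram; it is plain (non-adaptive) equalization, not CLAHE, and the function name is illustrative.

```python
def equalize_histogram(pixels, levels=256):
    """Histogram-equalize a flat list of integer gray values in [0, levels)."""
    if not pixels:
        return []

    hist = [0] * levels
    for v in pixels:
        hist[v] += 1

    # Cumulative distribution: cdf[v] = number of pixels with value <= v.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest value present
    if total == cdf_min:
        return list(pixels)  # constant image: nothing to spread

    # Map the darkest value to 0 and the brightest to levels - 1.
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [lut[v] for v in pixels]


flat = [100, 101, 102, 103]          # low-contrast input
print(equalize_histogram(flat))      # [0, 85, 170, 255]
```

Four nearly identical gray levels get stretched over the full range, which shows both the benefit (more contrast) and the risk (tiny differences, including noise, are heavily amplified).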

Normalization for machine learning

Common pattern for ImageNet-pretrained models:

python

# ImageNet channel statistics, in RGB order, for pixel values scaled to [0, 1]
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
# Per channel c: x[c] = (x[c] / 255.0 - mean[c]) / std[c]

Train and serve must use the same resize, channel order (RGB vs BGR), mean, and std.
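One way to keep training and serving in sync is to put the preprocessing settings in a single shared object and have both code paths call the same function. The sketch below is an assumption about how that could look: PreprocessConfig and normalize_pixel are hypothetical names, it handles one pixel at a time in plain Python for clarity, and it leaves resizing out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreprocessConfig:
    """Settings shared by training and serving (hypothetical structure)."""
    channel_order: str = "RGB"
    mean: tuple = (0.485, 0.456, 0.406)
    std: tuple = (0.229, 0.224, 0.225)


def normalize_pixel(pixel, source_order, cfg):
    """Reorder a uint8 pixel to cfg.channel_order, scale to [0, 1], standardize."""
    if sorted(source_order) != sorted(cfg.channel_order):
        raise ValueError(f"unexpected channel order: {source_order!r}")
    reordered = [pixel[source_order.index(ch)] for ch in cfg.channel_order]
    return tuple(
        (v / 255.0 - m) / s for v, m, s in zip(reordered, cfg.mean, cfg.std)
    )


CFG = PreprocessConfig()

# The same orange pixel, once as RGB (training) and once as BGR (e.g. OpenCV).
train_x = normalize_pixel((255, 128, 0), "RGB", CFG)
serve_x = normalize_pixel((0, 128, 255), "BGR", CFG)
print(train_x == serve_x)  # True: declaring the source order removes the swap

# Forgetting to declare BGR silently feeds the model a blue pixel instead.
wrong_x = normalize_pixel((0, 128, 255), "RGB", CFG)
print(wrong_x == train_x)  # False
```

Making the source channel order an explicit argument forces every caller to state it, so a BGR image cannot slip through as RGB without someone writing the wrong label.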

Common preprocessing pitfalls

Mistake	Symptom
BGR vs RGB swap	Model sees wrong colors; accuracy drops, often with no error raised
Different resize at serve	Scale mismatch; boxes and masks drift
Skipped normalization	Inputs outside the expected range; saturated activations, poor convergence
Random augmentation at inference	Non-deterministic outputs

What's next

Module 1 quiz — then the imaging pipeline project.