Computer Vision Foundations
A complete, standalone path from light and sensors through geometry, classical features, CNNs, detection, segmentation, video, and production deployment — with worked examples, quizzes, and portfolio-ready projects at every stage.
What you'll learn
- Imaging to deployment
- Hands-on projects
- Self-paced modules
Lessons in this path
Work top to bottom within each module, or jump in from the table of contents on each lesson page.
Module 1. Imaging & digital images
How cameras turn light into arrays of numbers — radiance, noise, convolution, edges, and color spaces — ending with a hands-on imaging pipeline lab.
- Lesson 1 · 30 min
Welcome — start here
Key CV vocabulary (pixel, radiance, convolution, ISP), how to read lessons, what Module 1 covers, and what to install before the project.
- Lesson 2 · 75 min
Light, sensors, and the imaging pipeline
Radiance vs irradiance, exposure and SNR, Bayer demosaicing, gamma/linear light, noise models, and the full ISP from RAW to JPEG.
- Lesson 3 · 80 min
Pixels, convolution, and edges
Discrete convolution with numeric examples, Gaussian scale-space, Sobel gradients, the full Canny pipeline, and separable filters.
- Lesson 4 · 70 min
Color spaces & preprocessing
RGB vs HSV vs Lab, white balance, histogram equalization, normalization for ML, and common preprocessing pitfalls.
- Lesson 5 · 45 min
Module 1 quiz & review
20 interactive multiple-choice questions with instant feedback, explanations, and lesson links for topics you miss.
- Lesson 6 · 150 min
Project: build an imaging pipeline lab
Load RAW-like data, demosaic, apply gamma, add synthetic noise, visualize SNR vs ISO, and compare edge detectors in Python.
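The "apply gamma" step in the project can be sketched in a few lines. This is a minimal illustration that assumes the standard sRGB transfer curve; the lab itself may use a plain power law or a different curve, and the function names here are hypothetical.

```python
def linear_to_srgb(c: float) -> float:
    """Encode a linear-light value in [0, 1] with the sRGB transfer curve."""
    c = min(max(c, 0.0), 1.0)  # clamp to the valid range
    if c <= 0.0031308:
        return 12.92 * c  # linear toe near black
    return 1.055 * c ** (1.0 / 2.4) - 0.055


def srgb_to_linear(v: float) -> float:
    """Decode an sRGB-encoded value in [0, 1] back to linear light."""
    v = min(max(v, 0.0), 1.0)
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


if __name__ == "__main__":
    # Round-trip a few linear values (assumed sample points, not course data).
    for linear in (0.0, 0.002, 0.18, 0.5, 1.0):
        encoded = linear_to_srgb(linear)
        decoded = srgb_to_linear(encoded)
        print(f"linear {linear:.3f} -> encoded {encoded:.3f} -> linear {decoded:.3f}")
```

Filtering and noise simulation are usually done in linear light, with the gamma encoding applied only at the end, which is why both directions are shown.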
Module 2. Geometry & correspondence
Pinhole cameras, projection, features, matching, stereo depth, and robust estimation — ending with a panorama stitching project.
- Lesson 7 · 25 min
Welcome to Module 2
How Module 2 builds on Module 1, coordinate frames in vision, what you will learn, and prerequisites for the stitching project.
- Lesson 8 · 85 min
Camera models and projection
Homogeneous projection, K and [R|t], worked pixel examples, Brown–Conrady distortion, and Zhang calibration with reprojection error.
- Lesson 9 · 90 min
Features, matching, and robust estimation
Harris corners, SIFT/ORB, Lowe's ratio test, homography vs essential matrix, RANSAC iteration math, and epipolar geometry.
- Lesson 10 · 75 min
Stereo depth & triangulation
Disparity maps, rectified stereo, triangulation geometry, depth from disparity, and common failure modes in real scenes.
- Lesson 11 · 45 min
Module 2 quiz & review
20 interactive multiple-choice questions covering projection, features, RANSAC, and stereo with lesson review links.
- Lesson 12 · 180 min
Project: panorama stitching with OpenCV
Detect ORB features, match with the ratio test, estimate a homography with RANSAC, then warp and blend two photos into a panorama.
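The RANSAC iteration math listed in this module reduces to one formula: to draw at least one all-inlier sample with probability p, given inlier ratio w and sample size s, you need N = log(1 - p) / log(1 - w^s) iterations. The sketch below is illustrative only; the function name and default confidence are assumptions, not part of the course material.

```python
import math


def ransac_iterations(inlier_ratio: float, sample_size: int,
                      confidence: float = 0.99) -> int:
    """Iterations needed to draw at least one all-inlier sample
    with the requested confidence."""
    if not 0.0 < inlier_ratio <= 1.0:
        raise ValueError("inlier_ratio must be in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be in (0, 1)")
    p_clean = inlier_ratio ** sample_size  # chance that one sample is all inliers
    if p_clean >= 1.0:
        return 1  # every sample is clean
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p_clean))


if __name__ == "__main__":
    # A homography needs 4 correspondences per sample.
    print(ransac_iterations(inlier_ratio=0.5, sample_size=4))  # -> 72
```

The count grows quickly as the inlier ratio drops, which is one reason to filter matches with the ratio test before running RANSAC.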
Module 3. Deep learning for vision
Convolutional networks, data augmentation, and transfer learning — ending with a fine-tuned image classifier project.
- Lesson 13 · 25 min
Welcome to Module 3
When classical CV ends and learning begins, PyTorch setup, GPU vs CPU, and how Module 3 connects to Modules 1–2.
- Lesson 14 · 90 min
Convolutional networks for images
Conv mechanics, receptive-field recurrence, batch norm and ResNet skips, backprop through conv, cross-entropy training, and augmentation.
- Lesson 15 · 70 min
Data augmentation & training tricks
Random crop, flip, color jitter, mixup preview, learning rate schedules, weight decay, and monitoring train vs val curves.
- Lesson 16 · 75 min
Transfer learning & fine-tuning
ImageNet pretraining, freezing vs unfreezing layers, learning rate schedules, domain shift, and when fine-tuning beats training from scratch.
- Lesson 17 · 45 min
Module 3 quiz & review
20 interactive multiple-choice questions on CNNs, augmentation, and transfer learning with review links.
- Lesson 18 · 240 min
Project: fine-tuned image classifier
Fine-tune a ResNet on a custom dataset with PyTorch, track train/val accuracy, visualize misclassifications, and export to ONNX.
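The receptive-field recurrence named in this module can be written as a short loop: each layer grows the receptive field by (kernel - 1) times the current stride product. The sketch below assumes undilated layers and ignores padding; the function name and layer encoding are hypothetical.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of a stack of conv/pool layers.

    `layers` is a sequence of (kernel_size, stride) pairs ordered from
    input to output. Dilation is assumed to be 1.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # distance in input pixels between adjacent outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


if __name__ == "__main__":
    print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # three 3x3 convs -> 7
    print(receptive_field([(7, 2), (3, 2)]))          # 7x7/2 conv + 3x3/2 pool -> 11
```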
Module 4. Object detection
A deep dive from classification to bounding boxes — architectures (Faster R-CNN, YOLO, DETR), training losses, IoU/NMS/mAP, deployment, and a full detector fine-tuning project with evaluation.
- Lesson 19 · 35 min
Welcome to Module 4
Why detection is harder than classification, COCO vs YOLO labels, module roadmap, and what to install before the project.
- Lesson 20 · 95 min
From classification to detection
Variable-size outputs, box formats, anchors, assignment, set matching, and annotation formats with worked numeric examples.
- Lesson 21 · 100 min
Detector architectures — two-stage, one-stage & FPN
Faster R-CNN stage-by-stage, YOLO grid intuition, RetinaNet focal loss, DETR matching, and how to pick a family for your product.
- Lesson 22 · 90 min
Training detectors — losses, labels & data pipelines
Classification + box regression losses, RPN and RoI training, hard negative mining, COCO/YOLO dataset loading, and common training bugs.
- Lesson 23 · 95 min
IoU, NMS, mAP & evaluation
Worked IoU examples, NMS step-by-step, precision-recall and AP, COCO mAP@0.5:0.95, threshold tuning, and qualitative failure analysis.
- Lesson 24 · 80 min
On-device detection & deployment
Latency vs resolution budgets, batching, INT8 quantization for detectors, ONNX/TFLite export, and profiling on CPU/GPU/mobile.
- Lesson 25 · 55 min
Module 4 quiz & review
25 interactive multiple-choice questions on detection theory, training, metrics, and deployment with lesson review links.
- Lesson 26 · 360 min
Project: train and evaluate an object detector
Fine-tune Faster R-CNN on Penn-Fudan, plot PR curves, compute mAP, tune thresholds, visualize failures, and export to ONNX.
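IoU and greedy NMS, both central to this module, fit in a few lines of plain Python. The sketch assumes boxes in (x1, y1, x2, y2) corner format and a single class; the function names and the 0.5 threshold are illustrative assumptions, and a real project would normally use a library implementation.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already-kept box above the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # -> [0, 2]; box 1 overlaps box 0 with IoU 81/119
```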
Module 5. Segmentation & instance masks
Semantic, instance, and panoptic segmentation — U-Net, Mask R-CNN, losses and metrics — ending with a pet segmentation project.
- Lesson 27 · 25 min
Welcome to Module 5
Dense prediction vs detection, encoder–decoder intuition, what Module 5 covers, and dataset prep for segmentation.
- Lesson 28 · 80 min
Semantic segmentation & U-Net
Per-pixel classification, encoder–decoder architectures, skip connections, U-Net blocks, and when segmentation beats bounding boxes.
- Lesson 29 · 85 min
Instance segmentation & Mask R-CNN
Object masks vs semantic labels, Mask R-CNN heads, RoIAlign, panoptic segmentation overview, and COCO-style evaluation.
- Lesson 30 · 70 min
Segmentation losses & metrics
Cross-entropy vs Dice vs focal loss, IoU and mIoU, boundary-aware losses, class imbalance, and qualitative vs quantitative eval.
- Lesson 31 · 45 min
Module 5 quiz & review
20 interactive multiple-choice questions on U-Net, Mask R-CNN, losses, and mIoU with lesson review links.
- Lesson 32 · 300 min
Project: U-Net pet segmentation
Train a U-Net on Oxford-IIIT Pet masks with PyTorch, visualize predictions, compute mIoU, and compare with a pretrained DeepLab baseline.
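The mIoU and Dice metrics used in this module can be illustrated on flattened label lists. This sketch assumes integer class labels per pixel and averages only over classes that appear in the prediction or the target; evaluation code often accumulates counts over a whole dataset instead, and the function names here are hypothetical.

```python
def per_class_iou(pred, target, num_classes):
    """IoU per class from two equal-length label sequences.
    Classes absent from both pred and target yield None."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        ious.append(inter / union if union else None)
    return ious


def mean_iou(pred, target, num_classes):
    """Mean of the per-class IoUs, skipping absent classes."""
    valid = [v for v in per_class_iou(pred, target, num_classes) if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


def dice(pred, target, cls):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for one class."""
    inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
    total = sum(1 for p in pred if p == cls) + sum(1 for t in target if t == cls)
    return 2 * inter / total if total else 1.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    print(per_class_iou(pred, target, 2))  # class 0: 1/2, class 1: 2/3
    print(dice(pred, target, 1))           # 2*2 / (2+3) = 0.8
```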
Module 6. Video & motion
Optical flow, temporal consistency, object tracking, and data association — ending with a multi-object video tracker project.
- Lesson 33 · 25 min
Welcome to Module 6
Why video is harder than images, temporal redundancy, frame rates, and how Module 6 connects detection to tracking.
- Lesson 34 · 75 min
Optical flow & motion
Brightness constancy, Lucas–Kanade, Horn–Schunck, flow visualization, motion boundaries, and when flow fails.
- Lesson 35 · 80 min
Object tracking & data association
SORT and DeepSORT, Kalman filters for boxes, Hungarian matching, re-identification embeddings, and track lifecycle management.
- Lesson 36 · 40 min
Module 6 quiz & review
15 interactive multiple-choice questions on optical flow, tracking, and data association with review links.
- Lesson 37 · 240 min
Project: multi-object video tracker
Run a detector per frame, associate detections with a Kalman + IoU tracker, visualize tracks over a video, and measure ID switches.
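The predict/update cycle behind a Kalman tracker can be shown with a deliberately reduced model. The sketch below filters one scalar (for example, a box centre's x-coordinate) under a random-walk motion model. This is an assumption made for brevity: trackers such as SORT use a multi-dimensional constant-velocity state, and the class name and noise values here are hypothetical.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk motion model."""

    def __init__(self, x0, p0=1.0, process_var=1e-2, meas_var=1.0):
        self.x = x0           # state estimate
        self.p = p0           # estimate variance
        self.q = process_var  # process noise variance (assumed value)
        self.r = meas_var     # measurement noise variance (assumed value)

    def predict(self):
        # Random walk: the state stays put, uncertainty grows.
        self.p += self.q
        return self.x

    def update(self, z):
        # Blend prediction and measurement, weighted by the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


if __name__ == "__main__":
    kf = ScalarKalman(x0=0.0, p0=1.0, process_var=0.0, meas_var=1.0)
    for z in (1.0, 1.0):  # two detections at x = 1.0
        kf.predict()
        print(round(kf.update(z), 4))  # 0.5, then 0.6667
```

In a per-frame tracker, predict runs once per frame for every track, and update runs only for tracks that were matched to a detection.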
Module 7. CV production & deployment
Model serving, edge optimization, monitoring, and data drift for vision systems — ending with a deployed inference API project.
- Lesson 38 · 25 min
Welcome to Module 7
From notebook to production, latency vs accuracy trade-offs, and what a deployable vision pipeline needs beyond mAP.
- Lesson 39 · 70 min
Model serving for vision
REST vs gRPC inference, batching, TensorRT and ONNX Runtime, warm-up, preprocessing on server vs client, and container basics.
- Lesson 40 · 75 min
Edge deployment & optimization
INT8 quantization, pruning, knowledge distillation, TFLite and CoreML, NPU delegates, and profiling latency on device.
- Lesson 41 · 65 min
Monitoring, drift & retraining
Input distribution shift, concept drift, shadow deployments, human-in-the-loop labeling, and when to retrain vs fine-tune.
- Lesson 42 · 40 min
Module 7 quiz & review
15 interactive multiple-choice questions on serving, quantization, and production monitoring with review links.
- Lesson 43 · 300 min
Project: deploy a vision inference API
Export a classifier to ONNX, serve it with FastAPI + ONNX Runtime, and add health checks, a batch endpoint, and a simple latency dashboard.
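The INT8 quantization covered in this module rests on an affine mapping between floats and 8-bit integers. The sketch below assumes per-tensor asymmetric quantization to the signed range [-128, 127]; real toolchains add calibration, per-channel scales, and their own rounding rules, and the function names here are hypothetical.

```python
def quant_params(x_min, x_max, qmin=-128, qmax=127):
    """Scale and zero point mapping [x_min, x_max] onto [qmin, qmax]."""
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)  # range must contain 0
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero tensor
    zero_point = round(qmin - x_min / scale)
    return scale, max(qmin, min(qmax, zero_point))


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Float -> int8 code, saturating at the ends of the range."""
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """int8 code -> approximate float."""
    return (q - zero_point) * scale


if __name__ == "__main__":
    scale, zp = quant_params(-1.0, 1.0)
    for x in (-0.9, 0.0, 0.3, 10.0):  # 10.0 is out of range and saturates
        q = quantize(x, scale, zp)
        print(f"{x:+.2f} -> {q:4d} -> {dequantize(q, scale, zp):+.4f}")
```

For in-range values the round-trip error is at most half a quantization step, which is the accuracy budget to weigh against the latency gain.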