
Module 6 — Video & motion

Project: multi-object video tracker

Run a detector on each frame, associate detections across frames with an IoU tracker (Kalman prediction is the extension), visualize the tracks over a video, and measure ID switches.

~240 min read + exercises

Before we begin

Build a SORT-style tracker: run a detector on each frame, match detections to existing tracks by IoU using the Hungarian algorithm, and draw the tracks with stable IDs.

Time: ~4 hours.


Setup

bash
pip install opencv-python numpy scipy ultralytics  # or torchvision detection

Use a short video: a MOT subset, a dashcam clip, or a live webcam via cv2.VideoCapture(0) (use the webcam sparingly).
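
Whichever source you pick, it can help to wrap it in a small generator so the tracker code does not care whether frames come from a file or a webcam. The following is a minimal sketch, not a required part of the project: `iter_frames` and `max_frames` are hypothetical names, and the only assumption about `cap` is that it has a `read()` method returning `(ok, frame)`, which is how `cv2.VideoCapture.read` behaves.

```python
def iter_frames(cap, max_frames=None):
    """Yield (frame_index, frame) pairs until the source runs out.

    cap: any object with read() -> (ok, frame), e.g. a cv2.VideoCapture.
    max_frames: optional cap on the number of frames (hypothetical parameter).
    """
    index = 0
    while max_frames is None or index < max_frames:
        ok, frame = cap.read()
        if not ok:  # end of file or camera failure
            break
        yield index, frame
        index += 1
```

With OpenCV installed, usage would look like `cap = cv2.VideoCapture("clip.mp4")` followed by `for i, frame in iter_frames(cap, max_frames=900): ...`, where `clip.mp4` is a placeholder path. Call `cap.release()` when you are done.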


Simple IoU tracker (educational)

python
import numpy as np
from scipy.optimize import linear_sum_assignment
 
def iou_box(a, b):
    # a, b: [x1,y1,x2,y2]
    xi1, yi1 = max(a[0], b[0]), max(a[1], b[1])
    xi2, yi2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, xi2 - xi1) * max(0, yi2 - yi1)
    area_a = (a[2]-a[0]) * (a[3]-a[1])
    area_b = (b[2]-b[0]) * (b[3]-b[1])
    return inter / (area_a + area_b - inter + 1e-6)
 
class Track:
    _next_id = 1
    def __init__(self, box):
        self.id = Track._next_id
        Track._next_id += 1
        self.box = box
        self.missed = 0
 
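def associate(tracks, detections, iou_thresh=0.3):
    """Match tracks to detections; returns (matches, unmatched_tracks, unmatched_detections)."""
    # Use len() rather than truthiness so detections may also be a NumPy array
    if len(tracks) == 0 or len(detections) == 0:
        return [], list(range(len(tracks))), list(range(len(detections)))
    # Cost is 1 - IoU, so minimizing total cost maximizes total overlap
    cost = np.zeros((len(tracks), len(detections)))
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            cost[i, j] = 1 - iou_box(t.box, d)
    row, col = linear_sum_assignment(cost)
    matches = []
    unmatched_t = set(range(len(tracks)))
    unmatched_d = set(range(len(detections)))
    for r, c in zip(row, col):
        # The solver pairs rows and columns regardless of overlap,
        # so reject any pair whose IoU is below iou_thresh
        if cost[r, c] > 1 - iou_thresh:
            continue
        matches.append((int(r), int(c)))
        unmatched_t.discard(r)
        unmatched_d.discard(c)
    return matches, sorted(unmatched_t), sorted(unmatched_d)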
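
As a quick sanity check of the overlap arithmetic, the sketch below repeats the IoU formula on two hand-picked boxes and applies the same `cost > 1 - iou_thresh` gate that `associate` uses. The box coordinates and the helper name `iou` are illustrative only.

```python
def iou(a, b):
    # Same formula as iou_box above; boxes are [x1, y1, x2, y2]
    xi1, yi1 = max(a[0], b[0]), max(a[1], b[1])
    xi2, yi2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, xi2 - xi1) * max(0, yi2 - yi1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-6)

track_box = [0, 0, 10, 10]
near = [5, 0, 15, 10]    # shares a 5 x 10 strip with track_box
far = [50, 50, 60, 60]   # no overlap at all

iou_thresh = 0.3
for name, det in [("near", near), ("far", far)]:
    overlap = iou(track_box, det)
    cost = 1 - overlap
    accepted = not (cost > 1 - iou_thresh)  # same gate as in associate
    print(name, round(overlap, 3), accepted)
# near 0.333 True   (intersection 50, union 150)
# far 0.0 False
```

The "far" case shows why the gate matters: the assignment solver would still pair that detection with the track if it were the only candidate, even though the two boxes do not overlap.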

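Integrate this with your detector loop. On each frame, update matched tracks with their new box, increment missed on unmatched tracks, and start new tracks from unmatched detections. Then draw track.id on each box and write the result to tracked.mp4.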
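
One way to structure the per-frame bookkeeping is sketched below. It is a sketch under stated assumptions rather than the reference solution: `step`, `new_track`, and `max_missed` are hypothetical names, `associate` is expected to return `(matches, unmatched_tracks, unmatched_detections)` like the function above, and `new_track` is any callable that builds a track from a box (for example the `Track` class).

```python
def step(tracks, detections, associate, new_track, max_missed=5):
    """Advance the tracker by one frame and return the updated track list.

    tracks: list of objects with .box and .missed attributes.
    detections: list of [x1, y1, x2, y2] boxes for the current frame.
    max_missed: frames a track may go unmatched before it is dropped
                (hypothetical default; tune it for your video).
    """
    matches, unmatched_t, unmatched_d = associate(tracks, detections)

    # Matched tracks take the detection's box and reset their miss counter
    for t_idx, d_idx in matches:
        tracks[t_idx].box = detections[d_idx]
        tracks[t_idx].missed = 0

    # Unmatched tracks age; they keep their last known box
    for t_idx in unmatched_t:
        tracks[t_idx].missed += 1

    # Filter only after all index-based updates, since indices refer to the old list
    survivors = [t for t in tracks if t.missed <= max_missed]

    # Every unmatched detection starts a new track with a fresh ID
    for d_idx in unmatched_d:
        survivors.append(new_track(detections[d_idx]))
    return survivors
```

In the loop this would be called as `tracks = step(tracks, boxes, associate, Track)`. A small `max_missed` removes stale tracks quickly but causes more ID switches after brief occlusions; a large value does the opposite.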


Deliverables

  1. Output video with ID labels.
  2. Manually count ID switches over a 30-second segment and record the count in the README.
  3. One paragraph describing when association failed (for example, during occlusions or when objects cross paths).
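
The deliverable asks for a manual count, but if your clip happens to come with ground-truth identities (some MOT subsets do), the count can be cross-checked in code. The sketch below is a simplified counter, not the full MOT evaluation protocol. It assumes you have already built, for each frame, a dict mapping each ground-truth object ID to the track ID that covered it; the name `count_id_switches` and that input format are hypothetical.

```python
def count_id_switches(frames):
    """Count how often a ground-truth object changes its assigned track ID.

    frames: list of dicts, one per frame, mapping gt_id -> track_id.
            Objects missing from a frame (occluded or undetected) are skipped,
            and their last known track ID is remembered.
    """
    last_track = {}  # gt_id -> most recent track_id
    switches = 0
    for assignment in frames:
        for gt_id, track_id in assignment.items():
            previous = last_track.get(gt_id)
            if previous is not None and previous != track_id:
                switches += 1
            last_track[gt_id] = track_id
    return switches

frames = [
    {"car": 1, "bike": 2},
    {"car": 1, "bike": 2},
    {"car": 3},             # bike missed; car picked up a new track ID
    {"car": 3, "bike": 2},  # bike reappears with its old ID, so no switch
]
print(count_id_switches(frames))  # 1
```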

Extension

Add constant-velocity Kalman prediction or swap in DeepSORT / BoT-SORT.
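
For the Kalman option, a minimal constant-velocity filter over the box center could look like the sketch below. It is deliberately smaller than what SORT uses (SORT also models box scale and aspect ratio), and the class name `ConstantVelocityKF` and the noise values `q`, `r`, and the initial covariance are illustrative guesses rather than tuned settings.

```python
import numpy as np

class ConstantVelocityKF:
    """Kalman filter with state [cx, cy, vx, vy]; only (cx, cy) is measured."""

    def __init__(self, cx, cy, dt=1.0, q=1e-2, r=1.0):
        self.x = np.array([cx, cy, 0.0, 0.0])  # start with zero velocity
        self.P = np.eye(4) * 10.0              # large initial uncertainty
        self.F = np.eye(4)                     # motion model: pos += vel * dt
        self.F[0, 2] = self.F[1, 3] = dt
        self.H = np.zeros((2, 4))              # measurement picks out the position
        self.H[0, 0] = self.H[1, 1] = 1.0
        self.Q = np.eye(4) * q                 # process noise
        self.R = np.eye(2) * r                 # measurement noise

    def predict(self):
        """Propagate the state one frame ahead; returns the predicted center."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2].copy()

    def update(self, cx, cy):
        """Correct the state with a measured box center."""
        z = np.array([cx, cy], dtype=float)
        y = z - self.H @ self.x                   # innovation
        S = self.H @ self.P @ self.H.T + self.R   # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

In the tracker, each track would call `predict()` before association so that IoU is computed against the predicted position instead of the last observed box, then call `update()` with the matched detection's center.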


What's next

Welcome to Module 7 — CV production & deployment.