Project: multi-object video tracker
Before we begin
Build a SORT-style tracker: run a detector on each frame, associate detections with existing tracks using IoU and the Hungarian algorithm, and visualize the tracks with stable IDs.
Time: ~4 hours.
Setup
bash
pip install opencv-python numpy scipy ultralytics  # or use torchvision's detection models

Use a short video (a MOT subset, a dashcam clip, or, sparingly, a live feed from cv2.VideoCapture(0)).
Simple IoU tracker (educational)
python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou_box(a, b):
    # a, b: [x1, y1, x2, y2]
    xi1, yi1 = max(a[0], b[0]), max(a[1], b[1])
    xi2, yi2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, xi2 - xi1) * max(0, yi2 - yi1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    # Small epsilon avoids division by zero for degenerate boxes
    return inter / (area_a + area_b - inter + 1e-6)

class Track:
    _next_id = 1  # class-level counter so every track gets a unique ID

    def __init__(self, box):
        self.id = Track._next_id
        Track._next_id += 1
        self.box = box
        self.missed = 0  # consecutive frames without a matched detection

def associate(tracks, detections, iou_thresh=0.3):
    # len() checks also work when detections is a NumPy array
    if len(tracks) == 0 or len(detections) == 0:
        return [], list(range(len(tracks))), list(range(len(detections)))
    # Cost is 1 - IoU, so minimizing total cost maximizes total overlap
    cost = np.zeros((len(tracks), len(detections)))
    for i, t in enumerate(tracks):
        for j, d in enumerate(detections):
            cost[i, j] = 1 - iou_box(t.box, d)
    row, col = linear_sum_assignment(cost)
    matches = []
    unmatched_t = set(range(len(tracks)))
    unmatched_d = set(range(len(detections)))
    for r, c in zip(row, col):
        # Reject assignments whose IoU falls below the threshold
        if cost[r, c] > 1 - iou_thresh:
            continue
        matches.append((int(r), int(c)))
        unmatched_t.discard(r)
        unmatched_d.discard(c)
    return matches, sorted(unmatched_t), sorted(unmatched_d)

Integrate this with your detector loop, draw each track.id on its box, and write the result to tracked.mp4.
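The association step only returns index lists; the per-frame bookkeeping is left to the detector loop. One possible way to apply those results is sketched below. The names `update_tracks`, `new_track`, and `max_missed` are hypothetical, introduced here for illustration, and the default of 5 missed frames is an arbitrary starting point rather than a recommended value.

```python
def update_tracks(tracks, detections, matches, unmatched_t, unmatched_d,
                  new_track, max_missed=5):
    """Apply one frame of association results and return the surviving tracks.

    new_track:  callable that builds a track object from a box
                (assumed to set .box and .missed, like the Track class).
    max_missed: hypothetical limit on consecutive unmatched frames.
    """
    # Matched tracks take the detection's box and reset their miss counter.
    for t_idx, d_idx in matches:
        tracks[t_idx].box = detections[d_idx]
        tracks[t_idx].missed = 0

    # Unmatched tracks age by one frame.
    for t_idx in unmatched_t:
        tracks[t_idx].missed += 1

    # Drop tracks that have been unmatched for too long.
    survivors = [t for t in tracks if t.missed <= max_missed]

    # Every unmatched detection starts a new track with a fresh ID.
    survivors.extend(new_track(detections[d_idx]) for d_idx in unmatched_d)
    return survivors
```

In the frame loop, you would call `associate(tracks, detections)` and pass its three return values to this function with `new_track=Track`. Keeping unmatched tracks alive for a few frames lets an object keep its ID through a brief missed detection.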
Deliverables
- Output video with ID labels.
- Manually count the ID switches in a 30-second segment and record the count in the README.
- One paragraph: when did association fail?
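If you write down your manual observations frame by frame, a small helper can tally the ID switches for you. The sketch below assumes a hand-made log in which each frame is a dict mapping your own label for an object (for example "car_a") to the tracker ID drawn on it; both that format and the name `count_id_switches` are assumptions made for illustration.

```python
def count_id_switches(frames):
    """Count how often an object's displayed tracker ID changes.

    frames: list of dicts, one per annotated frame, mapping a hand-assigned
            object label to the tracker ID shown for that object.
    """
    last_seen = {}  # object label -> most recent tracker ID
    switches = 0
    for assignment in frames:
        for obj, track_id in assignment.items():
            # A switch is any change from the object's previously seen ID.
            if obj in last_seen and last_seen[obj] != track_id:
                switches += 1
            last_seen[obj] = track_id
    return switches


# Hypothetical log: the car changes ID once, the bike changes ID once.
log = [
    {"car_a": 1, "bike_a": 2},
    {"car_a": 1},
    {"car_a": 3, "bike_a": 2},
    {"car_a": 3, "bike_a": 4},
]
print(count_id_switches(log))
```

Objects missing from a frame are skipped, so an ID that changes after an occlusion still counts as one switch when the object reappears.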
Extension
Add constant-velocity Kalman prediction or swap in DeepSORT / BoT-SORT.
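As a starting point for the Kalman extension, here is a minimal constant-velocity filter over a box center, written with NumPy. It is a sketch under simplifying assumptions: it tracks only the center (the original SORT also tracks box scale and aspect ratio), and the class name `ConstantVelocityKF`, the parameters `dt`, `process_var`, and `meas_var`, and all noise values are illustrative choices, not tuned settings.

```python
import numpy as np


class ConstantVelocityKF:
    """Kalman filter over a box center; the state is [cx, cy, vx, vy]."""

    def __init__(self, cx, cy, dt=1.0, process_var=1.0, meas_var=10.0):
        self.x = np.array([cx, cy, 0.0, 0.0])  # start with zero velocity
        self.P = np.eye(4) * 100.0             # large initial uncertainty
        self.F = np.eye(4)                     # state transition matrix
        self.F[0, 2] = self.F[1, 3] = dt       # position += velocity * dt
        self.H = np.eye(2, 4)                  # only position is observed
        self.Q = np.eye(4) * process_var       # process noise (illustrative)
        self.R = np.eye(2) * meas_var          # measurement noise (illustrative)

    def predict(self):
        """Advance the state one step and return the predicted center."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, cx, cy):
        """Correct the state with a measured center from a matched detection."""
        z = np.array([cx, cy])
        y = z - self.H @ self.x                   # innovation
        S = self.H @ self.P @ self.H.T + self.R   # innovation covariance
        K = self.P @ self.H.T @ np.linalg.inv(S)  # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
```

One way to use it is to give each track its own filter, call `predict()` before association so IoU is computed against the predicted position, and call `update()` with the center of the matched detection afterwards.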
What's next
Continue to Module 7: CV production & deployment.