Object tracking & data association
Before we begin
Detection finds objects in a single frame. Tracking maintains each object's identity across frames: who moved where?
Learning objectives
- Describe SORT (Kalman + Hungarian + IoU).
- Explain DeepSORT appearance embeddings.
- Define ID switches and MOTA at a high level.
- Outline track lifecycle (tentative, confirmed, lost).
SORT pipeline
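- Predict each track's next box with a Kalman filter under a constant-velocity model (box state: center, scale (area), and aspect ratio, plus velocities for center and scale; the aspect ratio is assumed constant).
- Match detections to predicted tracks with the Hungarian algorithm, using a cost derived from IoU (for example, 1 - IoU), and reject matches whose IoU falls below a minimum threshold.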
- Update matched tracks; create new tracks for unmatched detections; age out lost tracks.
Simple and fast, but it relies on motion and overlap alone, so ID switches rise when objects cross paths or occlude one another.
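The association step above can be sketched in a few lines. This is an illustration under assumptions, not SORT's reference code: boxes are (x1, y1, x2, y2) tuples, the cost is 1 - IoU, the 0.3 IoU threshold is an assumed default, and the names iou and associate are hypothetical. To stay dependency-free, the sketch finds the minimum-cost assignment by exhaustive search. That reaches the same minimum total cost as the Hungarian algorithm on small inputs but scales factorially, so a real tracker would typically call a Hungarian solver such as scipy.optimize.linear_sum_assignment instead.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(predicted, detections, iou_min=0.3):
    """Match predicted track boxes to detections.

    Returns (matches, unmatched_tracks, unmatched_detections), where matches
    is a sorted list of (track_index, detection_index) pairs.
    """
    n_t, n_d = len(predicted), len(detections)
    cost = [[1.0 - iou(t, d) for d in detections] for t in predicted]

    # Exhaustive search over one-to-one assignments (stand-in for Hungarian).
    best_pairs, best_cost = [], float("inf")
    if n_t <= n_d:
        candidates = (list(enumerate(p)) for p in permutations(range(n_d), n_t))
    else:
        candidates = ([(t, d) for d, t in enumerate(p)]
                      for p in permutations(range(n_t), n_d))
    for pairs in candidates:
        total = sum(cost[t][d] for t, d in pairs)
        if total < best_cost:
            best_pairs, best_cost = pairs, total

    # Reject assigned pairs that overlap too little to be the same object.
    matches = sorted((t, d) for t, d in best_pairs if 1.0 - cost[t][d] >= iou_min)
    matched_t = {t for t, _ in matches}
    matched_d = {d for _, d in matches}
    unmatched_tracks = [t for t in range(n_t) if t not in matched_t]
    unmatched_dets = [d for d in range(n_d) if d not in matched_d]
    return matches, unmatched_tracks, unmatched_dets
```

The three return values map onto the pipeline: matched pairs update their tracks, unmatched detections start new tracks, and unmatched tracks accumulate age until they are dropped.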
DeepSORT
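Adds an appearance descriptor (an embedding from a small CNN) for each detection. Association gates candidate matches by the Mahalanobis distance between the detection and the Kalman-predicted state, then scores them by appearance (cosine distance between embeddings). Because appearance persists while motion evidence is missing, this reduces identity swaps after occlusion.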
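A minimal sketch of a gated appearance cost follows. It makes several simplifying assumptions that are not from DeepSORT's code: the Kalman covariance is treated as diagonal (a list of variances), tracks and detections are plain dicts with hypothetical keys, and the function names are invented for illustration. The gate value 9.4877 is the 95% quantile of a chi-square distribution with 4 degrees of freedom, matching a 4-dimensional box measurement.

```python
import math

CHI2_GATE_4DOF = 9.4877  # 95% chi-square quantile, 4 degrees of freedom
INFEASIBLE = 1e5         # large cost meaning "do not match" (assumed value)


def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def mahalanobis_sq(measurement, mean, variances):
    """Squared Mahalanobis distance, assuming a diagonal covariance."""
    return sum((z - m) ** 2 / v for z, m, v in zip(measurement, mean, variances))


def gated_cost(track, detection):
    """Appearance cost for a track/detection pair, or INFEASIBLE if gated out.

    track: {"mean": [...], "var": [...], "gallery": [embedding, ...]}
    detection: {"box": [...], "embedding": [...]}
    """
    # Motion gate: discard pairs that the Kalman prediction makes implausible.
    if mahalanobis_sq(detection["box"], track["mean"], track["var"]) > CHI2_GATE_4DOF:
        return INFEASIBLE
    # Appearance cost: smallest distance to any embedding stored for the track.
    return min(cosine_distance(g, detection["embedding"]) for g in track["gallery"])
```

Filling a cost matrix with gated_cost and passing it to the assignment step keeps implausible pairs from being matched even when their boxes happen to overlap.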
Metrics
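MOTA (multiple object tracking accuracy) combines false positives, misses (false negatives), and ID switches into one score: 1 minus the sum of those errors divided by the number of ground-truth objects, accumulated over all frames. An ID switch is counted when the track ID assigned to a ground-truth object changes.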
Report IDF1 when preserving identity over time matters more than per-frame detection accuracy.
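Once the errors have been counted, both metrics reduce to simple ratios: MOTA = 1 - (FN + FP + IDSW) / GT, and IDF1 = 2 IDTP / (2 IDTP + IDFP + IDFN). The sketch below assumes the counts are already available (matching predictions to ground truth is the hard part and is not shown); the function and parameter names are hypothetical.

```python
def mota(false_negatives, false_positives, id_switches, num_gt):
    """MOTA from error counts accumulated over all frames.

    num_gt is the total number of ground-truth objects over all frames.
    The result can be negative when errors outnumber ground-truth objects.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


def idf1(idtp, idfp, idfn):
    """IDF1: F1 score over identity-level true/false positives and negatives."""
    denom = 2 * idtp + idfp + idfn
    return 2 * idtp / denom if denom else 0.0
```

Note that ID switches are only one term in MOTA's numerator, so a tracker with many detection errors can score poorly on MOTA even if it never swaps identities.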
Practical tips
- Higher frame rate → smaller motion between frames → easier association.
- Tune max age and min hits for your detector's noise.
- Consider ByteTrack and other modern variants for stronger baselines.
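To make the max age and min hits settings concrete, here is one possible lifecycle policy, sketched as a small state machine. The policy is an assumption for illustration and implementations differ: a track starts tentative, becomes confirmed after min_hits matched frames, is dropped immediately if it misses while still tentative, and becomes lost after more than max_age consecutive misses. The class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    state: str = "tentative"  # "tentative", "confirmed", or "lost"
    hits: int = 1             # the creating detection counts as the first hit
    misses: int = 0           # consecutive frames without a matched detection

    def mark_matched(self, min_hits):
        """Call when a detection was assigned to this track in the current frame."""
        self.hits += 1
        self.misses = 0
        if self.state == "tentative" and self.hits >= min_hits:
            self.state = "confirmed"

    def mark_missed(self, max_age):
        """Call when no detection was assigned to this track in the current frame."""
        self.misses += 1
        # Tentative tracks get no grace period; confirmed ones survive max_age misses.
        if self.state == "tentative" or self.misses > max_age:
            self.state = "lost"
```

With a noisy detector, raising min_hits suppresses tracks spawned by spurious detections, while raising max_age lets confirmed tracks survive short occlusions at the cost of keeping stale tracks around longer.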
What's next
Module 6 quiz, then the video tracker project.