Geometry & correspondence

Features, matching, and robust estimation

Corners, descriptors, putative matches, RANSAC, homographies, and when planar models break.

~70 min read + exercises

This lesson is the bridge between pixels and discrete geometric constraints: you detect repeatable keypoints, describe them with local descriptors, propose matches, and then use RANSAC to fit models (such as homographies) despite outliers.

Figure: The classical feature pipeline. Detect (keypoints) → Describe (vector / patch) → Match (nearest neighbor) → Robust fit (RANSAC + refine).

Four stages: detect repeatable keypoints, describe them with a vector, find candidate matches, then keep only the geometry-consistent ones.

Learning objectives

  • Explain why corners (high gradient variation in multiple directions) make stronger features than flat regions or straight edges alone.
  • Describe the pipeline: detect → describe → match → robust fit.
  • State what RANSAC optimizes for and why outliers break least-squares.

Prerequisites

  • Convolution / gradients lesson.
  • Camera projection lesson (helpful for geometric interpretation of matches).

Step 1 — What makes a “good” local feature?

A local feature is a point neighborhood that is:

  1. Detectable reliably under viewpoint and lighting changes (within limits).
  2. Describable with a vector summarizing local appearance for matching.

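Harris corners (classic) score intensity structure using the second-moment matrix of image gradients. Its two eigenvalues, λ₁ and λ₂, measure how strongly the patch changes along its two principal directions.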

Figure: Flat patch vs edge vs corner. A useful patch must change when shifted in any direction.

  • Flat (λ₁ ≈ λ₂ ≈ 0): ambiguous; shifts in any direction look the same.
  • Edge (λ₁ ≫ λ₂): ambiguous along the edge direction.
  • Corner (λ₁ ≈ λ₂ ≫ 0): uniquely localizable in 2D.

A patch only counts as a distinctive feature if shifting it in any direction makes the patch content change.
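The flat / edge / corner distinction can be sketched numerically. The following is a minimal NumPy illustration, not a production detector: it assumes a 3×3 box window and the common heuristic k = 0.05, and the helper names `box_filter` and `harris_response` are made up for this example. It scores each pixel with R = det(M) − k·trace(M)², which is large and positive only when both eigenvalues of the second-moment matrix M are large.

```python
import numpy as np

def box_filter(a, radius=1):
    """Sum each value with its neighbors in a (2r+1)x(2r+1) window (zero padding)."""
    padded = np.pad(a, radius)
    out = np.zeros(a.shape, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(img, k=0.05, radius=1):
    """Harris-style corner score per pixel (illustrative parameters)."""
    img = img.astype(float)
    # np.gradient returns the derivative along axis 0 (rows, y) first.
    Iy, Ix = np.gradient(img)
    # Entries of the second-moment matrix M, summed over the window.
    Sxx = box_filter(Ix * Ix, radius)
    Syy = box_filter(Iy * Iy, radius)
    Sxy = box_filter(Ix * Iy, radius)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# Synthetic test image: a bright square occupying the lower-right quadrant.
img = np.zeros((20, 20))
img[10:, 10:] = 1.0
response = harris_response(img)

corner_score = response[10, 10]  # at the square's corner
edge_score = response[15, 10]    # on the square's vertical edge
flat_score = response[3, 3]      # in the uniform background
```

On this synthetic image the corner score is positive, the edge score is negative, and the flat score is zero, which mirrors the three cases in the figure. Real detectors typically use Gaussian weighting and non-maximum suppression, which this sketch omits.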

Checkpoint: Why is a straight step edge a poor unique landmark compared to a corner?


Step 2 — From corners to descriptors (SIFT / ORB intuition)

Classical SIFT (Scale-Invariant Feature Transform) searches scale-space for extrema, assigns orientation, and forms a histogram-of-gradients descriptor.

Faster alternatives such as ORB use compact binary descriptors that are compared with Hamming distance. They trade some robustness for speed, which makes them common on mobile devices.
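To make the histogram-of-gradients idea concrete, here is a toy descriptor in NumPy. It is a deliberately simplified sketch and not SIFT: it assumes a single 8-bin histogram over the whole patch, with no scale-space search, no orientation assignment, and no spatial grid. The function name `orientation_histogram` is hypothetical.

```python
import numpy as np

def orientation_histogram(patch, bins=8):
    """Toy descriptor: magnitude-weighted histogram of gradient orientations."""
    patch = patch.astype(float)
    gy, gx = np.gradient(patch)
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)  # orientation in [-pi, pi]
    hist, _ = np.histogram(angle, bins=bins, range=(-np.pi, np.pi),
                           weights=magnitude)
    # L2 normalization removes the effect of a global contrast change.
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

rng = np.random.default_rng(0)
patch = rng.integers(0, 200, size=(16, 16)).astype(float)

d_original = orientation_histogram(patch)
# Same patch with doubled contrast and a brightness offset.
d_relit = orientation_histogram(2 * patch + 10)
```

Because gradients ignore a constant brightness offset and normalization cancels a contrast gain, the two descriptors agree. Rotating or rescaling the patch would change this toy descriptor, which is the gap that SIFT's orientation assignment and scale-space search are designed to close.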

Exercise: List three transformations a descriptor tries to be invariant to, and one transformation that still commonly breaks matchers.


Step 3 — Nearest-neighbor matching and ratio test

Given descriptors dᵢ in image A and d′ⱼ in image B, a naive matcher pairs each dᵢ with its nearest neighbor in descriptor space (often Euclidean distance for SIFT-style vectors, Hamming distance for binary descriptors such as ORB).

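Lowe’s ratio test: accept a match only if the best distance is clearly smaller than the second-best distance (for example, a ratio below about 0.8). This rejects ambiguous matches, such as those on repetitive texture, where two candidates are almost equally close.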
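A brute-force version of nearest-neighbor matching with the ratio test can be sketched in a few lines of NumPy. This assumes float descriptors compared with Euclidean distance, at least two descriptors in image B, and an illustrative threshold of 0.8; the function name `ratio_test_matches` is made up for this example.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_b[j] is desc_a[i]'s nearest neighbor
    and the best distance is below `ratio` times the second-best distance."""
    # Pairwise Euclidean distances, shape (len(desc_a), len(desc_b)).
    dists = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)
    best, second = order[:, 0], order[:, 1]
    rows = np.arange(len(desc_a))
    keep = dists[rows, best] < ratio * dists[rows, second]
    return [(int(i), int(j)) for i, j in zip(rows[keep], best[keep])]

# Descriptor 0 in A has one clear partner in B.
# Descriptor 1 in A has two near-identical candidates in B (repetitive texture).
desc_a = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
desc_b = np.array([[0.95, 0.05, 0.0],
                   [0.0, 0.98, 0.05],
                   [0.05, 0.97, 0.0]])

matches = ratio_test_matches(desc_a, desc_b)
naive_matches = ratio_test_matches(desc_a, desc_b, ratio=1.0)
```

With the 0.8 threshold only the unambiguous pair survives; raising the ratio to 1.0 reduces the test to plain nearest-neighbor matching and lets the ambiguous pair through. The all-pairs distance matrix is fine for small sets but would normally be replaced by an approximate nearest-neighbor index for large ones.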

Checkpoint: Why do brick walls break naive feature matching?


Step 4 — Geometric consistency: homography as an example

If the scene is approximately planar (or the camera purely rotates), corresponding points can be related by a projective homography $H$ (a $3 \times 3$ matrix defined up to scale):

$$\lambda \begin{bmatrix} u' \\ v' \\ 1 \end{bmatrix} = H \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}$$

Matches that agree with the same $H$ are inliers; mismatches are outliers.
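The homography relation above can be exercised with a short NumPy sketch: one function applies H to points (dividing out the scale λ), and another recovers H from four or more correspondences using the direct linear transform (DLT). This is a minimal illustration under stated assumptions: it skips the coordinate normalization that practical implementations usually add, it fixes the scale by assuming H[2, 2] is nonzero, and the names `apply_homography` and `fit_homography` are hypothetical.

```python
import numpy as np

def apply_homography(H, pts):
    """Map (N, 2) points through H and divide by the projective scale."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def fit_homography(src, dst):
    """DLT estimate of H from >= 4 correspondences (no normalization)."""
    rows = []
    for (u, v), (up, vp) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H.
        rows.append([-u, -v, -1, 0, 0, 0, up * u, up * v, up])
        rows.append([0, 0, 0, -u, -v, -1, vp * u, vp * v, vp])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the arbitrary scale (assumes H[2, 2] != 0)

H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [1e-3, 2e-3, 1.0]])
src = np.array([[0.0, 0.0], [100.0, 0.0], [100.0, 100.0], [0.0, 100.0]])
dst = apply_homography(H_true, src)

H_est = fit_homography(src, dst)
reprojection_error = np.abs(apply_homography(H_est, src) - dst).max()
```

With exact correspondences the estimate reproduces the original matrix up to numerical precision. With real matches the reprojection error of each match against a candidate H is what decides whether that match counts as an inlier.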

Exercise (conceptual): Give a real scene where the planar homography assumption fails badly.


Step 5 — RANSAC in plain language

RANSAC (Random Sample Consensus):

  1. Randomly sample the minimum set of matches needed to estimate a model (e.g. 4 points for homography).
  2. Count how many other matches agree within a tolerance (inliers).
  3. Repeat many iterations; keep the model with the most inliers.
  4. Optionally refine with all inliers.
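The four steps above can be sketched on the simplest possible model, a 2D line, where the minimal sample is two points. This is an illustrative NumPy sketch rather than a tuned implementation: the iteration count, tolerance, seed, and the name `ransac_line` are all assumptions, and the final refinement uses `np.polyfit`, so it assumes the line is not vertical.

```python
import numpy as np

def ransac_line(points, n_iters=200, tol=0.1, seed=0):
    """Fit y = slope * x + intercept to (N, 2) points containing outliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        # 1. Minimal sample: two distinct points define a candidate line.
        i, j = rng.choice(len(points), size=2, replace=False)
        p, q = points[i], points[j]
        direction = q - p
        length = np.linalg.norm(direction)
        if length == 0:
            continue
        normal = np.array([-direction[1], direction[0]]) / length
        # 2. Count points whose perpendicular distance is within the tolerance.
        inliers = np.abs((points - p) @ normal) < tol
        # 3. Keep the candidate with the most inliers so far.
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # 4. Refine with least squares on the winning inlier set only.
    x, y = points[best_inliers].T
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept, best_inliers

# Synthetic data: 30 points near y = 2x + 1, plus 20 scattered outliers.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 30)
inlier_pts = np.column_stack([x, 2 * x + 1 + rng.normal(0, 0.02, 30)])
outlier_pts = np.column_stack([rng.uniform(0, 10, 20), rng.uniform(-20, 40, 20)])
points = np.vstack([inlier_pts, outlier_pts])

slope, intercept, inliers = ransac_line(points)
```

Swapping the line for a homography changes only the model-specific parts: the minimal sample becomes four matches, the candidate model is estimated from them, and the distance becomes a reprojection error.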

Figure: RANSAC keeps the line with the most votes. Outliers can dominate least-squares; RANSAC samples small subsets and counts agreement. Legend: inlier (within ε), outlier, best model, inlier band ±ε.

Inliers fall inside an ε-tolerance band around the candidate model. Outliers don't influence the chosen line, which is why RANSAC beats least-squares here.

Checkpoint: What happens to required iterations if the outlier fraction increases?
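One way to explore this checkpoint numerically is the standard estimate N = log(1 − p) / log(1 − wˢ), where w is the inlier fraction, s is the minimal sample size, and p is the desired probability of drawing at least one all-inlier sample. The sketch below assumes samples are drawn independently; the function name `ransac_iterations` and the default p = 0.99 are illustrative choices.

```python
import math

def ransac_iterations(inlier_fraction, sample_size, success_prob=0.99):
    """Iterations needed to draw at least one all-inlier sample with probability success_prob."""
    if not 0 < inlier_fraction <= 1:
        raise ValueError("inlier_fraction must be in (0, 1]")
    p_good_sample = inlier_fraction ** sample_size
    if p_good_sample >= 1:
        return 1  # every sample is all-inlier
    return math.ceil(math.log(1 - success_prob) / math.log(1 - p_good_sample))

# Homography (4-point samples) at a few inlier fractions.
iters_by_fraction = {w: ransac_iterations(w, sample_size=4) for w in (0.8, 0.5, 0.3)}
```

For 4-point samples, 50% inliers needs about 72 iterations at p = 0.99. Because w is raised to the power s, the count grows quickly as the inlier fraction falls, and faster still for models with larger minimal samples.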


Step 6 — Epipolar geometry (preview pointer)

When you have two views of a general (non-planar) 3D scene taken from different positions, matches are no longer related by a single homography. Instead, they must satisfy the epipolar constraint. More advanced modules develop the fundamental matrix (uncalibrated cameras) and the essential matrix (calibrated cameras).

For now: remember that geometry narrows search — matching along epipolar lines is cheaper and more robust than global search.
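As a small preview, the epipolar constraint can be checked on synthetic data. The sketch below assumes calibrated cameras (normalized image coordinates), the convention X₂ = R·X₁ + t for the relative pose, and the essential matrix E = [t]ₓR; the helper names and the specific pose are made up for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]], dtype=float)

def essential_from_pose(R, t):
    """Essential matrix for the assumed convention X2 = R @ X1 + t."""
    return skew(t) @ R

def epipolar_residuals(E, x1, x2):
    """x2_i^T E x1_i for (N, 3) homogeneous normalized image points."""
    return np.einsum("ni,ij,nj->n", x2, E, x1)

# Random 3D points in front of both cameras.
rng = np.random.default_rng(0)
X1 = np.column_stack([rng.uniform(-2, 2, 20),
                      rng.uniform(-2, 2, 20),
                      rng.uniform(4, 8, 20)])
angle = 0.1  # small rotation about the y axis, in radians
R = np.array([[np.cos(angle), 0, np.sin(angle)],
              [0, 1, 0],
              [-np.sin(angle), 0, np.cos(angle)]])
t = np.array([1.0, 0.0, 0.2])
X2 = X1 @ R.T + t

# Project to normalized image coordinates (divide by depth).
x1 = X1 / X1[:, 2:3]
x2 = X2 / X2[:, 2:3]

E = essential_from_pose(R, t)
good = epipolar_residuals(E, x1, x2)                      # correct matches
bad = epipolar_residuals(E, x1, np.roll(x2, 1, axis=0))   # deliberately mismatched
```

Correct matches give residuals that vanish up to rounding, while the shuffled matches generally do not. That difference is what lets a robust estimator separate inliers from outliers in the non-planar case.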


Check your understanding

  1. What is the difference between a feature detector and a descriptor?
  2. Why does least-squares fitting of a homography to all matches (including wrong ones) fail?
  3. Name two sources of false matches unrelated to the descriptor itself.

Lab-style stretch goal (optional)

Run ORB or SIFT matching between two photos of the same desk. Visualize matches, then run homography RANSAC and visualize inliers vs outliers.