The perceptron — one neuron

Before we begin

Every neural network is built from the same brick: a perceptron (one artificial neuron).

Output = activation(weights · inputs + bias)

The dot products from Module 1 appear again, now with a bias and a non-linear activation.

Figure: Anatomy of one neuron. Inputs x₁…xₙ, weights, sum, then activation.

For inputs $x_1, \dots, x_n$ and weights $w_1, \dots, w_n$:

z = b + w₁x₁ + w₂x₂ + … + wₙxₙ

Then a = activation(z) — e.g. ReLU(z) or sigmoid(z).

This is exactly a dot product plus a bias, z = w · x + b, followed by an activation function.
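The formula can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names `neuron`, `relu`, and `sigmoid` are chosen here for the example and do not come from the lesson.

```python
import math


def relu(z):
    """ReLU activation: pass positive values through, clamp the rest to 0."""
    return max(0.0, z)


def sigmoid(z):
    """Sigmoid activation: squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation):
    """Compute activation(weights . inputs + bias) for one neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Dot product of weights and inputs, then add the bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Illustrative values (assumed, not from the lesson): z = 0.5 + 0.5 - 1.0 = 0.0
print(neuron([1.0, 2.0], [0.5, 0.25], -1.0, relu))     # 0.0
print(neuron([1.0, 2.0], [0.5, 0.25], -1.0, sigmoid))  # 0.5
```

Passing the activation in as an argument keeps the weighted sum and the non-linearity separate, which mirrors the two steps z and a above.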

Inputs x = [2, 1], weights w = [0.5, -1], bias b = 0.2

z = 0.2 + 0.5×2 + (-1)×1 = 0.2 + 1 - 1 = 0.2

If activation is ReLU: a = max(0, 0.2) = 0.2
If activation is step (z > 0 → 1 else 0): output = 1
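The same arithmetic can be checked in Python. This is a small sketch using the numbers from the example; the helper names `relu` and `step` are chosen for illustration, and the results are rounded before printing because floating-point sums are not always exact.

```python
def relu(z):
    """ReLU activation: max(0, z)."""
    return max(0.0, z)


def step(z):
    """Step activation: 1 when z > 0, otherwise 0."""
    return 1 if z > 0 else 0


x = [2, 1]     # inputs
w = [0.5, -1]  # weights
b = 0.2        # bias

# Weighted sum: 0.5*2 + (-1)*1 + 0.2
z = sum(wi * xi for wi, xi in zip(w, x)) + b

print(round(z, 6))        # 0.2
print(round(relu(z), 6))  # 0.2
print(step(z))            # 1
```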

A single neuron that thresholds its weighted sum z draws a straight line (or a hyperplane in higher dimensions): the boundary z = 0 separates the inputs that "activate" from those that "don't."

It can learn AND and OR gates with the right weights. It cannot learn XOR alone — no single straight line separates XOR’s 1s from 0s.
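As a rough illustration, the sketch below hand-picks weights that reproduce AND and OR with a step activation, then searches a coarse grid of weights for XOR. The specific weights and the grid are assumptions made for this example. An empty search result is consistent with the claim but is not a proof, since the grid covers only a finite set of values.

```python
from itertools import product


def step(z):
    """Step activation: 1 when z > 0, otherwise 0."""
    return 1 if z > 0 else 0


def perceptron(x1, x2, w1, w2, b):
    """One two-input neuron with a step activation."""
    return step(w1 * x1 + w2 * x2 + b)


INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Hand-picked (w1, w2, b); many other choices work equally well.
AND_PARAMS = (1.0, 1.0, -1.5)  # fires only when both inputs are 1
OR_PARAMS = (1.0, 1.0, -0.5)   # fires when at least one input is 1

and_out = [perceptron(x1, x2, *AND_PARAMS) for x1, x2 in INPUTS]
or_out = [perceptron(x1, x2, *OR_PARAMS) for x1, x2 in INPUTS]
print(and_out)  # [0, 0, 0, 1]
print(or_out)   # [0, 1, 1, 1]

# Try every (w1, w2, b) on a grid from -4.0 to 4.0 in steps of 0.5.
grid = [i / 2 for i in range(-8, 9)]
xor_target = [0, 1, 1, 0]
xor_solutions = [
    (w1, w2, b)
    for w1, w2, b in product(grid, repeat=3)
    if [perceptron(x1, x2, w1, w2, b) for x1, x2 in INPUTS] == xor_target
]
print(len(xor_solutions))  # 0
```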

That limitation motivated multi-layer networks — hidden neurons build new features, then the output combines them.
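A minimal sketch of that idea: two hidden neurons compute OR and AND of the inputs, and the output neuron fires when OR is on but AND is off. The weights are hand-picked for illustration rather than learned, and `xor_network` is a name made up for this example.

```python
def step(z):
    """Step activation: 1 when z > 0, otherwise 0."""
    return 1 if z > 0 else 0


def xor_network(x1, x2):
    """Two hidden neurons plus one output neuron, all with step activations."""
    # Hidden layer: two new features built from the raw inputs.
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)   # 1 if at least one input is 1
    h_and = step(1.0 * x1 + 1.0 * x2 - 1.5)  # 1 only if both inputs are 1
    # Output layer: "OR but not AND", a straight line in (h_or, h_and) space.
    return step(1.0 * h_or - 1.0 * h_and - 0.5)


for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_network(x1, x2))
# 0 0 0
# 0 1 1
# 1 0 1
# 1 1 0
```

The output neuron is still a single straight-line separator; it succeeds because the hidden layer has re-expressed the inputs as features in which XOR's 1s and 0s can be separated by a line.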

Can one perceptron classify XOR perfectly?

Answer sketch

No. XOR is not linearly separable, and a single perceptron can only draw a straight-line boundary. You need at least one hidden layer (or some other way to get a curved boundary).
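One way to see this is a short contradiction argument. It assumes a two-input neuron with weights $w_1, w_2$, bias $b$, and the step activation (output 1 when $z > 0$, else 0). Matching XOR on all four inputs would require:

$$
\begin{aligned}
(0,0) \mapsto 0 &: \quad b \le 0 \\
(1,0) \mapsto 1 &: \quad w_1 + b > 0 \\
(0,1) \mapsto 1 &: \quad w_2 + b > 0 \\
(1,1) \mapsto 0 &: \quad w_1 + w_2 + b \le 0
\end{aligned}
$$

Adding the two middle conditions gives $w_1 + w_2 + 2b > 0$, so $w_1 + w_2 + b > -b \ge 0$. That contradicts the last condition, so no choice of $w_1, w_2, b$ satisfies all four.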