Module 3 — Neural networks basics

Module 3 quiz & review

25 interactive questions on activations, forward/backward passes, vanishing gradients, and classification loss.

~55 min read + exercises

Before we begin

Test your understanding of activations, forward/backward passes, and loss functions before the MNIST project. Aim for at least 19 out of 25.


Multiple choice quiz

Pick one answer per question. Feedback appears immediately — take your time before clicking.

  1. What does a perceptron compute before applying an activation?
  2. Why can a single perceptron not learn XOR?
  3. Why do we need activation functions in hidden layers?
  4. What is a key advantage of ReLU over sigmoid in hidden layers?
  5. Which activation is most common for binary output probability?
  6. Forward propagation means:
  7. What does backpropagation actually compute?
  8. What is the vanishing gradient problem?
  9. For multi-class digit classification (0–9), which loss is standard?
  10. Loss is high but not decreasing. What is a sensible first check?
  11. A network has 784 inputs, 128 hidden units, and 10 outputs. Approximately how many weight connections run from the input layer to the hidden layer?
  12. Saturated sigmoid units (outputs near 0 or 1) contribute to vanishing gradients because:
  13. After the forward pass and loss computation, weights are updated using:
  14. Stacking multiple neurons in a hidden layer fixes XOR because:
  15. Why use softmax + cross-entropy together on the output layer for MNIST?
  16. A perceptron without an activation on the output is limited because:
  17. Tanh activation outputs are in range:
  18. In a 3-layer network (input → hidden → output), forward propagation computes:
  19. Backpropagation is efficient because it:
  20. Why is ReLU often preferred over sigmoid in hidden layers?
  21. MSE loss is commonly used for:
  22. If a network has no activation between linear layers, stacking many layers is equivalent to:
  23. A learning rate that is too high during neural net training often causes:
  24. The perceptron computes weighted sum + bias, then applies:
  25. For binary classification (spam vs ham), a common output activation + loss pair is:

After the quiz

19/25 or higher? Start the MNIST project.

Checklist:

  • I can explain why hidden layers need activations.
  • I can describe the roles of the forward and backward passes.
  • I know what the vanishing gradient problem is.
  • I can name the standard MNIST loss setup.
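
If the last two checklist items still feel shaky, a small sketch may help. The block below is an illustrative plain-Python example, not part of the course materials: the function names (`softmax`, `cross_entropy`, `sigmoid_grad`) and the sample logits are assumptions made up for this review. It shows the softmax + cross-entropy pairing named in question 15 on a 10-class output, and how multiplying sigmoid derivatives across layers shrinks a gradient. The weight terms of the chain rule are left out to keep the idea visible.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, true_class):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[true_class])

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; it peaks at 0.25 when z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Hypothetical logits for one digit image (10 classes, digits 0-9).
logits = [0.1, 0.3, 2.5, 0.0, -1.0, 0.2, 0.1, 0.4, 0.0, -0.5]
probs = softmax(logits)
loss = cross_entropy(probs, true_class=2)

# Chain rule through 10 sigmoid layers (weights ignored for simplicity):
# even in the best case (z = 0) the product of local derivatives is
# 0.25 ** 10, roughly 1e-6. Saturated units shrink it much further.
best_case = sigmoid_grad(0.0) ** 10
saturated = sigmoid_grad(6.0) ** 10

print(f"predicted digit: {probs.index(max(probs))}, loss: {loss:.3f}")
print(f"gradient factor after 10 layers: {best_case:.2e} (z=0), {saturated:.2e} (z=6)")
```

Try changing `true_class` to a digit with a low probability and watch the loss grow: cross-entropy penalizes confident wrong predictions heavily, which is one reason it suits classification.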

What's next

Project: MNIST digit classifier + draw UI