Welcome to Module 5
Before we begin
Welcome to Module 5 of Deep Reinforcement Learning. This module is standalone within the track. Read the lessons in order unless you already know the prerequisites listed below.
Figure: Module 5 at a glance
What this module covers
| Lesson | Focus |
|---|---|
| Why learn policies directly | Core concepts + checkpoints |
| REINFORCE & the policy gradient theorem | Core concepts + checkpoints |
| Baseline & variance reduction | Core concepts + checkpoints |
| Actor–critic architecture | Core concepts + checkpoints |
| Quiz | Multiple-choice review with lesson links |
| Project | Portfolio-ready code you can extend |
Prerequisites
- Comfort with basic Python and NumPy.
- For Module 1: no prior RL required. Later modules assume earlier modules in this track (or equivalent background).
- Optional: the AI course Modules 1–3 help with gradients and neural networks before Module 4.
Figure: The RL loop
What to install before the project
- Python 3.10+
- `pip install gymnasium numpy matplotlib`
- From Module 4 onward: `pip install torch`
- From Module 6 onward: `pip install stable-baselines3` (optional but recommended for PPO/SAC labs)
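Before starting the project, it can help to confirm that the install list above actually took effect. The following is a minimal sketch, not part of the course materials: the helper names `missing_packages` and `python_ok` are hypothetical, and the script assumes the usual import names for each package (for example, `stable_baselines3` for the `stable-baselines3` pip package).

```python
import importlib.util
import sys

# Import names (assumed) for the packages in the install list.
REQUIRED = ["gymnasium", "numpy", "matplotlib"]
OPTIONAL = ["torch", "stable_baselines3"]  # Module 4+ and Module 6+


def missing_packages(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_ok(minimum=(3, 10)):
    """Return True if the running interpreter is at least `minimum`."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python 3.10+:", python_ok())
    print("Missing required packages:", missing_packages(REQUIRED))
    print("Missing optional packages:", missing_packages(OPTIONAL))
```

An empty list for the required packages suggests the environment is ready. The check only looks for each package on the import path; it does not import the packages or verify their versions.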
Ready?
Open the first technical lesson: Why learn policies directly