Lesson 2: Bellman Equation
Return: used to evaluate policies
The return is the sum of the rewards obtained along a trajectory.
How to calculate?
- Definition
- Bootstrapping
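
The two ways of calculating the return listed above can be compared on a small example. This is only a sketch under assumptions that are not in the notes: a hypothetical 4-state cycle with deterministic transitions, made-up rewards, and a discount factor `GAMMA` (the notes describe a plain sum; discounting is added here so the infinite sum converges). "By definition" adds up the rewards along the trajectory directly, while "bootstrapping" uses the fact that the return from one state equals the immediate reward plus the discounted return from the next state.

```python
# Hypothetical example: 4 states visited in a fixed cycle 0 -> 1 -> 2 -> 3 -> 0 -> ...
GAMMA = 0.9                      # assumed discount factor (not from the notes)
REWARDS = [0.0, 1.0, 1.0, 1.0]   # assumed reward received when leaving state i
NEXT_STATE = [1, 2, 3, 0]        # deterministic successor of each state


def return_by_definition(start, horizon=500):
    """Sum the discounted rewards along the trajectory, truncated at `horizon` steps."""
    total, discount, state = 0.0, 1.0, start
    for _ in range(horizon):
        total += discount * REWARDS[state]
        discount *= GAMMA
        state = NEXT_STATE[state]
    return total


def returns_by_bootstrapping(sweeps=500):
    """Repeatedly apply v[s] = r[s] + GAMMA * v[next(s)], starting from all zeros."""
    values = [0.0] * len(REWARDS)
    for _ in range(sweeps):
        values = [REWARDS[s] + GAMMA * values[NEXT_STATE[s]]
                  for s in range(len(REWARDS))]
    return values


by_definition = [return_by_definition(s) for s in range(len(REWARDS))]
by_bootstrapping = returns_by_bootstrapping()

for s, (d, b) in enumerate(zip(by_definition, by_bootstrapping)):
    print(f"state {s}: definition={d:.4f}  bootstrapping={b:.4f}")
```

Both approaches should agree up to floating-point error. The bootstrapping form is the one that leads to the Bellman equation, since it expresses the returns of all states in terms of each other.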

Some simplified notes and thoughts from learning the mathematical foundations of RL