What is “temporal difference learning” in reinforcement learning?
a) A supervised learning method
b) A way to learn the difference between consecutive states
c) A combination of Monte Carlo methods and dynamic programming
d) A method to learn from a fixed dataset
Answer:
c) A combination of Monte Carlo methods and dynamic programming
Explanation:
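Temporal difference (TD) learning combines ideas from Monte Carlo methods and dynamic programming. Like Monte Carlo methods, it learns directly from sampled experience without needing a model of the environment. Like dynamic programming, it bootstraps: it updates a state's value estimate using the current estimates of successor states instead of waiting for the final outcome of an episode. Each update moves the current estimate toward a target formed from the observed reward plus the discounted value estimate of the next state. The gap between that target and the current estimate is called the TD error.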
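The update described above can be illustrated with a minimal sketch of the simplest variant, TD(0), for state-value prediction. The function name td0_update, the two-state example, and the values chosen for the step size alpha and discount factor gamma are assumptions made for illustration only; they do not come from the quiz.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9, terminal=False):
    """Apply one TD(0) update to `values` in place and return the TD error."""
    # Bootstrapped target: observed reward plus the discounted estimate of the
    # next state's value (taken as 0 if the next state is terminal).
    next_value = 0.0 if terminal else values[next_state]
    td_error = reward + gamma * next_value - values[state]
    # Move the current estimate a fraction alpha toward the target.
    values[state] += alpha * td_error
    return td_error


# Hypothetical two-state example: the transition A -> B yields reward 1.
values = {"A": 0.0, "B": 0.0}
first_error = td0_update(values, "A", 1.0, "B", alpha=0.5)
second_error = td0_update(values, "A", 1.0, "B", alpha=0.5)
```

The estimate for A is revised after every single transition, which is the Monte Carlo side (learning from samples) combined with the dynamic programming side (using the estimate for B in place of the full return).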
Reference:
Reinforcement Learning (RL) Quiz – MCQ Questions and Answers