What is a Markov decision process (MDP) in reinforcement learning?
a) A process used to make supervised learning predictions
b) A mathematical framework to model decision-making with rewards and states
c) An algorithm for training deep neural networks
d) A system for classifying data
Answer:
b) A mathematical framework to model decision-making with rewards and states
Explanation:
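An MDP is a mathematical model that reinforcement learning uses to describe the environment: a set of states, a set of actions, a reward function, and transition probabilities that give the likelihood of each next state given the current state and action. The "Markov" part means that the next state and reward depend only on the current state and action, not on the earlier history.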
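To make the four ingredients concrete, here is a minimal sketch of a tiny MDP written as plain Python data, solved with value iteration. Everything in it is an illustrative assumption rather than part of the quiz: the state names ("cool", "hot"), the action names ("slow", "fast"), the probabilities, the rewards, and the discount factor GAMMA.

```python
# Hypothetical two-state MDP; all names and numbers are made up for illustration.
STATES = ["cool", "hot"]
ACTIONS = ["slow", "fast"]
GAMMA = 0.9  # assumed discount factor

# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = {
    "cool": {
        "slow": [(1.0, "cool", 1.0)],
        "fast": [(0.5, "cool", 2.0), (0.5, "hot", 2.0)],
    },
    "hot": {
        "slow": [(0.5, "cool", 1.0), (0.5, "hot", 1.0)],
        "fast": [(1.0, "hot", -10.0)],
    },
}


def q_value(state, action, values):
    """Expected return of taking `action` in `state`, then following `values`."""
    return sum(
        prob * (reward + GAMMA * values[next_state])
        for prob, next_state, reward in TRANSITIONS[state][action]
    )


def value_iteration(tol=1e-8):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new_values = {
            s: max(q_value(s, a, values) for a in ACTIONS) for s in STATES
        }
        if max(abs(new_values[s] - values[s]) for s in STATES) < tol:
            return new_values
        values = new_values


def greedy_policy(values):
    """Pick, in each state, the action with the highest expected return."""
    return {s: max(ACTIONS, key=lambda a: q_value(s, a, values)) for s in STATES}


optimal_values = value_iteration()
policy = greedy_policy(optimal_values)
```

With these assumed numbers, the resulting policy chooses "fast" in the "cool" state and "slow" in the "hot" state. Changing the rewards or probabilities changes the policy, which is the point of the framework: the environment is fully specified by states, actions, rewards, and transitions.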
Reference:
Reinforcement Learning (RL) Quiz – MCQ Questions and Answers