What is the exploration-exploitation tradeoff in reinforcement learning?
a) A balance between trying new actions and using known actions to maximize rewards
b) A way to decrease the loss function
c) A method to avoid overfitting
d) A function used to update the policy
Answer:
a) A balance between trying new actions and using known actions to maximize rewards
Explanation:
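The exploration-exploitation tradeoff is the tension between exploring (trying actions whose value is still uncertain, to discover potentially better rewards) and exploiting (choosing the action currently estimated to be best). An agent that explores too little can lock onto a suboptimal action, while one that explores too much gives up reward it already knows how to obtain.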
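One common way to manage this balance is an epsilon-greedy rule: with a small probability epsilon the agent picks a random action (exploration), and otherwise it picks the action with the highest estimated value (exploitation). The sketch below is only an illustration on a simple multi-armed bandit, not something taken from the quiz source. The names epsilon_greedy, run_bandit, and true_means are hypothetical, and the Gaussian reward noise is an assumption made for the example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest current estimate.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def run_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Simulate a bandit whose arms pay Gaussian rewards (assumed, unit variance)."""
    rng = random.Random(seed)
    q_values = [0.0] * len(true_means)  # estimated value of each action
    counts = [0] * len(true_means)      # how often each action was tried
    total_reward = 0.0
    for _ in range(steps):
        action = epsilon_greedy(q_values, epsilon, rng)
        reward = rng.gauss(true_means[action], 1.0)
        counts[action] += 1
        # Incremental sample-average update of the chosen action's estimate.
        q_values[action] += (reward - q_values[action]) / counts[action]
        total_reward += reward
    return q_values, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = run_bandit([0.2, 0.5, 0.8], epsilon=0.1)
    print("estimated values:", estimates)
    print("times each action was chosen:", counts)
    print("total reward:", total)
```

Setting epsilon to 0 makes the agent purely greedy, and setting it to 1 makes it purely random, so the parameter directly controls where the agent sits between the two extremes.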
Reference:
Reinforcement Learning (RL) Quiz – MCQ Questions and Answers