What is the main purpose of cross-validation in machine learning?
a) To evaluate the performance of a model on unseen data
b) To reduce the size of the dataset
c) To generate more data for training
d) To choose the features for the model
Answer:
a) To evaluate the performance of a model on unseen data
Explanation:
Cross-validation is a technique used in machine learning to evaluate how well a model will perform on unseen data. It involves splitting the dataset into multiple subsets, training the model on some subsets, and testing it on the remaining subsets.
The most common form is k-fold cross-validation, where the data is divided into k roughly equal-sized subsets (folds). The model is trained and evaluated k times, each time using a different fold as the test set and the remaining k-1 folds as the training set, so every observation is used for testing exactly once.
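Averaging the scores from all the rounds gives a more reliable estimate of the model’s performance than a single train-test split, because the result no longer depends on one particular partition of the data. Cross-validation does not prevent overfitting by itself, but it helps detect it: a model that scores well on the data it was trained on and poorly on the held-out folds is overfitting.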
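
Example:
The k-fold procedure described in the explanation can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names (k_fold_indices, cross_validate, fit_mean, mean_abs_error), the fixed shuffle seed, and the toy data are all assumptions made for this example, and the "model" is just the mean of the training targets.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Shuffle once so folds are not tied to the original ordering (assumed seed).
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first n_samples % k folds get one extra.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Train and evaluate k times, returning one score per held-out fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        scores.append(score(model, [xs[i] for i in test_idx], [ys[i] for i in test_idx]))
    return scores


# Hypothetical toy model: always predict the mean of the training targets.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def mean_abs_error(model, xs, ys):
    return sum(abs(y - model) for y in ys) / len(ys)


# Made-up data for illustration only.
xs = list(range(10))
ys = [2.0 * x for x in xs]

fold_scores = cross_validate(xs, ys, k=5, fit=fit_mean, score=mean_abs_error)
average_score = sum(fold_scores) / len(fold_scores)
print(fold_scores)
print(average_score)
```

Each of the 5 scores comes from data the model did not see during that round of training, and their average is the cross-validated estimate. In practice a library routine would normally replace this hand-written loop.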