What is “cross-validation” used for in machine learning?
a) To evaluate the performance of a model on different subsets of data
b) To cluster data points
c) To reduce the dimensionality of the dataset
d) To generate new features from existing data
Answer:
a) To evaluate the performance of a model on different subsets of data
Explanation:
Cross-validation is a technique for evaluating the performance of a machine learning model by splitting the dataset into several subsets, or “folds.” The model is trained on all but one of these folds and tested on the held-out fold, and this process is repeated so that each fold serves as the test set once.
The most common form is k-fold cross-validation, where the data is split into k equally sized folds. Each fold is used as the validation set exactly once, and the final performance score is the average over all k iterations.
Cross-validation gives a more reliable estimate of the model’s performance on unseen data than a single train/test split, which helps detect overfitting. It also supports hyperparameter tuning by showing how well each candidate configuration generalizes across folds.
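The k-fold procedure described above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the fold split is contiguous (real use typically shuffles the data first), and the 1-nearest-neighbour “model” is a hypothetical stand-in chosen only so the example is self-contained.

```python
def kfold_indices(n, k):
    """Split indices 0..n-1 into k roughly equal, contiguous folds.
    (In practice you would usually shuffle the indices first.)"""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def nearest_neighbour_predict(train_X, train_y, x):
    """Illustrative model: predict the label of the closest training point."""
    best = min(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
    return train_y[best]

def cross_validate(X, y, k=5):
    """Return the mean accuracy over k folds: each fold is held out
    once as the test set while the model uses the remaining folds."""
    folds = kfold_indices(len(X), k)
    scores = []
    for fold in folds:
        test_set = set(fold)
        train_X = [X[i] for i in range(len(X)) if i not in test_set]
        train_y = [y[i] for i in range(len(X)) if i not in test_set]
        correct = sum(
            nearest_neighbour_predict(train_X, train_y, X[i]) == y[i]
            for i in fold
        )
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

# Toy dataset: two well-separated classes.
X = [0, 1, 2, 10, 11, 12]
y = [0, 0, 0, 1, 1, 1]
print(cross_validate(X, y, k=3))
```

In a real project you would typically use a library routine such as scikit-learn’s `cross_val_score` rather than writing the loop by hand, but the structure is the same: split, hold out, train, score, average.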