What is “dimensionality reduction” in machine learning?
a) A process of reducing the number of features in the dataset
b) A technique to increase the size of the dataset
c) A method to balance the training and test sets
d) A process to generate new features from existing ones
Answer:
a) A process of reducing the number of features in the dataset
Explanation:
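Dimensionality reduction is a technique used in machine learning to reduce the number of features (dimensions) in a dataset. It works either by dropping irrelevant or redundant features (feature selection) or by projecting the data onto a smaller set of derived features (feature extraction). With fewer features, a model is typically simpler, less prone to overfitting, and cheaper to train. Option d is not the best answer because generating new features is not the goal in itself; the defining goal is ending up with fewer dimensions.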
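Common techniques for dimensionality reduction include Principal Component Analysis (PCA) and t-Distributed Stochastic Neighbor Embedding (t-SNE). PCA keeps the directions along which the data varies most, while t-SNE preserves local neighborhood structure and is used mainly to visualize data in two or three dimensions. Both aim to retain the most important information in the data while reducing complexity.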
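To make the PCA idea concrete, here is a minimal sketch using NumPy's singular value decomposition. The function name pca_reduce and the toy data are assumptions made for illustration, not part of the original question; in practice a library implementation such as scikit-learn's PCA would normally be used.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    X = np.asarray(X, dtype=float)
    # Center each feature so the components describe variance around the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Fraction of the total variance captured by each component.
    explained = (S ** 2) / np.sum(S ** 2)
    return X_centered @ components.T, explained[:n_components]

# Hypothetical toy data: 3 features, where the third is almost a copy of the first.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, a + 0.01 * rng.normal(size=200)])

X_reduced, explained = pca_reduce(X, n_components=2)
print(X_reduced.shape)   # (200, 2)
print(explained.sum())   # close to 1, because one feature was redundant
```

Because the third feature carries almost no information beyond the first, two components capture nearly all of the variance, which is the sense in which the reduced data keeps "the most important information."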
Dimensionality reduction is especially useful for high-dimensional datasets, where too many features can lead to the “curse of dimensionality”: data points become sparse and the distances between them less informative, making it difficult for models to learn effectively.
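One symptom of the curse of dimensionality can be seen in a small experiment: as the number of dimensions grows, the nearest and farthest points from a query end up at nearly the same distance. The sketch below assumes uniformly random points and uses a hypothetical helper named relative_contrast; it is an illustration, not a formal measure.

```python
import numpy as np

def relative_contrast(n_points, n_dims, seed=0):
    """Return (farthest - nearest) / nearest distance from a random query point."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, n_dims))   # uniform samples in the unit hypercube
    query = rng.random(n_dims)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()

# The contrast shrinks as dimensionality grows, so "near" and "far" blur together.
for d in (2, 10, 100, 1000):
    print(d, relative_contrast(500, d))
```

When the contrast is small, distance-based methods such as nearest-neighbor search have little to work with, which is one reason reducing the number of dimensions first can help.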