## 1. What is the main goal of data analysis?

### Answer:

### Explanation:

The main goal of data analysis is to transform raw data into meaningful insights, helping in decision making and understanding patterns and trends.

## 2. Which method is commonly used for finding the average value in a dataset?

### Answer:

### Explanation:

The mean, or average, is calculated by adding up all the numbers and then dividing by the count of numbers, providing a central value for the dataset.
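
As a minimal sketch of the calculation described above, the following Python snippet sums the values and divides by their count. The `scores` list and the `arithmetic_mean` helper are hypothetical names chosen for illustration; the standard library's `statistics.mean` is shown for comparison.

```python
from statistics import mean

def arithmetic_mean(values):
    """Add up all the values and divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty dataset is undefined")
    return sum(values) / len(values)

scores = [4, 8, 15, 16, 23, 42]   # hypothetical sample data
manual = arithmetic_mean(scores)  # 108 / 6
builtin = mean(scores)            # same value via the standard library
print(manual)                     # 18.0
```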

## 3. What is a pivot table used for in data analysis?

### Answer:

### Explanation:

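Pivot tables summarize data stored in a spreadsheet or database table by grouping rows on one or more fields and aggregating the values in each group, for example by counting, summing, or averaging. They also make it easy to sort and reorganize the resulting summary.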
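
To make the grouping-and-aggregating idea concrete, here is a small sketch that builds a pivot-style summary with only the Python standard library. The `sales` records and the `pivot_sum` helper are invented for illustration; spreadsheet tools and data libraries provide their own pivot features.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, amount)
sales = [
    ("North", "Widget", 120),
    ("North", "Gadget", 80),
    ("South", "Widget", 200),
    ("North", "Widget", 30),
    ("South", "Gadget", 50),
]

def pivot_sum(rows):
    """Group rows by region, then by product, and sum the amounts."""
    table = defaultdict(lambda: defaultdict(int))
    for region, product, amount in rows:
        table[region][product] += amount
    # Convert nested defaultdicts to plain dicts for easier display
    return {region: dict(products) for region, products in table.items()}

pivot = pivot_sum(sales)
for region, products in pivot.items():
    print(region, products)
```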

## 4. What is 'data cleansing'?

### Answer:

### Explanation:

Data cleansing involves detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database, ensuring data quality and reliability.
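
A rough sketch of what detecting and removing bad records might look like in Python follows. The record layout (`name`, `age`), the valid age range, and the `clean_records` helper are all assumptions made for this example; real cleansing rules depend on the dataset.

```python
def clean_records(records):
    """Trim names, drop invalid rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip()
        age = rec.get("age")
        # Discard rows with a missing name or an implausible age
        if not name or not isinstance(age, int) or not 0 <= age <= 120:
            continue
        key = (name.lower(), age)
        if key in seen:  # skip exact duplicates (case-insensitive name)
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned

raw = [
    {"name": " Ana ", "age": 34},
    {"name": "Ana", "age": 34},    # duplicate after trimming
    {"name": "Bo", "age": -5},     # impossible age
    {"name": "", "age": 20},       # missing name
    {"name": "Cy", "age": None},   # missing age
    {"name": "Dee", "age": 51},
]
cleaned = clean_records(raw)
print(cleaned)
```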

## 5. What does a scatter plot show?

### Answer:

### Explanation:

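A scatter plot places data points on a horizontal and a vertical axis, one variable per axis, to show the relationship between two continuous variables: its direction, form, and strength. A visible pattern suggests an association, but it does not by itself show that one variable affects the other.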

## 6. In data analysis, what is 'regression analysis' used for?

### Answer:

### Explanation:

Regression analysis is a set of statistical processes for estimating the relationships among variables, often used to predict the value of a dependent variable based on independent variables.
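
The simplest case, one independent variable and a straight-line fit, can be sketched with ordinary least squares. The `fit_line` helper and the `hours`/`scores` data are hypothetical; the data were chosen to lie exactly on a line so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

hours = [1, 2, 3, 4, 5]          # independent variable (hypothetical)
scores = [52, 54, 56, 58, 60]    # dependent variable, exactly 2 * hours + 50
slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # predict the score for 6 hours
print(slope, intercept, predicted)
```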

## 7. What is 'time series analysis'?

### Answer:

### Explanation:

Time series analysis involves analyzing data points collected or sequenced at specific time intervals to identify trends, cycles, or seasonal variations.
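
One common way to expose a trend in such data is a simple moving average, sketched below. This is only one of many time series techniques; the `monthly` values and the window size of 3 are arbitrary choices for illustration.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive observations."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

monthly = [10, 12, 14, 13, 15, 17]  # hypothetical monthly readings
trend = moving_average(monthly, 3)
print(trend)  # [12.0, 13.0, 14.0, 15.0]
```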

## 8. What is a 'hypothesis test' in data analysis?

### Answer:

### Explanation:

Hypothesis testing is a statistical method for making decisions from sample or experimental data. It starts from an assumption about a population parameter (the null hypothesis) and assesses whether the observed data are consistent with that assumption.
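
As one illustration, the sketch below computes a one-sample t statistic for the null hypothesis that a population mean equals 5.0. The `sample` values are invented, and the critical value 2.776 is the standard two-sided 5% cutoff for 4 degrees of freedom taken from t tables; other tests and significance levels are equally valid choices.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t statistic for H0: the population mean equals mu0."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / standard_error

sample = [5.1, 4.9, 5.3, 5.2, 5.0]   # hypothetical measurements
t_stat = one_sample_t(sample, mu0=5.0)

critical = 2.776                      # two-sided 5% level, 4 degrees of freedom
reject_null = abs(t_stat) > critical
print(round(t_stat, 3), reject_null)
```

Here the statistic stays below the critical value, so the data give no strong evidence against the assumed mean.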

## 9. What is 'data normalization'?

### Answer:

### Explanation:

In database design, data normalization organizes the attributes of a data model to increase the cohesion of entity types. It reduces redundancy and dependency by dividing a table into two or more tables and defining relationships between them. In statistics and machine learning, the same term usually means rescaling values to a common range.
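
The table-splitting idea can be sketched with plain Python dictionaries. The flat `flat_orders` table, its column names, and the `split_tables` helper are hypothetical; a real database would express the relationship with keys and constraints.

```python
# Hypothetical flat table: customer details are repeated on every order
flat_orders = [
    {"order_id": 1, "customer": "Ana", "city": "Lima", "item": "pen"},
    {"order_id": 2, "customer": "Ana", "city": "Lima", "item": "ink"},
    {"order_id": 3, "customer": "Bo", "city": "Oslo", "item": "pad"},
]

def split_tables(rows):
    """Split one flat table into a customer table and an order table."""
    customer_ids = {}       # customer name -> generated customer_id
    customer_table = []
    order_table = []
    for row in rows:
        name = row["customer"]
        if name not in customer_ids:
            customer_ids[name] = len(customer_ids) + 1
            customer_table.append(
                {"customer_id": customer_ids[name], "name": name, "city": row["city"]}
            )
        # Orders now reference the customer by id instead of repeating details
        order_table.append(
            {"order_id": row["order_id"], "customer_id": customer_ids[name], "item": row["item"]}
        )
    return customer_table, order_table

customer_table, order_table = split_tables(flat_orders)
print(customer_table)
print(order_table)
```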

## 10. What does 'data mining' refer to in data analysis?

### Answer:

### Explanation:

Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems.

## 11. What is a 'correlation coefficient'?

### Answer:

### Explanation:

The correlation coefficient is a statistical measure of the strength and direction of the linear relationship between two variables. Its values range from -1.0 to 1.0: values near -1.0 or 1.0 indicate a strong relationship, and values near 0 indicate a weak or no linear relationship.
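
A minimal sketch of the Pearson correlation coefficient, the most common variant, is shown below. The `pearson_r` helper and the sample lists are hypothetical and were picked so that the two extreme values, 1.0 and -1.0, appear.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

x = [1, 2, 3, 4]
rising = pearson_r(x, [2, 4, 6, 8])    # perfectly increasing together
falling = pearson_r(x, [8, 6, 4, 2])   # perfectly opposite movement
print(rising, falling)
```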

## 12. What is 'qualitative data'?

### Answer:

### Explanation:

Qualitative data describes qualities or characteristics rather than measured quantities. It can be observed and recorded, and it is non-numerical in nature.

## 13. What is 'quantitative data'?

### Answer:

### Explanation:

Quantitative data is data expressing a certain quantity, amount, or range. Usually, there are measurement units associated with the data, such as meters, in the case of the height of a person.

## 14. In data analysis, what is 'clustering'?

### Answer:

### Explanation:

Clustering is the task of dividing a population or set of data points into groups so that points in the same group are more similar to each other than to points in other groups.
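
One well-known clustering method is k-means. The sketch below applies it to one-dimensional data with starting centroids supplied by the caller; the `kmeans_1d` helper, the data, and the starting values are assumptions for illustration, and many other clustering algorithms exist.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Basic k-means on 1-D data, starting from the given centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            sum(c) / len(c) if c else centroids[i]  # keep old centroid if empty
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1, 2, 3, 10, 11, 12]   # hypothetical data with two obvious groups
centroids, clusters = kmeans_1d(points, centroids=[0, 5])
print(centroids, clusters)
```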

## 15. What is a 'bar chart' used for in data analysis?

### Answer:

### Explanation:

A bar chart or bar graph is a chart or graph that presents categorical data with rectangular bars with heights or lengths proportional to the values that they represent. The bars can be plotted vertically or horizontally.

## 16. What is 'big data'?

### Answer:

### Explanation:

Big data refers to data that is so large, fast, or complex that it's difficult or impossible to process using traditional methods.

## 17. What is 'descriptive statistics'?

### Answer:

### Explanation:

Descriptive statistics is the discipline of quantitatively describing the main features of a collection of data. It is distinguished from inferential statistics (or inductive statistics) by its aim: to summarize a sample rather than to use the data to learn about the population the sample represents.
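
A few typical summary measures can be computed with Python's `statistics` module, as sketched below. The `sample` values are made up, and `pstdev` (the population standard deviation) is used here because the goal is to describe the data at hand rather than estimate anything beyond it.

```python
from statistics import mean, median, pstdev

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical observations

summary = {
    "count": len(sample),
    "min": min(sample),
    "max": max(sample),
    "mean": mean(sample),
    "median": median(sample),
    "std_dev": pstdev(sample),       # spread of these values around their mean
}
print(summary)
```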

## 18. What is the purpose of a 'line chart' in data analysis?

### Answer:

### Explanation:

A line chart or line plot is a type of chart which displays information as a series of data points called 'markers' connected by straight line segments. It is a basic type of chart common in many fields and is used to visualize trends in data over intervals of time.

## 19. What is 'data visualization'?

### Answer:

### Explanation:

Data visualization is the graphic representation of data. It involves producing images that communicate relationships among the represented data to viewers of the images. This communication is achieved through the use of a systematic mapping between graphic marks and data values in the creation of the visualization.

## 20. In data analysis, what is 'dimensionality reduction'?

### Answer:

### Explanation:

Dimensionality reduction is the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It is often used in data analysis to simplify models and reduce the risk of overfitting.

## 21. What is 'ANOVA' (Analysis of Variance) used for in data analysis?

### Answer:

### Explanation:

ANOVA is a statistical method used to test for differences among group means, typically three or more (with two groups it is equivalent to a t-test). It compares the variation between groups with the variation within groups to test the hypothesis that there is no significant difference between the means of the treatments or groups.
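
The core quantity in a one-way ANOVA is the F statistic, the ratio of between-group to within-group variance. The sketch below computes it by hand; the `anova_f` helper and the three small groups are hypothetical, and turning F into a p-value would additionally require the F distribution, which is not shown.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # hypothetical treatment groups
f_stat = anova_f(groups)
print(round(f_stat, 3))  # 13.0
```

A large F value means the group means differ by more than the spread within groups would suggest.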

## 22. What does 'outlier detection' involve in data analysis?

### Answer:

### Explanation:

Outlier detection involves identifying data points that differ significantly from the majority of the data, which could indicate variability in measurement or experimental errors.
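
One widely used rule of thumb flags points that lie more than 1.5 interquartile ranges beyond the quartiles. The sketch below applies it with the standard library; the `readings` data, the `iqr_outliers` helper, and the 1.5 multiplier are assumptions for illustration, and quartile definitions vary slightly between tools.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(data, n=4)   # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]

readings = [10, 12, 11, 13, 12, 11, 100, 12]   # hypothetical sensor readings
outliers = iqr_outliers(readings)
print(outliers)  # [100]
```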

## 23. What is 'predictive analytics'?

### Answer:

### Explanation:

Predictive analytics is the process of using data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data.

## 24. What is a 'heatmap' used for in data analysis?

### Answer:

### Explanation:

A heatmap is a data visualization technique that shows the magnitude of a phenomenon as color in two dimensions. The variation in color may be by hue or intensity, giving clear visual cues to the reader about how the phenomenon is clustered or varies over space.

## 25. What is 'principal component analysis' (PCA)?

### Answer:

### Explanation:

Principal Component Analysis (PCA) is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
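
For two variables, the transformation can be written in closed form, which makes a compact sketch possible without a linear algebra library. The `pca_2d` helper and the sample `points` are hypothetical, and this approach only works for the two-variable case; general PCA relies on a numerical eigen-decomposition or SVD.

```python
from math import atan2, cos, sin, sqrt

def pca_2d(points):
    """Return (eigenvalues, first_axis, scores) for 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Sample covariance matrix [[a, b], [b, c]]
    a = sum(dx * dx for dx, _ in centered) / (n - 1)
    c = sum(dy * dy for _, dy in centered) / (n - 1)
    b = sum(dx * dy for dx, dy in centered) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix (variance along each component)
    half_trace = (a + c) / 2
    spread = sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = (half_trace + spread, half_trace - spread)
    # Direction of the first principal component
    theta = 0.5 * atan2(2 * b, a - c)
    first_axis = (cos(theta), sin(theta))
    # Project the centered points onto the first component
    scores = [dx * first_axis[0] + dy * first_axis[1] for dx, dy in centered]
    return eigenvalues, first_axis, scores

points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]   # perfectly correlated variables
eigenvalues, first_axis, scores = pca_2d(points)
print(eigenvalues, first_axis, scores)
```

Because the two variables in this example move together exactly, all of the variance falls on the first component and the second eigenvalue is zero.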