About searching for missing values and outliers
Hello!
Today we will learn about searching for missing values and outliers.
Missing values and outliers are two of the most common data-quality problems in analysis. Left undetected, they can distort summary statistics and downstream results, so finding and handling them is an important step toward accurate, reliable conclusions.
The sections below outline ways to search for each.
Search for Missing Values
Search through visualization
Missing values can be spotted visually, mainly with heat maps that mark which cells are empty, charts that show how many values are missing per variable, and plots that show which variables tend to be missing together.
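As a minimal sketch of this idea, the following assumes pandas and matplotlib are available and uses a small made-up table (the column names and values are hypothetical). It draws a heat map of missing cells next to a bar chart of missing counts per column.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical example data; the column names and values are assumptions.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47, np.nan],
    "income": [3200, 4100, np.nan, 5800, 2900],
    "city": ["Seoul", "Busan", None, "Seoul", "Daegu"],
})

# True where a value is missing, False otherwise.
missing_mask = df.isna()

fig, (ax_map, ax_bar) = plt.subplots(1, 2, figsize=(9, 3))

# Heat map: each dark cell marks a missing value.
ax_map.imshow(missing_mask.to_numpy(dtype=int), aspect="auto", cmap="gray_r")
ax_map.set_xticks(range(len(df.columns)))
ax_map.set_xticklabels(df.columns)
ax_map.set_ylabel("row index")
ax_map.set_title("Missing-value map")

# Bar chart: number of missing values per column.
missing_counts = missing_mask.sum()
ax_bar.bar(list(missing_counts.index), missing_counts.to_numpy())
ax_bar.set_title("Missing values per column")

fig.tight_layout()
fig.savefig("missing_values.png")
```

Rows where several cells are dark at once are a first hint that the missing values may not be independent of each other.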
Check basic statistics
Check the percentage of missing values for each variable. Where values are missing, also check basic statistics such as the mean, median, and standard deviation of the observed values to judge how much the missing values could affect the analysis.
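One way to put these checks side by side is sketched below with pandas. The data is made up for illustration, and the statistics are computed on the observed values only, because pandas skips NaN by default.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; the column names and values are assumptions.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47, np.nan],
    "income": [3200, 4100, np.nan, 5800, 2900],
})

# Percentage of missing values per column.
missing_pct = df.isna().mean() * 100

# Mean, median, and standard deviation of the observed (non-missing) values.
summary = df.agg(["mean", "median", "std"]).T
summary["missing_pct"] = missing_pct

print(summary)
```

A column with a high missing percentage and only a few observed values gives statistics that should be read with caution.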
Missing value pattern analysis
Check whether values are missing at random or follow a pattern, and determine how the missingness relates to other variables.
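A simple way to probe for such a pattern is to turn missingness into an indicator and compare it against another variable. The sketch below uses pandas and made-up data in which income tends to be missing for older respondents; the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical example data; income is missing mostly for higher ages.
df = pd.DataFrame({
    "age": [22, 25, 31, 38, 47, 52, 58, 63],
    "income": [2100, 2500, 3300, np.nan, 5200, np.nan, np.nan, np.nan],
})

# Indicator that is True where income is missing.
income_missing = df["income"].isna().rename("income_missing")

# Compare another variable between rows with and without the missing value.
age_by_missing = df.groupby(income_missing)["age"].mean()
print(age_by_missing)

# Correlation between the missingness indicator and the other variable.
corr = income_missing.astype(int).corr(df["age"])
print(f"correlation between missing income and age: {corr:.2f}")
```

A clear difference between the two groups, or a correlation far from zero, suggests the values are not missing purely at random. It does not show why they are missing.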
Search for Outliers
Search through visualization
Visually check for outliers with box plots, scatter plots, and histograms, which reveal points that lie far from the bulk of the distribution.
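The three plot types can be sketched as follows, assuming NumPy and matplotlib are available. The data is synthetic: 200 ordinary points plus two deliberately extreme values added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data (an assumption for illustration): y depends on x,
# and two extreme y values are appended at the end.
rng = np.random.default_rng(0)
x = rng.normal(50, 5, 200)
y = 2 * x + rng.normal(0, 3, 200)
x = np.append(x, [52.0, 48.0])
y = np.append(y, [200.0, 10.0])

fig, (ax_box, ax_hist, ax_scatter) = plt.subplots(1, 3, figsize=(12, 3))

# Box plot: points beyond the whiskers are drawn individually.
box = ax_box.boxplot(y)
ax_box.set_title("Box plot of y")

# Histogram: isolated bars far from the main mass hint at outliers.
ax_hist.hist(y, bins=30)
ax_hist.set_title("Histogram of y")

# Scatter plot: points far from the overall x-y trend stand out.
ax_scatter.scatter(x, y, s=10)
ax_scatter.set_xlabel("x")
ax_scatter.set_ylabel("y")
ax_scatter.set_title("Scatter plot of x vs. y")

fig.tight_layout()
fig.savefig("outliers.png")
```

By default matplotlib draws the whiskers at 1.5 times the interquartile range, so the individually drawn points are candidates to inspect, not confirmed errors.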
Check basic statistics
Check the basic statistics of each variable, especially the mean, standard deviation, minimum, and maximum, to see whether outliers are likely. A minimum or maximum far from the mean, or a large gap between the mean and the median, is a warning sign.
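The sketch below shows one way to run these checks with pandas on a made-up series. It also applies the 1.5 × IQR rule, a common heuristic that is added here as an assumption and is not prescribed by this article.

```python
import pandas as pd

# Hypothetical measurements with one suspicious value.
values = pd.Series([12, 14, 13, 15, 14, 13, 16, 15, 14, 98], name="reading")

# count, mean, std, min, quartiles, and max in one call.
stats = values.describe()
print(stats)

# How many standard deviations the extremes lie from the mean.
z_min = (stats["min"] - stats["mean"]) / stats["std"]
z_max = (stats["max"] - stats["mean"]) / stats["std"]
print(f"z-score of min: {z_min:.2f}, z-score of max: {z_max:.2f}")

# A gap between mean and median suggests extreme values pulling the mean.
mean_median_gap = stats["mean"] - values.median()

# IQR rule (a common heuristic): flag values beyond 1.5 * IQR from the quartiles.
q1, q3 = values.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = values[(values < q1 - 1.5 * iqr) | (values > q3 + 1.5 * iqr)]
print(outliers)
```

Because an extreme value inflates the mean and standard deviation it is measured against, quartile-based checks are often used alongside z-scores.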
Outlier pattern analysis
Check whether suspected outliers follow a specific pattern or behave unusually in relation to other variables. A value can look ordinary on its own yet be inconsistent with the rest of its record.
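One possible way to look for this kind of outlier is to model the relationship between two variables and inspect the residuals. The sketch below uses NumPy with synthetic height and weight data. The linear fit and the threshold of 3 standard deviations are illustrative assumptions, not a method taken from this article.

```python
import numpy as np

# Synthetic data (an assumption for illustration): weight depends on height.
rng = np.random.default_rng(1)
height = rng.normal(170, 8, 100)
weight = 0.9 * height - 90 + rng.normal(0, 3, 100)

# Append one record whose height and weight are each plausible on their own
# but do not fit together.
height = np.append(height, 185.0)
weight = np.append(weight, 52.0)

# Fit a straight line and measure how far each point lies from it.
slope, intercept = np.polyfit(height, weight, 1)
residuals = weight - (slope * height + intercept)

# Standardize the residuals and flag the unusually large ones.
z = (residuals - residuals.mean()) / residuals.std()
flagged = np.flatnonzero(np.abs(z) > 3)
print("indices of flagged records:", flagged)
```

The appended record is flagged because its weight is far from what the fitted line predicts for its height, even though neither value is extreme by itself.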
Conclusion
Missing values and outliers can compromise the accuracy and reliability of data, so detecting and handling them is important for improving data quality.
Understanding the characteristics of your data, and choosing a handling method that fits them, makes the results of your analysis more trustworthy.
Thank you!