Exploratory Data Analysis (EDA) helps analysts understand their data and lays the foundation for better models.
Visualization
- Univariate visualization
- Bivariate visualization
- Multivariate visualization
Missing Value Treatment – Replace missing numeric values with either the mean or the median; the median is the safer choice when a feature is skewed or contains outliers.
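As a minimal sketch of mean and median imputation, assuming pandas is available; the DataFrame and its column names (`age`, `income`) are hypothetical and exist only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps; column names are illustrative only.
df = pd.DataFrame({
    "age": [25.0, np.nan, 31.0, 40.0],
    "income": [30_000.0, 42_000.0, np.nan, 1_000_000.0],
})

# Mean imputation: reasonable for a roughly symmetric column.
df["age"] = df["age"].fillna(df["age"].mean())

# Median imputation: more robust when the column is skewed or has outliers.
df["income"] = df["income"].fillna(df["income"].median())

print(df)
```

Both statistics are computed from the observed values only, because pandas skips missing entries by default.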
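Outlier Detection – Use a boxplot to see how outliers are distributed, then apply the interquartile range (IQR) rule to set the boundaries beyond which a point counts as an outlier, conventionally Q1 - 1.5 × IQR and Q3 + 1.5 × IQR.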
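The IQR boundaries can be sketched as follows, assuming pandas. The helper `iqr_bounds`, its multiplier `k`, and the sample values are hypothetical; 1.5 is the common convention, not a fixed rule:

```python
import pandas as pd

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical sample with one obvious extreme value.
s = pd.Series([10, 12, 12, 13, 14, 15, 16, 18, 95])

lower, upper = iqr_bounds(s)

# Points outside the fences are flagged as outliers.
outliers = s[(s < lower) | (s > upper)]
print(lower, upper, outliers.tolist())
```

Whether flagged points should be removed, capped, or kept depends on the data; the rule only marks candidates for review.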
Transformation – Based on each feature's distribution, apply a suitable transformation, for example a log transform for a right-skewed feature.
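As one possible illustration, the sketch below applies a log transform to a right-skewed feature and compares skewness before and after. It assumes pandas and NumPy; the `income` values are made up, and the log is only one of several candidate transformations:

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed feature: most values are small, one is very large.
income = pd.Series([20_000, 25_000, 30_000, 35_000, 40_000, 500_000], dtype=float)

skew_before = income.skew()

# log1p computes log(1 + x), which also tolerates zeros in the data.
income_log = np.log1p(income)

skew_after = income_log.skew()

print(f"skew before: {skew_before:.2f}, after: {skew_after:.2f}")
```

The transform compresses the long right tail, so the skewness drops, although a single extreme point can still leave the feature noticeably skewed.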
Scaling the Dataset – Apply min-max scaling or standardization (z-score scaling, which is what a standard scaler performs) to bring the features onto a comparable scale.
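A minimal sketch of both scaling options, assuming scikit-learn is the library in use (the text names the scalers but not the library); the array `X` is hypothetical:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical data: two features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0],
              [4.0, 400.0]])

# Min-max scaling maps each column onto the range [0, 1].
X_minmax = MinMaxScaler().fit_transform(X)

# Standardization (z-score) gives each column mean 0 and unit standard deviation.
X_standard = StandardScaler().fit_transform(X)

print(X_minmax)
print(X_standard)
```

When a model is trained afterwards, the scaler is usually fitted on the training split only and then applied to the test split, so that no information from the test data leaks into training.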
Feature Engineering – Domain requirements and subject-matter expert (SME) knowledge help the analyst derive new fields that reveal more about the nature of the data.
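Derived fields depend entirely on the domain, so the following is only a hypothetical illustration, assuming pandas and an invented `orders` table; none of these column names come from a real dataset:

```python
import pandas as pd

# Hypothetical order data; every column name here is illustrative.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2023-01-06", "2023-01-07", "2023-01-09"]),
    "total_price": [120.0, 45.0, 300.0],
    "quantity": [4, 1, 10],
})

# Ratio feature: price per unit is often more informative than the raw total.
orders["unit_price"] = orders["total_price"] / orders["quantity"]

# Calendar feature: dayofweek is 0 for Monday, so 5 and 6 are the weekend.
orders["is_weekend"] = orders["order_date"].dt.dayofweek >= 5

print(orders)
```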
Dimensionality Reduction – Reduces the number of features without losing much information.
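The text does not name a specific technique, so the sketch below uses principal component analysis (PCA) from scikit-learn as one common choice. The synthetic data and the 95% variance threshold are assumptions made for illustration:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical data: 200 rows, 5 features that are noisy mixes of 2 latent factors.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))

# A float n_components keeps the fewest components whose cumulative
# explained variance reaches at least that fraction.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)
print("variance retained:", pca.explained_variance_ratio_.sum())
```

PCA is sensitive to feature scale, so on real data it is usually applied after the features have been scaled.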