Different types of data require different types of cleaning. The most important steps of data cleaning are:
- Data Quality
- Removing Duplicate and Irrelevant Data
- Structural errors
- Outliers
- Treatment for Missing Data
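
The steps listed above can be sketched in a few lines of pandas. This is a minimal illustration rather than a definitive recipe: the column names `city` and `age`, the helper names `quality_report` and `clean`, the median fill for missing values, and the 1.5 × IQR outlier rule are all assumptions chosen for the example. Real data will usually need different choices at each step.

```python
import pandas as pd


def quality_report(df: pd.DataFrame) -> dict:
    """Data quality: count rows, missing cells and exact duplicate rows."""
    return {
        "rows": len(df),
        "missing": int(df.isna().sum().sum()),
        "duplicates": int(df.duplicated().sum()),
    }


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative cleaning pass; the column names are hypothetical."""
    out = df.copy()

    # Structural errors: normalise stray whitespace and inconsistent casing.
    out["city"] = out["city"].str.strip().str.title()

    # Duplicates: drop rows that are exact repeats after normalisation.
    out = out.drop_duplicates()

    # Missing data: fill missing ages with the column median (one option of many).
    out["age"] = out["age"].fillna(out["age"].median())

    # Outliers: keep only ages within 1.5 * IQR of the quartiles.
    q1, q3 = out["age"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = out["age"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return out[in_range].reset_index(drop=True)


# Small made-up dataset with one of each problem.
raw = pd.DataFrame(
    {
        "city": ["Paris", " paris", "London", "London", "Berlin", "Rome", "Madrid"],
        "age": [30, 30, 25, None, 35, 28, 400],
    }
)
cleaned = clean(raw)
print(quality_report(raw))
print(quality_report(cleaned))
```

Running `quality_report` before and after cleaning gives a quick check that the pass did what was intended. Note that the order of the steps matters: fixing structural errors first lets `drop_duplicates` catch rows that differ only in spacing or casing.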
Data cleaning is an important step before analysing data because it helps to increase the accuracy of any model built on that data. This, in turn, helps organisations to make informed decisions.
Data scientists are often said to spend around 80% of their time cleaning data.