Explain why Data Cleansing is essential and which method you use to maintain clean data?

Data collected from various sources, as described in (https://medium.com/@TheDataGyan/day-6-getting-data-in-r-9b704ac9c31d), can be really untidy:

  1. It may not be separated into distinct feature values, and it may not be available in a clean tabular format.
  2. It can be redundant and full of missing values and outliers (values that lie far outside the expected range of a feature).
  3. It may not be understandable.
  4. It may not have a well-defined format.

So before applying it to a model, it needs to be processed. This is called data cleaning.
REF: https://medium.com/sciforce/data-cleaning-and-preprocessing-for-beginners-25748ee00743
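
As a rough illustration of the problems listed above (redundancy, missing values, outliers), here is a minimal sketch in plain Python. The function name `clean_records`, the sample records, the median fill, and the 1.5 × IQR cutoff are all assumptions chosen for the example, not a method prescribed by the referenced articles; other imputation or outlier rules may suit your data better.

```python
from statistics import median, quantiles


def clean_records(records, feature):
    """Deduplicate rows, fill missing values, and drop outliers
    for one numeric feature (hypothetical helper for illustration)."""
    # 1. Remove exact duplicate rows while preserving their order.
    seen = set()
    unique = []
    for row in records:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(dict(row))  # copy so the input is not mutated

    # 2. Fill missing values (None) with the median of the observed values.
    #    Assumption: the median is an acceptable stand-in for this feature.
    observed = [row[feature] for row in unique if row[feature] is not None]
    fill_value = median(observed)
    for row in unique:
        if row[feature] is None:
            row[feature] = fill_value

    # 3. Drop outliers using the 1.5 * IQR rule (one common convention).
    q1, _, q3 = quantiles([row[feature] for row in unique], n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [row for row in unique if low <= row[feature] <= high]


# Made-up sample data: one duplicate row, one missing age, one implausible age.
raw = [
    {"id": 1, "age": 25},
    {"id": 1, "age": 25},
    {"id": 2, "age": None},
    {"id": 3, "age": 31},
    {"id": 4, "age": 28},
    {"id": 5, "age": 27},
    {"id": 6, "age": 30},
    {"id": 7, "age": 420},
]

cleaned = clean_records(raw, "age")
for row in cleaned:
    print(row)
```

The order of the steps matters in this sketch: duplicates are removed first so they do not skew the median, and the median is used for filling because it is less sensitive to extreme values than the mean.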