Feature Engineering Techniques

Feature engineering is the process of extracting features from raw data using domain knowledge of the problem. Well-designed features can substantially improve the performance of a machine learning algorithm, and careful feature engineering often makes the difference between a good model and a bad one. It is one of the most important steps in the data science project life cycle.

Common areas of feature engineering include:

  • Continuous data
  • Categorical data
  • Missing values
  • Normalization
  • Dates and Time
Categorical Data
Categorical features are features whose data type is an object (string) type: the value of each data point is a category label rather than a number. (A minimal sketch for spotting such columns follows the two lists below.)
Below are some of the important techniques used for handling categorical data:
  1. Label Encoding or Ordinal Encoding
  2. One-hot Encoding
  3. Dummy Encoding
  4. Effect Encoding
  5. Binary Encoding
  6. Base-N Encoding
  7. Hash Encoding
  8. Target Encoding
Below are the main types of encoding techniques to consider when dealing with categorical data:
  • Nominal Encoding
  • One-hot Encoding
  • One-hot Encoding with many categorical variables
  • Mean Encoding
  • Ordinal Encoding
  • Label Encoding
  • Target-guided Ordinal Encoding
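To make the definition of a categorical feature concrete, here is a minimal sketch using a small, hypothetical pandas DataFrame (the column names and values are illustrative only) that identifies object-dtype columns as the categorical features:

import pandas as pd

# Hypothetical data: 'sex' and 'city' hold string labels (object dtype),
# so they are categorical features; 'age' is numeric (continuous).
df = pd.DataFrame({
    "sex": ["male", "female", "female", "male"],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "age": [23, 31, 27, 45],
})

# Object-dtype columns are the categorical features.
categorical_cols = df.select_dtypes(include="object").columns.tolist()
print(categorical_cols)  # ['sex', 'city']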
One-hot Encoding
Categorical data must be converted to a numerical form before most algorithms can use it. One-hot encoding does this by creating one binary column per category, with a 1 marking the rows that belong to that category. If the categorical variable is an output variable, you may also want to convert the model's predictions back into categorical form to present them or use them in an application.
A simple example: a Sex feature with the values male and female is converted into two binary columns; for a two-category variable this is equivalent to mapping the values to 0 and 1.
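A minimal sketch of one-hot encoding with pandas, continuing the hypothetical Sex example (scikit-learn's OneHotEncoder achieves the same result):

import pandas as pd

df = pd.DataFrame({"sex": ["male", "female", "female", "male"]})

# get_dummies creates one 0/1 column per category.
encoded = pd.get_dummies(df, columns=["sex"], dtype=int)
print(encoded)
#    sex_female  sex_male
# 0           0         1
# 1           1         0
# 2           1         0
# 3           0         1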

Label Encoding
In label encoding in Python, we replace each categorical value with a numeric value between 0 and the number of classes minus 1. If the categorical variable contains 5 distinct classes, they are encoded as 0, 1, 2, 3, and 4.
The scikit-learn class used in Python:

from sklearn.preprocessing import LabelEncoder
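A minimal sketch of LabelEncoder on a hypothetical list of city names; note that scikit-learn assigns the codes in sorted (alphabetical) order of the classes, and inverse_transform converts the numeric codes back to the original labels:

from sklearn.preprocessing import LabelEncoder

cities = ["Pune", "Delhi", "Pune", "Mumbai"]

le = LabelEncoder()
codes = le.fit_transform(cities)

print(list(le.classes_))                  # ['Delhi', 'Mumbai', 'Pune'] -> 0, 1, 2
print(list(codes))                        # [2, 0, 2, 1]
print(list(le.inverse_transform(codes)))  # back to the original city names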