What are some advantages and disadvantages of Logistic Regression?

Advantages of Logistic Regression:

  • Logistic regression is easy to implement and interpret, and very efficient to train.
  • It makes no assumptions about distributions of classes in feature space.
  • It extends easily to multiple classes (multinomial regression) and gives a natural probabilistic view of class predictions.
  • It provides not only a measure of how relevant a predictor is (coefficient size), but also its direction of association (positive or negative).
  • It is very fast at classifying unknown records.
  • It gives good accuracy on many simple datasets and performs well when the classes are linearly separable.
  • Its coefficients can be interpreted as indicators of feature importance, provided the features are on comparable scales.
  • Logistic regression is less prone to overfitting than more flexible models, but it can still overfit on high-dimensional datasets. Regularization (L1 or L2) can be used to reduce overfitting in these scenarios.
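
The points about coefficient direction, probabilistic predictions, and regularization can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the `fit_logistic` helper, the synthetic data, and the learning rate, iteration count, and penalty strength are all assumptions chosen for the example.

```python
import numpy as np

def sigmoid(z):
    """Map log odds to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, l2=0.0, lr=0.1, n_iter=5000):
    """Hypothetical helper: batch gradient descent on the mean log-loss.

    l2 is the strength of an optional L2 penalty on the weights
    (the intercept is left unpenalized).
    """
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y) / n_samples + l2 * w)
        b -= lr * np.mean(p - y)
    return w, b

# Synthetic data (an assumption for illustration): feature 0 raises the
# odds of class 1, feature 1 lowers them.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
true_logits = 2.0 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(500) < sigmoid(true_logits)).astype(float)

w, b = fit_logistic(X, y)              # unregularized fit
w_reg, _ = fit_logistic(X, y, l2=1.0)  # L2-regularized fit

# The sign of each coefficient gives the direction of association;
# exp(coefficient) is the multiplicative change in odds per unit increase.
odds_ratios = np.exp(w)

# Predicted probabilities of class 1 for the first three rows.
proba = sigmoid(X[:3] @ w + b)

print("coefficients:", w)
print("odds ratios:", odds_ratios)
print("L2-shrunk coefficients:", w_reg)
print("probabilities:", proba)
```

The L2 penalty pulls the coefficients toward zero, which is the mechanism by which regularization limits overfitting when there are many features.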

Disadvantages of Logistic Regression:

  • If the number of observations is smaller than the number of features, logistic regression should not be used, because it is likely to overfit.
  • It constructs linear boundaries.
  • The major limitation of logistic regression is the assumption of linearity between the log odds of the dependent variable and the independent variables.
  • It can only predict discrete (categorical) outcomes, so the dependent variable is restricted to a finite set of class labels.
  • Problems with non-linear class boundaries can’t be solved by logistic regression on the raw features, because it has a linear decision surface. Linearly separable data is rarely found in real-world scenarios.
  • Logistic regression requires little or no multicollinearity between the independent variables.
  • It is hard to capture complex relationships with logistic regression. More powerful and flexible algorithms such as neural networks can easily outperform it on such problems.
  • In linear regression, the independent and dependent variables are related linearly. Logistic regression instead requires that the independent variables be linearly related to the log odds, log(p/(1-p)).
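
The linear decision surface can be illustrated with a small NumPy sketch. It assumes an XOR-like synthetic dataset and a hypothetical `fit_logistic` helper trained by plain gradient descent; the learning rate and iteration count are arbitrary choices. On the raw features the model does little better than chance, while adding a hand-made interaction feature makes the log odds linear in the inputs again.

```python
import numpy as np

def sigmoid(z):
    """Map log odds to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Hypothetical helper: batch gradient descent on the mean log-loss."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(n_iter):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y) / n_samples)
        b -= lr * np.mean(p - y)
    return w, b

def accuracy(X, y, w, b):
    """Fraction of rows whose predicted class (threshold 0.5) matches y."""
    return np.mean((sigmoid(X @ w + b) >= 0.5) == y)

# XOR-like data (an assumption for illustration): class 1 when both
# features have the same sign, so no single straight line separates it.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)

# Raw features: the decision boundary is a straight line.
w_lin, b_lin = fit_logistic(X, y)
acc_linear = accuracy(X, y, w_lin, b_lin)

# Adding the interaction term x0 * x1 as a third feature makes the
# classes separable by a boundary that is linear in the new feature space.
X_aug = np.column_stack([X, X[:, 0] * X[:, 1]])
w_aug, b_aug = fit_logistic(X_aug, y)
acc_aug = accuracy(X_aug, y, w_aug, b_aug)

print("accuracy on raw features:", acc_linear)
print("accuracy with interaction feature:", acc_aug)
```

The model itself is unchanged between the two fits; only the feature set differs, which is why feature engineering is the usual workaround for this limitation.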