Bias-Variance Trade-off💡

The bias-variance trade-off comes up when deciding which algorithm to choose for your ML model. As the complexity of a machine learning algorithm increases, its bias decreases. 🔻

By bias, I mean the assumptions the model makes about the data. Simpler algorithms like Logistic Regression assume a linear relationship between the features and the target, which results in a lot of bias: the model will always fit a straight line (or linear boundary), no matter what the data actually looks like.

However, more complex algorithms like Random Forests don't make such assumptions and have less bias. 💡
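Here's a minimal sketch of high bias in action, using NumPy polynomial fits as stand-ins for simple vs. flexible models (the quadratic data, noise level, and degrees are illustrative assumptions, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)

# The true relationship is quadratic; a straight line cannot capture it.
x = np.linspace(-3, 3, 100)
y = x**2 + rng.normal(0, 0.5, size=x.size)

# High-bias model: a straight line (degree 1) assumes linearity.
linear_fit = np.poly1d(np.polyfit(x, y, 1))
# Lower-bias model: a degree-2 polynomial matches the true shape.
quad_fit = np.poly1d(np.polyfit(x, y, 2))

mse_linear = np.mean((y - linear_fit(x)) ** 2)  # large: the line underfits
mse_quad = np.mean((y - quad_fit(x)) ** 2)      # small: flexible enough

print(f"linear MSE: {mse_linear:.2f}, quadratic MSE: {mse_quad:.2f}")
```

No matter how much data you feed the linear model, its error stays high: that irreducible gap is the bias.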

Variance comes into the picture when we train the model on different samples of the data and the results are very different each time. The algorithm might score as high as 98% on one split and as low as 85% on another, suggesting over-fitting. 📉

This shows that the model is accurate but not precise: its predictions swing a lot from one training set to the next. This is usually the case with complex algorithms. 📈
With simpler algorithms, the variance is lower. They might consistently give a modest score of, say, 88%, but that score won't vary much, giving predictions that are precise, though not accurate. 💡
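The spread described above can be measured directly: fit a simple and a flexible model on many freshly drawn training sets and compare how much their prediction at one point wobbles. This sketch uses NumPy polynomial fits; the sine data, noise level, degrees, and evaluation point are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def prediction_spread(degree, n_trials=200, n_samples=30):
    """Fit a polynomial of the given degree on many independently drawn
    training sets and return the std-dev of its prediction at x = 2."""
    preds = []
    for _ in range(n_trials):
        x = rng.uniform(-3, 3, n_samples)
        y = np.sin(x) + rng.normal(0, 0.3, n_samples)
        model = np.poly1d(np.polyfit(x, y, degree))
        preds.append(model(2.0))
    return np.std(preds)

rigid = prediction_spread(degree=1)     # high bias, low variance
flexible = prediction_spread(degree=9)  # low bias, high variance

print(f"degree-1 spread: {rigid:.3f}, degree-9 spread: {flexible:.3f}")
```

The degree-9 model's prediction jumps around far more across training sets: that spread is the variance the post is describing.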

So there is a trade-off, and we need to find a middle ground between these two extremes.
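One common way to find that middle ground is to sweep model complexity and pick the setting with the lowest error on held-out data. A minimal sketch, again with polynomial degree standing in for complexity (the dataset and degree range are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)

# One noisy dataset, split into training and validation halves.
x = rng.uniform(-3, 3, 200)
y = np.sin(x) + rng.normal(0, 0.3, 200)
x_train, y_train = x[:100], y[:100]
x_val, y_val = x[100:], y[100:]

# Validation error as a function of model complexity (degree).
val_errors = {}
for degree in range(1, 13):
    model = np.poly1d(np.polyfit(x_train, y_train, degree))
    val_errors[degree] = np.mean((y_val - model(x_val)) ** 2)

best = min(val_errors, key=val_errors.get)
print(f"best degree by validation error: {best}")
```

Very low degrees underfit (bias dominates), very high degrees overfit (variance dominates), and the validation error bottoms out somewhere in between.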

#machinelearning #datascience