Logistic regression and its implementation

swapneel-panda-419bc751 · 28 May 2021 16:03

In simple terms, logistic regression is a classification algorithm for categorical target variables such as Yes/No, True/False, or 0/1. It is a supervised learning algorithm that data scientists widely use both for classification and for estimating class probabilities. It is useful and relatively easy to learn, so it is a good place to start if you are new to data science. It can be applied to both binary and multiclass datasets.

How can we implement it?

This algorithm can be implemented in two ways. The first is to write your own functions, i.e. you code the sigmoid function, cost function, gradient function, and so on yourself instead of using a library. The second is to use the Scikit-Learn library, which makes life much easier: the functions are already built in, and you just call them with the required parameters.
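To give a feel for the second way, here is a minimal sketch using Scikit-Learn's LogisticRegression class. The toy data (one feature, eight rows) is made up purely for illustration and is not from any real dataset; the default settings of the class are used as-is.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data, invented for illustration: one feature and a binary label.
X = np.array([[0.5], [1.0], [1.5], [2.0], [3.0], [3.5], [4.0], [4.5]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Create the model with default settings and fit it to the data.
model = LogisticRegression()
model.fit(X, y)

# Predicted class labels (0 or 1) for each row.
predictions = model.predict(X)

# Predicted probabilities: one column per class, each row sums to 1.
probabilities = model.predict_proba(X)

print(predictions)
print(probabilities)
```

Here predict returns the class labels, while predict_proba returns the probability of each class, which is the "calculating probabilities" use mentioned earlier.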

Making predictions and implementing the sigmoid function

In logistic regression, we use the sigmoid function to compute a probability for each entry in the training set. The sigmoid function is an important topic, and it should already be clear to you if you have read my article (link given above). Let me introduce a vector X, which we will call the ‘feature’ vector from now on, and a vector Theta, which we will call the ‘weights’ vector. Look at the image below.
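In code, the idea can be sketched roughly as follows. This assumes the usual formulation, where the probability is the sigmoid of the dot product of the features and the weights. The function names sigmoid and predict_probability, the sample values, and the leading column of ones for the intercept are all illustrative choices, not something fixed by the algorithm.

```python
import numpy as np

def sigmoid(z):
    """Map any real number (or array) to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_probability(X, theta):
    """Probability of the positive class for each row of X.

    X     : array of shape (m, n), one row of features per entry
    theta : array of shape (n,), the weights vector
    """
    return sigmoid(X @ theta)

# Illustrative values: the first column of ones stands in for the intercept term.
X = np.array([[1.0, 2.0],
              [1.0, -1.0]])
theta = np.array([0.0, 0.0])

# With all weights at zero, every entry gets a probability of 0.5.
probs = predict_probability(X, theta)
print(probs)
```

A probability of 0.5 or more is commonly treated as class 1 and anything lower as class 0, although that threshold is a choice you can change.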