Binning with Pandas📊🐼

Binning means converting a numerical or continuous feature into a discrete set of values, based on the ranges the continuous values fall into. 💡

This comes in handy when you want to see trends based on which range a data point falls in. ⚡

Let’s take a look at the code in the image.

We have marks for 7 kids, each somewhere between 0 and 100.

Now, we can assign every kid’s marks to a particular “bin”. 💡

So, when I say bins=[0,50,70,100], it means that there are 3 ranges:

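(0, 50], (50, 70], and (70, 100], which for whole-number marks means 1-50, 51-70, and 71-100, belonging to bins 1, 2, and 3 respectively. By default each range includes its right edge but not its left, so a mark of exactly 0 gets no bin unless you pass include_lowest=True.

Then I just append the output as a new feature, and the marks feature can be dropped. 📊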
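
Here is a minimal sketch of the steps above. The marks values and the marks_bin column name are made up for illustration and are not taken from the image:

```python
import pandas as pd

# Hypothetical marks for 7 kids (illustrative values only)
df = pd.DataFrame({"marks": [35, 50, 51, 68, 70, 88, 100]})

# Edges 0, 50, 70, 100 give three right-closed ranges:
# (0, 50], (50, 70], (70, 100], labelled 1, 2, 3
df["marks_bin"] = pd.cut(df["marks"], bins=[0, 50, 70, 100], labels=[1, 2, 3])

# Drop the raw feature once the binned feature is in place
binned = df.drop(columns=["marks"])

print(binned["marks_bin"].tolist())  # [1, 1, 2, 2, 2, 3, 3]
```

Notice that 50 and 70 land in the lower bin, because the right edge of each range is included.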

Pandas offers 2 nice functions to achieve binning quickly: qcut and cut.

Pandas qcut takes in the number of quantiles and picks the bin edges from the data distribution, so each bin ends up with roughly the same number of data points.

Pandas cut, on the other hand, takes in the custom ranges defined by us and assigns each data point to the range it falls in, no matter how many points end up in each bin.

Cool, isn’t it? ⚡
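
To see the difference side by side, here is a small sketch on the same data. The marks values and the "low"/"mid"/"high" labels are my own assumptions, picked just to make the contrast visible:

```python
import pandas as pd

# Hypothetical, skewed marks (illustrative values only)
marks = pd.Series([10, 20, 30, 40, 90, 100])

# qcut: 3 quantile-based bins, so each bin gets about the same number of points
by_quantile = pd.qcut(marks, q=3, labels=["low", "mid", "high"])

# cut: fixed edges chosen by us, so bin sizes depend on where the data falls
by_range = pd.cut(marks, bins=[0, 50, 70, 100], labels=["low", "mid", "high"])

print(by_quantile.tolist())  # ['low', 'low', 'mid', 'mid', 'high', 'high']
print(by_range.tolist())     # ['low', 'low', 'low', 'low', 'high', 'high']
```

With qcut every bin holds 2 kids, while with cut the "mid" range stays empty because nobody scored between 51 and 70.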

#python #machinelearning #datascience