Pandas DataFrame.describe()
The describe() method computes summary statistics such as the count, mean, standard deviation (std), and percentiles of the numeric values in a Series or DataFrame. It also works on object (for example, string) Series and on DataFrames whose columns have mixed data types. For non-numeric data it reports count, unique, top, and freq instead.
Syntax
DataFrame.describe(percentiles=None, include=None, exclude=None)
Parameters
- percentiles: An optional list-like of numbers, each between 0 and 1, giving the percentiles to include in the output. The default is [.25, .5, .75], which returns the 25th, 50th, and 75th percentiles.
- include: An optional list of data types to include in the summary, or the string 'all' to include every column. The default is None, which summarizes only the numeric columns of a DataFrame. It is ignored for a Series.
- exclude: An optional list of data types to exclude from the summary. The default is None, which excludes nothing. It is ignored for a Series.
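None of the examples further down pass the percentiles parameter, so here is a minimal sketch of it. The Series values and the variable names s and summary are made up for illustration.

```python
import pandas as pd

# Illustrative data, not taken from the examples in this article.
s = pd.Series([1, 2, 3, 4, 5])

# Ask for the 10th and 90th percentiles instead of the default quartiles.
# Each requested percentile appears as a row labelled "10%", "90%", and so on.
summary = s.describe(percentiles=[0.1, 0.9])
print(summary)

# Individual statistics can be read by label.
print(summary["10%"], summary["90%"])
```

Whether the median row (50%) is also added when it is not requested may depend on the pandas version, so this sketch does not rely on it.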
Returns
It returns the summary statistics as a new object: a Series when called on a Series, and a DataFrame when called on a DataFrame.
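Because the result is an ordinary pandas object, individual statistics can be looked up by label. The following is a small sketch of that idea. The column names height and weight and their values are hypothetical.

```python
import pandas as pd

# Hypothetical numeric data used only for this sketch.
df = pd.DataFrame({"height": [150, 160, 170], "weight": [50.0, 60.0, 70.0]})

stats = df.describe()

# The result is itself a DataFrame: one row per statistic, one column per numeric column.
print(type(stats).__name__)
print(list(stats.index))

# Look up a single statistic with .loc[row_label, column_label].
mean_height = stats.loc["mean", "height"]
print(mean_height)
```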
Example1
import pandas as pd

# Summarize a numeric Series
a1 = pd.Series([1, 2, 3])
a1.describe()
Output
count    3.0
mean     2.0
std      1.0
min      1.0
25%      1.5
50%      2.0
75%      2.5
max      3.0
dtype: float64
Example2
import pandas as pd

# Summarize a Series of strings (object data)
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()
Output
count     4
unique    3
top       q
freq      2
dtype: object
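The four rows reported for object data can be cross-checked against other Series methods. This is a sketch for illustration only, not a description of how pandas computes them internally. The names counts and summary are introduced here.

```python
import pandas as pd

a1 = pd.Series(['p', 'q', 'q', 'r'])
summary = a1.describe()

# value_counts() tallies how often each distinct value occurs.
counts = a1.value_counts()

print(summary["count"], a1.count())     # number of non-missing values
print(summary["unique"], a1.nunique())  # number of distinct values
print(summary["top"], counts.idxmax())  # most frequent value
print(summary["freq"], counts.max())    # how often the most frequent value occurs
```

Here 'q' is the only value that occurs twice, so top is unambiguous. When several values tie for the highest count, the value reported as top is chosen arbitrarily among them.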
Example3
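import pandas as pd
import numpy as np

a1 = pd.Series([1, 2, 3])
a1.describe()
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()

# A DataFrame with one categorical, one numeric, and one object column
info = pd.DataFrame({'categorical': pd.Categorical(['s', 't', 'u']),
                     'numeric': [1, 2, 3],
                     'object': ['p', 'q', 'r']})

info.describe(include=[np.number])   # numeric columns only
# np.object was removed from NumPy; use the built-in object instead
info.describe(include=[object])      # object columns only
# The output below is from this last call. Every category occurs once,
# so the value shown as top is an arbitrary pick among the tied values.
info.describe(include=['category'])  # categorical columns only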
Output
       categorical
count            3
unique           3
top              u
freq             1
Example4
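import pandas as pd
import numpy as np

a1 = pd.Series([1, 2, 3])
a1.describe()
a1 = pd.Series(['p', 'q', 'q', 'r'])
a1.describe()

info = pd.DataFrame({'categorical': pd.Categorical(['s', 't', 'u']),
                     'numeric': [1, 2, 3],
                     'object': ['p', 'q', 'r']})

info.describe()                      # default: numeric columns only
info.describe(include='all')         # every column, whatever its data type
info.numeric.describe()              # a single column, accessed as an attribute
info.describe(include=[np.number])
# np.object was removed from NumPy; use the built-in object instead
info.describe(include=[object])
info.describe(include=['category'])
info.describe(exclude=[np.number])   # everything except numeric columns
# The output below is from this last call
info.describe(exclude=[object])      # everything except object columns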
Output
       categorical  numeric
count            3      3.0
unique           3      NaN
top              u      NaN
freq             1      NaN
mean           NaN      2.0
std            NaN      1.0
min            NaN      1.0
25%            NaN      1.5
50%            NaN      2.0
75%            NaN      2.5
max            NaN      3.0
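Example 4 makes several calls but shows only one output. To see what each include and exclude setting selects, the sketch below prints the columns that each call summarizes. The names selections and columns_by_call are introduced here for illustration. The object column is built with an explicit dtype=object, which is an assumption made so that the column has the object data type whatever string default the installed pandas version uses.

```python
import numpy as np
import pandas as pd

info = pd.DataFrame({
    "categorical": pd.Categorical(["s", "t", "u"]),
    "numeric": [1, 2, 3],
    # Assumption of this sketch: force the object data type explicitly.
    "object": pd.Series(["p", "q", "r"], dtype=object),
})

# Keyword arguments for each describe() call to compare.
selections = {
    "default": {},
    "include='all'": {"include": "all"},
    "include=[np.number]": {"include": [np.number]},
    "include=[object]": {"include": [object]},
    "include=['category']": {"include": ["category"]},
    "exclude=[np.number]": {"exclude": [np.number]},
    "exclude=[object]": {"exclude": [object]},
}

# Record which columns appear in each summary.
columns_by_call = {
    label: list(info.describe(**kwargs).columns)
    for label, kwargs in selections.items()
}

for label, cols in columns_by_call.items():
    print(f"{label:<22} -> {cols}")
```

The last entry, exclude=[object], keeps the categorical and numeric columns, which matches the two-column output shown for Example 4.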