Pandas map() and reduce() Operations
In this article, we will focus on the map() operation in Pandas and the reduce() function from Python's functools module, and on how both are used for data manipulation.
map()
The Pandas map() operation replaces each value of a Series according to a given mapping, which can be another Series, a dictionary, or a function. This article covers Series.map() only; a DataFrame uses a separate element-wise method (applymap(), renamed to DataFrame.map() in pandas 2.1).
Syntax:
Series.map(arg, na_action=None)
The parameters are:
- arg : (Series, dict, or function) mapping correspondence
- na_action : (None, 'ignore') If 'ignore', NaN values are propagated without being passed to the mapping correspondence (default: None)
Let us look at a few examples of the map() operation on the following Series:
import numpy as np
import pandas as pd
country = ['Germany', 'Canada', np.nan, 'Japan', 'Australia']
series = pd.Series(country)
print(series)
This gives the following Series:
0 Germany
1 Canada
2 NaN
3 Japan
4 Australia
dtype: object
Now apply map() to this Series, using a dictionary as the argument. Values with no matching key in the dictionary, such as 'Germany', are mapped to NaN:
series.map({'Canada': 'Ottawa', 'Japan': 'Tokyo', 'Australia': 'Canberra'})
Output:
0 NaN
1 Ottawa
2 NaN
3 Tokyo
4 Canberra
dtype: object
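The mapping can also be another Series, the third kind of argument listed above. The following is a minimal sketch of that case; the lookup Series named capitals is an assumption made for illustration and is not part of the original example. Each value of the source Series is looked up in the index of capitals, and values with no match become NaN.

```python
import numpy as np
import pandas as pd

series = pd.Series(['Germany', 'Canada', np.nan, 'Japan', 'Australia'])

# Hypothetical lookup Series: the index holds countries, the values hold capitals.
capitals = pd.Series({'Canada': 'Ottawa', 'Japan': 'Tokyo', 'Australia': 'Canberra'})

# Each value in `series` is matched against the index of `capitals`;
# values without a matching index label (and NaN itself) map to NaN.
mapped = series.map(capitals)
print(mapped)
```

The result should match the dictionary example, since a dictionary and a Series indexed by the same keys describe the same correspondence.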
You can also pass a function as the argument, for example:
print(series.map('He is from {}'.format, na_action='ignore'))
Output:
0 He is from Germany
1 He is from Canada
2 NaN
3 He is from Japan
4 He is from Australia
dtype: object
If we omit na_action='ignore' here, the NaN value is passed to the function as well, and the line at index 2 becomes "He is from nan".
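To see the effect of na_action side by side, here is a short sketch that runs the same formatting function with and without na_action='ignore'. The shortened three-element Series and the variable names are assumptions chosen only to keep the comparison compact.

```python
import numpy as np
import pandas as pd

series = pd.Series(['Germany', np.nan, 'Japan'])

# With na_action='ignore', the missing value is skipped and stays NaN.
with_ignore = series.map('He is from {}'.format, na_action='ignore')

# Without it, the missing value is handed to the function and formatted as text.
without_ignore = series.map('He is from {}'.format)

print(with_ignore[1])
print(without_ignore[1])
```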
reduce()
The reduce() operation applies a two-argument function cumulatively to the elements of a Series, reducing them to a single value. reduce() is not a Pandas method; it is defined in the functools module of Python and works on any iterable, including a Series.
The algorithm works as follows: the function is first called with the first two elements of the Series, producing a result. The function is then applied to this result and the next element in the Series. This process repeats until no items remain in the sequence, and the final result is returned.
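The steps described above can be sketched as a plain loop. The helper manual_reduce below is hypothetical, written only to illustrate the idea; it assumes a non-empty input and no initial value, and it is not how functools implements reduce() internally.

```python
import functools

def manual_reduce(function, iterable):
    """Hypothetical helper mirroring functools.reduce without an initial value."""
    iterator = iter(iterable)
    # Start with the first element as the running result.
    result = next(iterator)
    for item in iterator:
        # Combine the running result with the next element.
        result = function(result, item)
    return result

def add(x, y):
    return x + y

values = [1, 2, 3, 4]
manual_total = manual_reduce(add, values)        # ((1 + 2) + 3) + 4
library_total = functools.reduce(add, values)
print(manual_total, library_total)
```

Both calls walk the sequence from left to right, so they should agree for any two-argument function on a non-empty input.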
For example, consider the following series:
import pandas as pd
data = [11,6,7,3,28,1]
series = pd.Series(data)
print(series)
The series is:
0 11
1 6
2 7
3 3
4 28
5 1
dtype: int64
Now, let's use reduce() to find the product of all elements in this Series:
# import the functools module
import functools
# use reduce() to apply the function cumulatively across the Series
product = functools.reduce(lambda x, y: x * y, series)
print("Product: ", product, sep="")
Output:
Product: 38808
Here is another example, which uses reduce() to find the minimum element of the Series:
# import the functools module
import functools
# use reduce() to keep the smaller of each pair of values
minimum = functools.reduce(lambda x, y: x if x < y else y, series)
print("Minimum value: ", minimum, sep="")
Output:
Minimum value: 1
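As a cross-check, the two reductions above can be compared with the built-in Series methods prod() and min(). This is a sketch for verification only; the original examples do not use these methods, and the variable names are chosen for illustration.

```python
import functools
import pandas as pd

series = pd.Series([11, 6, 7, 3, 28, 1])

# Reductions written with functools.reduce, as in the examples above.
product = functools.reduce(lambda x, y: x * y, series)
minimum = functools.reduce(lambda x, y: x if x < y else y, series)

# The same reductions using built-in Series methods.
builtin_product = series.prod()
builtin_minimum = series.min()

print(product == builtin_product)
print(minimum == builtin_minimum)
```

For common reductions like these, the built-in methods are usually the more idiomatic choice, while reduce() remains useful for custom two-argument functions.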
Summary
In this article, we looked at the Pandas map() operation and Python's functools.reduce() function, and at how each can be applied to a Series.