Pandas DataFrame.replace()
Pandas replace() is a flexible method for replacing values in a DataFrame. The values to look for can be given as a string, regex, list, dictionary, or Series, and every matching value is replaced wherever it occurs. It also works with Python regular expressions (regex).
It differs from updating with .loc or .iloc, which require you to specify the location that you want to update with some value.
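The difference can be illustrated with a minimal sketch. The DataFrame below and its column names `a` and `b` are made up for illustration: replace() matches by value across the whole frame, while .loc only touches the rows and column that are named explicitly.

```python
import pandas as pd

# Hypothetical data, used only for illustration
df = pd.DataFrame({"a": ["x", "y", "x"], "b": ["y", "x", "z"]})

# replace(): match by value, wherever it occurs in the frame
by_value = df.replace("x", "Q")

# .loc: match by location, so the rows and the column are named explicitly
by_location = df.copy()
by_location.loc[by_location["a"] == "x", "a"] = "Q"

print(by_value)
print(by_location)
```

In this sketch the "x" in column `b` is changed only by replace(), because the .loc assignment was limited to column `a`.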
Syntax:
DataFrame.replace(to_replace=None, value=None, inplace=False, limit=None, regex=False, method='pad')
Parameters:
- to_replace: Defines the value or pattern that we are trying to replace in the DataFrame. It can be a string, regex, list, dict, Series, or number.
- value: The value that replaces any match of to_replace (e.g., 0), or alternatively a dict of values that specifies which value to use for each column (columns not in the dict are left unchanged). Regular expressions, strings, and lists or dicts of such objects are also allowed.
- inplace: If it is True, the replacement is made in place on the calling object instead of on a copy.
Note: This can also affect other views on this object (e.g., a column taken from a DataFrame). Do not rely on the return value in this case; pandas 2.x documents it as None.
- limit: It defines the maximum size gap to forward or backward fill. It is deprecated since pandas 2.1.
- regex: It controls whether to_replace and/or value are interpreted as regular expressions. If it is True, the strings in to_replace are treated as patterns. Alternatively, regex can itself be a regular expression or a list, dict, or array of regular expressions, in which case to_replace must be None.
- method: The fill method to use for replacement when to_replace is a scalar, list, or tuple and value is None. It is deprecated since pandas 2.1.
Returns: It returns a DataFrame object after the replacement.
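The following sketch shows how the inplace and regex parameters behave. The `city` data is illustrative, and the patterns are assumptions chosen only to match it.

```python
import pandas as pd

df = pd.DataFrame({"city": ["US", "Belgium", "London"]})

# Default (inplace=False): a new DataFrame is returned, df is unchanged
replaced = df.replace("US", "USA")

# inplace=True: df itself is modified, so the result is read from df
df.replace("London", "Paris", inplace=True)

# The pattern can be passed through regex= directly; to_replace is then
# left as None and value must be supplied
pattern_form = df.replace(regex=r"^Bel.*", value="BE")

print(replaced)
print(df)
print(pattern_form)
```

Keeping the default inplace=False is usually easier to reason about, since the original DataFrame stays available for comparison.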
Example1:
import pandas as pd

info = pd.DataFrame(
    {'Language known': ['Python', 'Android', 'C', 'Android', 'Python', 'C++', 'C']},
    index=['Parker', 'Smith', 'John', 'William', 'Dean', 'Christina', 'Cornelia'])
print(info)

# Map each language name to a numeric code (dict keys must be unique)
dictionary = {"Python": 1, "C": 3, "Android": 4, "C++": 5}

# The outer key restricts the replacement to the 'Language known' column
info1 = info.replace({"Language known": dictionary})
print("\n\n")
print(info1)
Output
           Language known
Parker             Python
Smith             Android
John                    C
William           Android
Dean               Python
Christina             C++
Cornelia                C

           Language known
Parker                  1
Smith                   4
John                    3
William                 4
Dean                    1
Christina               5
Cornelia                3
Example2:
The example below replaces one value with another in a DataFrame.
import pandas as pd

info = pd.DataFrame({
    'name': ['Parker', 'Smith', 'John'],
    'age': [27, 34, 31],
    'city': ['US', 'Belgium', 'London']
})

# Replace every occurrence of 31 with 38
print(info.replace([31], 38))
Output
     name  age     city
0  Parker   27       US
1   Smith   34  Belgium
2    John   38   London
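When to_replace is a list, value can be either a list of the same length or a single scalar. The sketch below uses the same illustrative data; the replacement numbers are arbitrary.

```python
import pandas as pd

info = pd.DataFrame({
    "name": ["Parker", "Smith", "John"],
    "age": [27, 34, 31],
})

# Two lists of equal length are paired element by element: 27 -> 28, 34 -> 35
paired = info.replace([27, 34], [28, 35])

# A list with a scalar value sends every listed value to that scalar
merged = info.replace([27, 34], 0)

print(paired)
print(merged)
```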
Example3:
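The example below replaces values using a dict, where each key is a value to find and each dict value is its replacement:
import pandas as pd

info = pd.DataFrame({
    'name': ['Parker', 'Smith', 'John'],
    'age': [27, 34, 31],
    'city': ['US', 'Belgium', 'London']
})

# 34 becomes 29 and 'Smith' becomes 'William', in any column
print(info.replace({34: 29, 'Smith': 'William'}))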
Output
      name  age     city
0   Parker   27       US
1  William   29  Belgium
2     John   31   London
Example4:
The example below replaces values that match a regex:
import pandas as pd

info = pd.DataFrame({
    'name': ['Parker', 'Smith', 'John'],
    'age': [27, 34, 31],
    'city': ['US', 'Belgium', 'London']
})

# Any string matching 'Sm' followed by one or more characters becomes 'Ela'
print(info.replace('Sm.+', 'Ela', regex=True))
Output
     name  age     city
0  Parker   27       US
1     Ela   34  Belgium
2    John   31   London
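A regex replacement can also be limited to one column, or applied with several patterns at once. The following is a minimal sketch on similar illustrative data; the patterns are assumptions chosen to match it.

```python
import pandas as pd

info = pd.DataFrame({
    "name": ["Parker", "Smith", "John"],
    "city": ["US", "Belgium", "London"],
})

# Nested dict {column: {pattern: replacement}} limits the regex to "name"
only_name = info.replace({"name": {r"^Sm.+": "Ela"}}, regex=True)

# A list of patterns can share a single replacement value
several = info.replace([r"^Par.*", r"^Lon.*"], "X", regex=True)

print(only_name)
print(several)
```

Note that the top-level keys of the nested dict are column names and are matched literally, not as regular expressions.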