Concatenating data in Pandas
Concatenation combines two or more DataFrames into one. The pandas concat() function combines DataFrames along rows or columns. Consider the following DataFrames:
import pandas as pd
record1 = [['John', 14, 82.5], ['Maria', 12, 90.0], ['Tom', 13, 77.0]]
df1 = pd.DataFrame(record1, columns=['Name', 'Age', 'Marks'], index=[0, 1, 2])
print(df1)
record2 = [['Ben', 12, 65.5], ['Amy', 12, 71.0], ['Tina', 14, 63.5]]
df2 = pd.DataFrame(record2, columns=['Name', 'Age', 'Marks'], index=[3, 4, 5])
print(df2)
record3 = [['Adam', 15, 87.0], ['Carla', 14, 73.0]]
df3 = pd.DataFrame(record3, columns=['Name', 'Age', 'Marks'], index=[6, 7])
print(df3)
The three DataFrames are:
    Name  Age  Marks
0   John   14   82.5
1  Maria   12   90.0
2    Tom   13   77.0
   Name  Age  Marks
3   Ben   12   65.5
4   Amy   12   71.0
5  Tina   14   63.5
    Name  Age  Marks
6   Adam   15   87.0
7  Carla   14   73.0
Now, to concatenate them into one, we use:
df = pd.concat([df1, df2, df3])
print(df)
    Name  Age  Marks
0   John   14   82.5
1  Maria   12   90.0
2    Tom   13   77.0
3    Ben   12   65.5
4    Amy   12   71.0
5   Tina   14   63.5
6   Adam   15   87.0
7  Carla   14   73.0
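The example above gives each DataFrame its own non-overlapping index labels, so the combined index runs cleanly from 0 to 7. When the inputs share labels, concat() keeps them as they are, which produces duplicates. A minimal sketch of how the ignore_index parameter of concat() renumbers the rows instead; the frames a and b are illustrative data, not part of the example above:

```python
import pandas as pd

# Two small frames that both use the default index 0..1 (illustrative data).
a = pd.DataFrame([['John', 14, 82.5], ['Maria', 12, 90.0]],
                 columns=['Name', 'Age', 'Marks'])
b = pd.DataFrame([['Ben', 12, 65.5], ['Amy', 12, 71.0]],
                 columns=['Name', 'Age', 'Marks'])

# By default each frame keeps its own labels, so 0 and 1 appear twice.
kept = pd.concat([a, b])
print(kept.index.tolist())        # [0, 1, 0, 1]

# ignore_index=True discards the old labels and numbers the rows 0..n-1.
renumbered = pd.concat([a, b], ignore_index=True)
print(renumbered.index.tolist())  # [0, 1, 2, 3]
```

Renumbering is usually convenient when the original labels carry no meaning; if they identify the rows, keeping them may be the better choice.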
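Using concat(), we can also concatenate along the columns by passing axis=1 (the default, axis=0, stacks rows). The type of join on the other axis can also be specified with the join parameter, either 'outer' (the default, a union of labels) or 'inner' (an intersection). For example: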
import pandas as pd
record1 = [['S1', 'John', 14], ['S2', 'Maria', 12], ['S3', 'Tom', 13], ['S4', 'Adam', 15]]
df1 = pd.DataFrame(record1, columns=['S_Id', 'Name', 'Age'])
print(df1)
record2 = [['S1', 14, 82.5], ['S2', 13, 90.0], ['S3', 14, 77.0], ['S4', 15, 87.0]]
df2 = pd.DataFrame(record2, columns=['S_Id', 'Age', 'Marks'])
print(df2)
df = pd.concat([df1, df2], axis=1)
print(df)
The two DataFrames are:
   S_Id   Name  Age
0    S1   John   14
1    S2  Maria   12
2    S3    Tom   13
3    S4   Adam   15
   S_Id  Age  Marks
0    S1   14   82.5
1    S2   13   90.0
2    S3   14   77.0
3    S4   15   87.0
And the concatenated one, which keeps both copies of the shared S_Id and Age columns, is:
   S_Id   Name  Age  S_Id  Age  Marks
0    S1   John   14    S1   14   82.5
1    S2  Maria   12    S2   13   90.0
2    S3    Tom   13    S3   14   77.0
3    S4   Adam   15    S4   15   87.0
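In the example above both DataFrames have the same index (0 to 3), so the type of join makes no visible difference. A minimal sketch of what the join parameter does when the row labels only partly overlap; the frames left and right are illustrative data chosen for this sketch:

```python
import pandas as pd

# Illustrative frames whose indexes overlap only on labels 1 and 2.
left = pd.DataFrame({'Name': ['John', 'Maria', 'Tom']}, index=[0, 1, 2])
right = pd.DataFrame({'Marks': [90.0, 77.0, 87.0]}, index=[1, 2, 3])

# join='outer' (the default) keeps every row label and fills gaps with NaN.
outer = pd.concat([left, right], axis=1)
print(outer)

# join='inner' keeps only the row labels present in both frames.
inner = pd.concat([left, right], axis=1, join='inner')
print(inner)
```

Note that concat() aligns rows by index label only. It does not match rows on a column such as S_Id.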
Summary
This article covered concatenating DataFrames in pandas with concat(), along both rows and columns.