How do you merge 2 data frames?

Merging Data Frames using Python

Pandas provides a single function, merge(), as the entry point for all standard database join operations between DataFrame or named Series objects.


You can use the function pd.merge(). It combines two dataframes based on one or more columns, much like a join in an SQL query.

Let’s say you have two dataframes. One is called customers, the other is called orders.

  new_df = pd.merge(customers, orders, on='customer_id', how='left')
  • on: the key column (or list of columns) to join on; it must exist in both dataframes. The function matches values in the key columns and combines the rows that agree. In this example, the key column is customer_id.
  • how: the join method. You can choose 'left', 'right', 'outer', 'inner' or 'cross'; the default is 'inner'. In this example, I chose 'left' because I want to keep all records from the left table (that is, the customers table).

Here is the result:
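
As a minimal sketch of how the left merge behaves, the snippet below builds two small dataframes and runs the same call. The sample rows, names and amounts are hypothetical and invented for illustration; only customer_id, customers, orders and new_df come from the example above.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Cho"],
})
orders = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer_id": [1, 1, 4],
    "amount": [25.0, 40.0, 15.0],
})

# Left join: keep every customer and attach matching orders where they exist.
# Customers without orders get NaN in the order columns.
new_df = pd.merge(customers, orders, on="customer_id", how="left")
print(new_df)

# Compare how many rows each join method produces on the same inputs.
row_counts = {
    how: len(pd.merge(customers, orders, on="customer_id", how=how))
    for how in ["left", "right", "inner", "outer"]
}
print(row_counts)
```

With this sample data, the left merge returns four rows: two for customer 1 (who has two orders) and one each for customers 2 and 3, whose order_id and amount are NaN. The order for customer 4 is dropped because that customer does not appear in the left table; it would show up with how='right' or how='outer'.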