Skip to Content
Learn
Working with Multiple DataFrames
Left and Right Merge

Let’s return to the merge of Company A and Company B.


Left Merge

Suppose we want to identify which customers are missing phone information. We would want a list of all customers who have email, but don’t have phone.

We could get this by performing a Left Merge. A Left Merge includes all rows from the first (left) table, but only rows from the second (right) table that match the first table.

For this command, the order of the arguments matters. If the first DataFrame is company_a and we do a left join, we’ll only end up with rows that appear in company_a.

By listing company_a first, we get all customers from Company A, and only customers from Company B who are also customers of Company A.

pd.merge(company_a, company_b, how='left')

The result would look like this:

name email phone
Sally Sparrow [email protected] None
Peter Grant [email protected] 212-345-6789
Leslie May [email protected] 626-987-6543

Now let’s say we want a list of all customers who have phone but no email. We can do this by performing a Right Merge.


Right Merge

Right merge is the exact opposite of left merge. Here, the merged table will include all rows from the second (right) table, but only rows from the first (left) table that match the second table.

By listing company_a first and company_b second, we get all customers from Company B, and only customers from Company A who are also customers of Company B.

pd.merge(company_a, company_b, how="right")

The result would look like this:

name email phone
Peter Grant [email protected] 212-345-6789
Leslie May [email protected] 626-987-6543
Aaron Burr None 303-456-7891

Instructions

1.

Let’s return to the two hardware stores, Store A and Store B. They’re not quite sure if they want to merge into a big Super Store just yet.

Store A wants to find out what products they carry that Store B does not carry. Using a left merge, combine store_a to store_b and save the results to store_a_b_left.

The items with null in store_b_inventory are carried by Store A, but not Store B.

2.

Now, Store B wants to find out what products they carry that Store A does not carry. Use a left join, to combine the two DataFrames but in the reverse order (i.e., store_b followed by store_a) and save the results to the variable store_b_a_left.

Which items are not carried by Store A, but are carried by Store B?

3.

Paste the following code into script.py:

print(store_a_b_left) print(store_b_a_left)

What do you notice about these two DataFrames?

How are they different?

How are they the same?

Folder Icon

Sign up to start coding

Already have an account?