Pandas: How to merge two data frames?


Good morning, everybody! I'm counting on your help again.

I have two CSV files, as shown below:

# f1.csv
num   ano
76971  1975
76969  1975
76968  1975
76966  1975
76964  1975
76963  1975
76960  1975

and

# f2.csv
num   ano   dou  url
76971  1975 p1   http://exemplo.com/page1
76968  1975 p2   http://exemplo.com/page10
76966  1975 p2   http://exemplo.com/page100

How can I merge the two in this way?

# expected result
num   ano   dou  url
76971  1975 p1   http://exemplo.com/page1
76969  1975
76968  1975 p2   http://exemplo.com/page10
76966  1975 p2   http://exemplo.com/page100
76964  1975
76963  1975
76960  1975

1 answer



The most direct way is pd.merge; this solution was inspired by a guide with several examples of SQL queries converted to pandas. In this case we want a left join (an outer join would return the same set of rows here, because every row of f2.csv also appears in f1.csv):

import pandas as pd

df1 = pd.read_csv('f1.csv')
df2 = pd.read_csv('f2.csv')

# 'ano' is included in the join keys only so that it is not duplicated in the result;
# alternatively, drop it from df2 first with df2.drop(['ano'], axis=1, inplace=True)
# and merge on 'num' alone
df = pd.merge(df1, df2, on=['num', 'ano'], how="left")
print(df)

Output:

     num   ano  dou                         url
0  76971  1975   p1    http://exemplo.com/page1
1  76969  1975  NaN                         NaN
2  76968  1975   p2   http://exemplo.com/page10
3  76966  1975   p2  http://exemplo.com/page100
4  76964  1975  NaN                         NaN
5  76963  1975  NaN                         NaN
6  76960  1975  NaN                         NaN
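
For anyone who wants to try this without creating the files, here is a minimal sketch that loads the same sample data from in-memory strings. It assumes the columns are separated by whitespace, as they are displayed in the question (hence sep=r"\s+"); the names F1, F2, merged and outer are only illustrative. It also shows the alternative mentioned in the code comment (dropping ano from df2 and joining on num alone) and what an outer join gives for this data.

```python
import io

import pandas as pd

# Sample data copied from the question; whitespace-separated by assumption.
F1 = """num ano
76971 1975
76969 1975
76968 1975
76966 1975
76964 1975
76963 1975
76960 1975
"""

F2 = """num ano dou url
76971 1975 p1 http://exemplo.com/page1
76968 1975 p2 http://exemplo.com/page10
76966 1975 p2 http://exemplo.com/page100
"""

df1 = pd.read_csv(io.StringIO(F1), sep=r"\s+")
df2 = pd.read_csv(io.StringIO(F2), sep=r"\s+")

# Alternative: drop 'ano' from df2 (without modifying df2 itself)
# and join on 'num' alone. A left join keeps every row of df1, in df1's order.
merged = df1.merge(df2.drop(columns=["ano"]), on="num", how="left")
print(merged)

# An outer join on both keys returns the same rows for this data,
# although the row order is not guaranteed to match df1.
outer = pd.merge(df1, df2, on=["num", "ano"], how="outer")
print(sorted(outer["num"]) == sorted(merged["num"]))
```

If f2.csv could contain rows that are missing from f1.csv, the two joins would differ: the left join would discard those rows, while the outer join would keep them.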

If you would rather have an empty string instead of NaN, you can then simply do:

df.fillna('', inplace=True)
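
As a variation, here is a small sketch that fills only the text columns and returns a new frame instead of modifying df in place. The sample frame is built by hand to stand in for the merged result, and the name filled is only illustrative.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the first rows of the merged result.
df = pd.DataFrame({
    "num": [76971, 76969, 76968],
    "ano": [1975, 1975, 1975],
    "dou": ["p1", np.nan, "p2"],
    "url": ["http://exemplo.com/page1", np.nan, "http://exemplo.com/page10"],
})

# Passing a dict fills each listed column with its own value;
# without inplace=True, fillna returns a new DataFrame and leaves df untouched.
filled = df.fillna({"dou": "", "url": ""})
print(filled)
```

Limiting the fill to dou and url keeps any missing values in numeric columns as NaN, so those columns are not mixed with strings.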

