Concatenate two Dataframes


I need to generate a new dataframe with the concatenation of two dataframes. The following code works, but it takes a long time to run.

df_concatena = pd.DataFrame()

for x in range(len(df)):
    for y in range(len(data)):
        df_concatena = df_concatena.append(pd.concat([df.iloc[x], data.iloc[y]]), ignore_index=True)

I tried to use apply but was unsuccessful.

Example df: df.shape -> 81476 rows

'Valor','Clase','Tempo'
44.99  , 'A'   , 5
61.49  , 'B'   , 8
102.24 , 'C'   , 6
51.07  , 'B'   , 8
32.78  , 'B'   , 12
30.05  , 'B'   , 10

Example data: data.shape -> 21 rows

'Dia_Semana','Faixa'
    0       , 'A'
    0       , 'B'  
    0       , 'C'  
    1       , 'A'  
    1       , 'B'  
    1       , 'C'  

For each row of df, I need to add all 21 rows of data.

  • Can you give us an example of what df and data look like?

  • Good afternoon, @N.Peterson. If your goal is just to concatenate pandas dataframes, I suggest the library's concat function. The documentation is here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html

2 answers

0

You need the Cartesian product of those two dataframes. One option is to use merge, but that requires creating a key to join the two dataframes; here it is called k. At the end of the operation, drop that key. Example code below:

import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})


df2 = pd.DataFrame({'C': ['C0', 'C1'],
                      'D': ['D0', 'D1']})

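# Temporary constant key so every row of df1 matches every row of df2
df1['k'] = 1
df2['k'] = 1

# Merging on the constant key yields the Cartesian product
resultado = pd.merge(df1, df2, on=['k'])

# Drop the helper key from the result
resultado = resultado.drop(["k"], axis=1)
print(resultado.head(100))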

##########
resultado
    A   B   C   D
0  A0  B0  C0  D0
1  A0  B0  C1  D1
2  A1  B1  C0  D0
3  A1  B1  C1  D1
4  A2  B2  C0  D0
5  A2  B2  C1  D1
6  A3  B3  C0  D0
7  A3  B3  C1  D1
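
On pandas 1.2 or later, the same Cartesian product can probably be obtained without the helper key by passing how='cross' to merge. A minimal sketch, reusing the column names from the question; the rows are a small made-up subset for illustration, not the real 81476 x 21 data:

```python
import pandas as pd

# Small stand-ins for the question's df and data (illustrative values only)
df = pd.DataFrame({'Valor': [44.99, 61.49, 102.24],
                   'Clase': ['A', 'B', 'C'],
                   'Tempo': [5, 8, 6]})
data = pd.DataFrame({'Dia_Semana': [0, 0, 1],
                     'Faixa': ['A', 'B', 'A']})

# how='cross' (pandas >= 1.2) pairs every row of df with every row of data
resultado = df.merge(data, how='cross')

print(resultado.shape)  # (9, 5): 3 x 3 rows, 3 + 2 columns
```

With the sizes from the question this would give 81476 * 21 rows, so it is worth checking that the result fits in memory.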

0

Another option is to use pd.concat:

dfA = pd.read_csv('yourfile_A.csv')
dfB = pd.read_csv('yourfile_B.csv')

df = pd.concat([dfA, dfB], axis=1)

The axis argument (axis=0 or axis=1) determines whether the dataframes are joined by row (equivalent to append) or by column.
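
To make the difference between the two axes concrete, here is a small sketch with two made-up frames (the names by_row and by_col are only for illustration):

```python
import pandas as pd

dfA = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
dfB = pd.DataFrame({'C': ['C0', 'C1'], 'D': ['D0', 'D1']})

# axis=0 stacks the rows; columns missing from one frame are filled with NaN
by_row = pd.concat([dfA, dfB], axis=0, ignore_index=True)

# axis=1 places the frames side by side, aligning rows on the index
by_col = pd.concat([dfA, dfB], axis=1)

print(by_row.shape)  # (4, 4)
print(by_col.shape)  # (2, 4)
```

Note that axis=1 aligns rows by index, so neither form pairs each row of one frame with every row of the other.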
