How to merge every two rows of a DataFrame into a single row in Python, based on two columns

0

Dear all,

I have a DataFrame in which every two rows refer to a single record. I need to combine these rows into one, based on the "id" and "mandante" (home team) columns. The "mandante" column has the value 1 when the team is the home team.

import pandas as pd

lst = [["1001","1","LA Lakers", 105, 12],["1001","0","Utah Jazz", 99, 10], ["1002","1","Chicago Bulls", 95, 8], ["1002","0","Orlando Magic", 90, 9], ["1003","1","Denver Nuggets", 101, 17], ["1003","0","Miami Heat", 84, 6]]
df = pd.DataFrame(lst, columns = ["id", "mandante", "time", "pontuacao", "faltas"])

   id    mandante  time            pontuacao  faltas
0  1001  1         LA Lakers       105        12
1  1001  0         Utah Jazz       99         10
2  1002  1         Chicago Bulls   95         8
3  1002  0         Orlando Magic   90         9
4  1003  1         Denver Nuggets  101        17
5  1003  0         Miami Heat      84         6

I need the DataFrame to look like this:

   id    mandante        visitante      pts_mandante  pts_visitante  flts_mandante  flts_visitante
0  1001  LA Lakers       Utah Jazz      105           99             12             10
2  1002  Chicago Bulls   Orlando Magic  95            90             8              9
4  1003  Denver Nuggets  Miami Heat     101           84             17             6

It’s fine if the rows end up duplicated; I can easily delete them afterwards. The main thing is to get all the information for the same "id" into a single record.

1 answer

3


One way would be to split the DataFrame in two, home teams and visitors, rename the columns of each part as you wish, and then join them back together with merge, like this:

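# Boolean mask: True for home-team rows (the column holds the strings '1'/'0')
mask = df['mandante'] == '1'

# The flag column is not needed once the mask exists
df = df.drop(columns = ['mandante'])

# Explicit copies, so renaming their columns cannot affect df
mandante = df.loc[mask].copy()
visitante = df.loc[~mask].copy()

colunas_mandante = ['id', 'mandante', 'pts_mandante',  'flts_mandante']
colunas_visitante = ['id', 'visitante', 'pts_visitante',  'flts_visitante']

mandante.columns = colunas_mandante
visitante.columns = colunas_visitante

# One row per id: home-team columns followed by visitor columns
df_new = mandante.merge(visitante, on='id')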

# output:

    id          mandante        pts_mandante    flts_mandante   visitante       pts_visitante   flts_visitante
0   1001        LA Lakers       105             12              Utah Jazz       99              10
1   1002        Chicago Bulls   95              8               Orlando Magic   90              9
2   1003        Denver Nuggets  101             17              Miami Heat      84              6
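
If you would rather not write out the column lists by hand, a variant of the same split-and-merge idea is to let merge name the overlapping columns through its `suffixes` parameter. This is only a sketch built on the sample data from the question; the intermediate names (`base`, `home`, `away`) and the short `pts`/`flts` names are my own choices, and `validate="one_to_one"` is an optional check I added that assumes each "id" has exactly one home row and one visitor row.

```python
import pandas as pd

lst = [["1001", "1", "LA Lakers", 105, 12], ["1001", "0", "Utah Jazz", 99, 10],
       ["1002", "1", "Chicago Bulls", 95, 8], ["1002", "0", "Orlando Magic", 90, 9],
       ["1003", "1", "Denver Nuggets", 101, 17], ["1003", "0", "Miami Heat", 84, 6]]
df = pd.DataFrame(lst, columns=["id", "mandante", "time", "pontuacao", "faltas"])

# Shorten the value columns; the merge suffixes below complete the names.
base = df.rename(columns={"pontuacao": "pts", "faltas": "flts"})

# Split into home and visitor rows, dropping the flag column from both.
mask = base["mandante"] == "1"
home = base.loc[mask].drop(columns="mandante")
away = base.loc[~mask].drop(columns="mandante")

# Columns present on both sides (time, pts, flts) receive the suffixes.
# validate raises MergeError if an id appears more than once on either side.
df_new = home.merge(away, on="id",
                    suffixes=("_mandante", "_visitante"),
                    validate="one_to_one")

# Rename the team-name columns to match the layout asked for in the question.
df_new = df_new.rename(columns={"time_mandante": "mandante",
                                "time_visitante": "visitante"})

print(df_new)
```

The resulting columns come out in the same order as in the output above. If some "id" had a duplicated home or visitor row, the merge would raise an error instead of silently producing extra rows; drop the `validate` argument if you want to allow that.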
