How to add a new column with the group average in pandas?

Asked

Viewed 63 times

0

I’m trying to insert a new column with average home goals per team .

I calculated the average, but I cannot insert this value in a new column.

database: https://www.football-data.co.uk/brazil.phpinserir a descrição da imagem aqui

My code:

dataset = dataset.drop ('League', axis = 1)
dataset = dataset.drop ('Country', axis = 1)
dataset = dataset.drop ('Time', axis = 1)
display(dataset)

dataset = dataset.drop ('Date', axis = 1)
dataset.columns = ["Temporada","Home","Away","Gols Casa", "Gols Fora", "Resultado", "Odds Casa","Odds Empate","Odds Fora"]

dataset.head()
Media_GolsCasa = dataset[['Home','Gols Casa']].groupby('Home').mean()
Media_GolsCasa
  • can you provide the database? Only with the description is difficult to understand the problem. Also consider replacing the image with a markdown formatted table

  • Thanks for the tip , insert the bank to make it easier to understand .

  • If one of the answers below solved your problem and there was no doubt left, choose the one you liked the most and mark it as correct/accepted by clicking on the " " that is next to it, which also marks your question as solved. If you still have any doubt or.

2 answers

2

Importing the pandas

import pandas as pd

Reading the dataset

dataset = pd.read_excel('./BRA.xlsx')

Removing unnecessary columns

dataset.drop(columns=['League','Country','Time','Date'], inplace=True)

Renaming columns

dataset.columns = ["Temporada","Home","Away","Gols Casa", "Gols Fora", 
                   "Resultado", "Odds Casa","Odds Empate","Odds Fora",
                  'MaxH','MaxD','MaxA','AvgH','AvgD','AvgA']

Grouping to calculate the average

Media_GolsCasa = dataset[['Home','Gols Casa']].groupby('Home').mean()
Media_GolsCasa.rename(columns={'Gols Casa':'Média gols Casa'}, inplace=True)

Joining one data frame with the other

dataset.join(Media_GolsCasa, on='Home')

Exit

|     | Temporada | Home           | Away           | Gols Casa | Gols Fora | Resultado | Odds Casa | Odds Empate | Odds Fora | MaxH | MaxD | MaxA  | AvgH | AvgD | AvgA  | Média gols Casa    |
|-----|-----------|----------------|----------------|-----------|-----------|-----------|-----------|-------------|-----------|------|------|-------|------|------|-------|--------------------|
| 0   | 2012      | Palmeiras      | Portuguesa     | 1.0       | 1.0       | D         | 1.75      | 3.86        | 5.25      | 1.76 | 3.87 | 5.31  | 1.69 | 3.5  | 4.9   | 1.7133333333333334 |
| 1   | 2012      | Sport Recife   | Flamengo RJ    | 1.0       | 1.0       | D         | 2.83      | 3.39        | 2.68      | 2.83 | 3.42 | 2.7   | 2.59 | 3.23 | 2.58  | 1.3282442748091603 |
| 2   | 2012      | Figueirense    | Nautico        | 2.0       | 1.0       | H         | 1.6       | 4.04        | 6.72      | 1.67 | 4.05 | 7.22  | 1.59 | 3.67 | 5.64  | 1.131578947368421  |
| 3   | 2012      | Botafogo RJ    | Sao Paulo      | 4.0       | 2.0       | H         | 2.49      | 3.35        | 3.15      | 2.49 | 3.39 | 3.15  | 2.35 | 3.26 | 2.84  | 1.3377483443708609 |
| 4   | 2012      | Corinthians    | Fluminense     | 0.0       | 1.0       | A         | 1.96      | 3.53        | 4.41      | 1.96 | 3.53 | 4.41  | 1.89 | 3.33 | 3.89  | 1.488235294117647  |
| 5   | 2012      | Internacional  | Coritiba       | 2.0       | 0.0       | H         | 1.76      | 3.63        | 5.66      | 1.77 | 3.9  | 6.0   | 1.66 | 3.55 | 5.1   | 1.576158940397351  |
| 6   | 2012      | Ponte Preta    | Atletico-MG    | 0.0       | 1.0       | A         | 3.12      | 3.45        | 2.44      | 3.12 | 3.45 | 2.62  | 2.74 | 3.23 | 2.45  | 1.2842105263157895 |

...

2

An alternative to the lmonferrari solution is to use the method transform:

dataset['Media_GolsCasa']=dataset.groupby("Home")["Gols Casa"].transform("mean")
  • Very good the solution!

Browser other questions tagged

You are not signed in. Login or sign up in order to post.