How to create a variable by averaging another variable from the same dataset?


Suppose I have the following data frame:

Country <- c("Brazil", "Brazil", "Brazil", "Brazil", "Brazil", "Brazil",
             "Argentina", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina")
Year <- c(91, 92, 93, 94, 95, 96, 91, 92, 93, 94, 95, 96)
period <- c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3)
values <- c(5, 3, 4, 2, 1, 1, 5, 7, 4, 4, 3, 7)
df <- data.frame(country = Country, year = Year, period = period, pib = values)

    country year period   pib
    Brazil   91      1     5
    Brazil   92      1     3
    Brazil   93      2     4
    Brazil   94      2     2
    Brazil   95      3     1
    Brazil   96      3     1
 Argentina   91      1     5
 Argentina   92      1     7
 Argentina   93      2     4
 Argentina   94      2     4
 Argentina   95      3     3
 Argentina   96      3     7       

From this data I want to create a new variable called media, holding the average GDP (pib) for each country in each period, so that the final result would be:

   country year period pib media
   Brazil   91      1   5     4
   Brazil   92      1   3     4
   Brazil   93      2   4     3
   Brazil   94      2   2     3
   Brazil   95      3   1     1
   Brazil   96      3   1     1
Argentina   91      1   5     6
Argentina   92      1   7     6
Argentina   93      2   4     4
Argentina   94      2   4     4
Argentina   95      3   3     5
Argentina   96      3   7     5

I have no idea how to do this, but I believe there is a way. Can someone point me in the right direction?

PS: I tried to create the best possible example, but I’m still a beginner.

1 answer


You can use dplyr as follows:

library(dplyr)
df %>%
  group_by(country, period) %>%
  mutate(media = mean(pib))

Source: local data frame [12 x 5]
Groups: country, period [6]

     country  year period   pib media
      <fctr> <dbl>  <dbl> <dbl> <dbl>
1     Brazil    91      1     5     4
2     Brazil    92      1     3     4
3     Brazil    93      2     4     3
4     Brazil    94      2     2     3
5     Brazil    95      3     1     1
6     Brazil    96      3     1     1
7  Argentina    91      1     5     6
8  Argentina    92      1     7     6
9  Argentina    93      2     4     4
10 Argentina    94      2     4     4
11 Argentina    95      3     3     5
12 Argentina    96      3     7     5
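
As an aside, the same per-group average can probably be obtained without dplyr. The sketch below is not part of the original answer: it swaps in base R's ave(), which applies a function within each combination of the grouping vectors and returns a vector as long as the input. It assumes the same df as in the question, rebuilt here so the snippet runs on its own.

```r
# Rebuild the example data from the question
df <- data.frame(
  country = rep(c("Brazil", "Argentina"), each = 6),
  year    = rep(91:96, times = 2),
  period  = rep(c(1, 1, 2, 2, 3, 3), times = 2),
  pib     = c(5, 3, 4, 2, 1, 1, 5, 7, 4, 4, 3, 7)
)

# ave() computes FUN within each country/period combination and
# repeats the group result on every row of that group
df$media <- ave(df$pib, df$country, df$period, FUN = mean)

print(df)
```

This is only an alternative for cases where adding a package is not desirable; the dplyr pipeline remains the approach used in the rest of this answer.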

To create the variable and save it to the object df, use:

df <- df %>%
  group_by(country, period) %>%
  mutate(media = mean(pib)) %>%
  ungroup()

The ungroup() call is not strictly necessary, but it is recommended so that subsequent operations are not carried out by group.
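
To illustrate why ungrouping matters, here is a minimal sketch that runs the same follow-up mutate() with and without ungroup(). The column name overall and the object names grouped, still_grouped and ungrouped are made up for this illustration; the data is the df from the question.

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(
  country = rep(c("Brazil", "Argentina"), each = 6),
  year    = rep(91:96, times = 2),
  period  = rep(c(1, 1, 2, 2, 3, 3), times = 2),
  pib     = c(5, 3, 4, 2, 1, 1, 5, 7, 4, 4, 3, 7)
)

grouped <- df %>%
  group_by(country, period) %>%
  mutate(media = mean(pib))

# Grouping is still active: mean(pib) is computed per country/period,
# so `overall` (hypothetical column) just repeats `media`
still_grouped <- grouped %>%
  mutate(overall = mean(pib))

# After ungroup(): mean(pib) is computed over all 12 rows
ungrouped <- grouped %>%
  ungroup() %>%
  mutate(overall = mean(pib))

is_grouped_df(grouped)    # TRUE
is_grouped_df(ungrouped)  # FALSE
```

Without ungroup(), a later step that is meant to work on the whole table silently keeps working per group, which is the situation the recommendation above is meant to avoid.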
