A - How to create a lagged variable (lag) conditioned on the individual?


3

I need to lag a variable in my data frame (dCoopCred). However, the lag must not mix values from two different individuals (CNPJ). I would like LAG_Result_ant_desp to be Result_ant_desp at t-1 (the previous period).

Example:

structure(list(CNPJ = c(5834, 5834, 5834, 5834, 5834, 9797, 9797, 
9797, 9797, 9797), ano = c(2006, 2007, 2008, 2009, 2010, 2006, 
2007, 2008, 2009, 2010), PIB = c(4, 6, 5, 1, 7, 4, 6, 5, 1, 7
), Result_ant_desp = c(5000, 7000, 6000, 2000, 3500, 1500, 2600, 
3000, 2100, 3100), LAG_Result_ant_desp = structure(c(9L, 6L, 
8L, 7L, 2L, 9L, 1L, 4L, 5L, 3L), .Label = c("1500", "2000", "2100", 
"2600", "3000", "5000", "6000", "7000", "N/A"), class = "factor")), class = "data.frame", row.names = c(NA, 
-10L))

I managed to lag by one period using the Hmisc package and the command

dCoopCred$LAG_result_ant_desp <- Lag(dCoopCred$result_ant_desp, +1)

However, on its own this command mixes result_ant_desp values across different CNPJs: the first year of one CNPJ receives the last value of the previous CNPJ.

I am also using this code:

teste <- dCoopCred %>% 
  distinct(CNPJ, ano, .keep_all = TRUE) %>% 
  group_by(CNPJ) %>% 
  mutate(LAG_result_ant_desp = lag(result_ant_desp, n = 1L)) %>% 
  select(-result_ant_desp) %>% 
  ungroup() %>% 
  left_join(dCoopCred, ., by = c("ano", "CNPJ")) 

This does what I want, but it creates another data frame (teste). I would like the variable to be created in dCoopCred itself.

  • Unfortunately, this question cannot be reproduced by anyone trying to answer it. Please take a look at this link to see how to ask a reproducible question in R, so that people who wish to help can do so in the best possible way.

  • I am using the following code:

    teste <- dCoopCred %>%
      distinct(CNPJ, ano, .keep_all = TRUE) %>%
      group_by(CNPJ) %>%
      mutate(LAG_result_ant_desp = lead(result_ant_desp, n = 1L)) %>%
      select(-result_ant_desp) %>%
      ungroup() %>%
      left_join(dCoopCred, ., by = c("ano", "CNPJ"))

  • About the previous comment: where it says "lead", read "lag".

  • @Marcusnunes I edited the question. Does it work now?

  • 1

    If the code in the second comment gives the desired result, why not assign it with dCoopCred <- ... instead of teste <- ..., so that no other object is created?

1 answer

4


There is a simpler way to do what the question asks. Instead of a pipeline with %>%, use ave, which applies a function separately within each group.
Note: the lag function that gets called here is the one from the dplyr package, not stats::lag.

library(dplyr)

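# ave() treats every unnamed argument after x as a grouping variable,
# so the lag size is set inside FUN rather than passed as an extra argument
dCoopCred$LAG_Result_ant_desp <- with(dCoopCred, ave(Result_ant_desp, CNPJ, FUN = function(x) dplyr::lag(x, n = 1)))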

dCoopCred
#   CNPJ  ano PIB Result_ant_desp LAG_Result_ant_desp
#1  5834 2006   4            5000                  NA
#2  5834 2007   6            7000                5000
#3  5834 2008   5            6000                7000
#4  5834 2009   1            2000                6000
#5  5834 2010   7            3500                2000
#6  9797 2006   4            1500                  NA
#7  9797 2007   6            2600                1500
#8  9797 2008   5            3000                2600
#9  9797 2009   1            2100                3000
#10 9797 2010   7            3100                2100
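
For comparison, the pipeline from the question can also be reduced to a grouped mutate whose result is assigned back to dCoopCred, so no second object is created. This is only a sketch under assumptions: the arrange step is my addition (it assumes "previous period" means the previous ano within each CNPJ), and the data frame is rebuilt here with rep() just to keep the example self-contained.

```r
library(dplyr)

# Example data from the question, first four columns only
dCoopCred <- data.frame(
  CNPJ = rep(c(5834, 9797), each = 5),
  ano = rep(2006:2010, times = 2),
  PIB = rep(c(4, 6, 5, 1, 7), times = 2),
  Result_ant_desp = c(5000, 7000, 6000, 2000, 3500,
                      1500, 2600, 3000, 2100, 3100)
)

# Sort by individual and year so that "previous row" means "previous year",
# then lag within each CNPJ and overwrite the original object
dCoopCred <- dCoopCred %>%
  arrange(CNPJ, ano) %>%
  group_by(CNPJ) %>%
  mutate(LAG_Result_ant_desp = lag(Result_ant_desp, n = 1L)) %>%
  ungroup()
```

Because the lag is computed inside group_by(CNPJ), the first year of each CNPJ gets NA instead of the last value of the previous CNPJ. Note that the result is a tibble; wrap it in as.data.frame() if a plain data frame is needed.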

Data.
Since the data in the question already include the new column, here they are with only the first four columns, in dput format.

dCoopCred <-
structure(list(CNPJ = c(5834, 5834, 5834, 5834, 5834, 9797, 9797, 
9797, 9797, 9797), ano = c(2006, 2007, 2008, 2009, 2010, 2006, 
2007, 2008, 2009, 2010), PIB = c(4, 6, 5, 1, 7, 4, 6, 5, 1, 7
), Result_ant_desp = c(5000, 7000, 6000, 2000, 3500, 1500, 2600, 
3000, 2100, 3100)), .Names = c("CNPJ", "ano", "PIB", "Result_ant_desp"
), row.names = c(NA, -10L), class = "data.frame")
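
If loading dplyr just for lag is not desirable, the same idea can probably be sketched in base R alone. The helper name lag1 below is hypothetical (it is not part of any package), and the sketch assumes the rows are already sorted by ano within each CNPJ, with no missing years.

```r
# Example data from the question, first four columns only
dCoopCred <- data.frame(
  CNPJ = rep(c(5834, 9797), each = 5),
  ano = rep(2006:2010, times = 2),
  PIB = rep(c(4, 6, 5, 1, 7), times = 2),
  Result_ant_desp = c(5000, 7000, 6000, 2000, 3500,
                      1500, 2600, 3000, 2100, 3100)
)

# Hypothetical helper: shift a vector down by one position, padding with NA
lag1 <- function(x) c(NA, head(x, -1))

# ave() applies lag1 separately to the values of each CNPJ
dCoopCred$LAG_Result_ant_desp <- with(dCoopCred, ave(Result_ant_desp, CNPJ, FUN = lag1))
```

This shifts by row position, not by calendar year, so it only matches "t-1" when every CNPJ has one row per consecutive year.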
  • Wow, much better! Thank you very much!
