Data manipulation with R

Hi, I have a data frame of school records and I need to calculate the number of evasions (student dropouts). The program is more or less the one below, but written this way it takes a long time to finish running. Does anyone know how I could do this differently using some library (preferably ddply)?

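Evasao <- function(row1, df){
  for(row2 in seq_len(nrow(df))){
    # Same Id with a record in the previous census year: not an evasion.
    if(df[row1,"Id"] == df[row2,"Id"] && (df[row2, "NU_ANO_CENSO"]+1) == df[row1, "NU_ANO_CENSO"])
      return(FALSE)  # a bare FALSE here would be discarded and the loop would continue
  }
  TRUE
}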

# 1 - evasion, 0 - no evasion
df <- dados_padronizados_ensino_medio_com_eja_2007_a_2018_com_id
df$EVASAO <- 0
for(row in 1:nrow(df)){
  #print(df["TP_ETAPA_ENSINO"])
  if(df[row,"TP_ETAPA_ENSINO"] == 25 | df[row,"TP_ETAPA_ENSINO"] == 26 ){
    if(Evasao(row, df)){
      df[row,"EVASAO"] <- 1
    }
  }
}
  • Try running Evasao <- cmpfun(Evasao) right after you have defined the Evasao function. It creates a compiled, lower-level version of your Evasao function; then compare the computation time. And report back here, ok?

  • @Guilhermeparreira thanks, man! I'll try it here.

  • Can you explain some of the tests in the function? Or post the output of dput(head(dados))? This should improve with some vector operations instead of for loops.

  • The code in the question seems to be a clear case of code that is easy to vectorize. It certainly does not need for loops, nor (almost certainly) cmpfun, to be much faster. But we need test data.

  • This code seems to have been written by a programmer not used to R; you can replace this row-by-row check with a lag or lead function, for example.

  • Hello @Noisy, when would it be worthwhile to use cmpfun? A friend of mine once used it and his code got much faster (his functions involved Laplace integration and optimization). When I tried it on a function I wrote, in which I compare 7 different prediction methods via the forecast package, there was no computational gain.
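
The vectorization suggested in the comments above could look roughly like the following. This is a minimal sketch in base R, assuming the intent is exactly what the loop tests: a row in stage 25 or 26 is flagged when the same Id has no record in the previous census year. The toy data and the names key_now, key_prev, has_prev_year and in_stage are invented for illustration; only Id, NU_ANO_CENSO, TP_ETAPA_ENSINO and EVASAO come from the question.

```r
# Toy data: column names follow the question, values are made up.
df <- data.frame(
  Id              = c(1, 1, 2, 3, 3),
  NU_ANO_CENSO    = c(2007, 2008, 2008, 2007, 2009),
  TP_ETAPA_ENSINO = c(25, 26, 25, 27, 26)
)

# One key per row, and the key the same Id would have one year earlier.
key_now  <- paste(df$Id, df$NU_ANO_CENSO)
key_prev <- paste(df$Id, df$NU_ANO_CENSO - 1)

# TRUE when some row holds the same Id in the previous census year.
has_prev_year <- key_prev %in% key_now

# Only stages 25 and 26 are considered, as in the original loop.
in_stage <- df$TP_ETAPA_ENSINO %in% c(25, 26)

# 1 - evasion, 0 - no evasion
df$EVASAO <- as.integer(in_stage & !has_prev_year)
```

This replaces the nested loops with a single membership test over all rows. If the intended check is instead absence in the following year, build key_prev with df$NU_ANO_CENSO + 1.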

No answers
