How to create a column with specific conditions?

Asked

Viewed 84 times

1

I have a database that way:

      cod  ano             MUNICIPIO pmat
1  110001 1998 ALTA FLORESTA D OESTE   NA
2  110001 1999 ALTA FLORESTA D OESTE   NA
3  110001 2000 ALTA FLORESTA D OESTE   NA
4  110001 2001 ALTA FLORESTA D OESTE   NA
5  110001 2002 ALTA FLORESTA D OESTE   NA
6  110001 2003 ALTA FLORESTA D'OESTE   NA
7  110001 2004 ALTA FLORESTA D OESTE   NA
8  110001 2005 ALTA FLORESTA D OESTE   NA
9  110001 2006 ALTA FLORESTA D OESTE   NA
10 110001 2007 ALTA FLORESTA D OESTE   NA
11 110001 2008 ALTA FLORESTA D OESTE   NA
12 110001 2009 ALTA FLORESTA D OESTE   NA
13 110001 2010 ALTA FLORESTA D OESTE   NA
14 110001 2011 ALTA FLORESTA D OESTE   NA
15 110001 2012 ALTA FLORESTA D OESTE   NA
16 110001 2013 ALTA FLORESTA D OESTE   NA
17 110001 2014 ALTA FLORESTA D OESTE   NA
18 110001 2015 ALTA FLORESTA D OESTE   NA
19 110001 2016 ALTA FLORESTA D OESTE   NA
20 110002 1998             ARIQUEMES    0
21 110002 1999             ARIQUEMES    0
22 110002 2000             ARIQUEMES    0
23 110002 2001             ARIQUEMES    0
24 110002 2002             ARIQUEMES    0
25 110002 2003             ARIQUEMES    0
26 110002 2004             ARIQUEMES    1
27 110002 2005             ARIQUEMES    1
28 110002 2006             ARIQUEMES    1
29 110002 2007             ARIQUEMES    1
30 110002 2008             ARIQUEMES    1
31 110002 2009             ARIQUEMES    1
32 110002 2010             ARIQUEMES    1
33 110002 2011             ARIQUEMES    1
34 110002 2012             ARIQUEMES    1
35 110002 2013             ARIQUEMES    1
36 110002 2014                          1
37 110002 2015             ARIQUEMES    1
38 110002 2016             ARIQUEMES    1
39 110003 1998                CABIXI   NA
40 110003 1999                CABIXI   NA
41 110003 2000                CABIXI   NA
42 110003 2001                CABIXI   NA
43 110003 2002                CABIXI   NA
44 110003 2003                CABIXI   NA
45 110003 2004                CABIXI   NA
46 110003 2005                CABIXI   NA
47 110003 2006                CABIXI   NA
48 110003 2007                CABIXI   NA
49 110003 2008                CABIXI   NA
50 110003 2009                CABIXI   NA

I would like to create a new variable ano_contrat1, in which, for each municipality, if pmat is one in a given year, this new variable is constant in the period for that given municipality i with the year in which it becomes 1. What would look more or less like this:

      cod  ano             MUNICIPIO pmat ano_contrat1
1  110001 1998 ALTA FLORESTA D OESTE   NA            0
2  110001 1999 ALTA FLORESTA D OESTE   NA            0
3  110001 2000 ALTA FLORESTA D OESTE   NA            0
4  110001 2001 ALTA FLORESTA D OESTE   NA            0
5  110001 2002 ALTA FLORESTA D OESTE   NA            0
6  110001 2003 ALTA FLORESTA D'OESTE   NA            0
7  110001 2004 ALTA FLORESTA D OESTE   NA            0
8  110001 2005 ALTA FLORESTA D OESTE   NA            0
9  110001 2006 ALTA FLORESTA D OESTE   NA            0
10 110001 2007 ALTA FLORESTA D OESTE   NA            0
11 110001 2008 ALTA FLORESTA D OESTE   NA            0
12 110001 2009 ALTA FLORESTA D OESTE   NA            0
13 110001 2010 ALTA FLORESTA D OESTE   NA            0
14 110001 2011 ALTA FLORESTA D OESTE   NA            0
15 110001 2012 ALTA FLORESTA D OESTE   NA            0
16 110001 2013 ALTA FLORESTA D OESTE   NA            0
17 110001 2014 ALTA FLORESTA D OESTE   NA            0
18 110001 2015 ALTA FLORESTA D OESTE   NA            0
19 110001 2016 ALTA FLORESTA D OESTE   NA            0
20 110002 1998             ARIQUEMES    0         2004
21 110002 1999             ARIQUEMES    0         2004
22 110002 2000             ARIQUEMES    0         2004
23 110002 2001             ARIQUEMES    0         2004
24 110002 2002             ARIQUEMES    0         2004
25 110002 2003             ARIQUEMES    0         2004
26 110002 2004             ARIQUEMES    1         2004
27 110002 2005             ARIQUEMES    1         2004
28 110002 2006             ARIQUEMES    1         2004
29 110002 2007             ARIQUEMES    1         2004
30 110002 2008             ARIQUEMES    1         2004
31 110002 2009             ARIQUEMES    1         2004
32 110002 2010             ARIQUEMES    1         2004
33 110002 2011             ARIQUEMES    1         2004
34 110002 2012             ARIQUEMES    1         2004
35 110002 2013             ARIQUEMES    1         2004
36 110002 2014                          1         2004
37 110002 2015             ARIQUEMES    1         2004
38 110002 2016             ARIQUEMES    1         2004
39 110003 1998                CABIXI   NA            0
40 110003 1999                CABIXI   NA            0
41 110003 2000                CABIXI   NA            0
42 110003 2001                CABIXI   NA            0
43 110003 2002                CABIXI   NA            0
44 110003 2003                CABIXI   NA            0
45 110003 2004                CABIXI   NA            0
46 110003 2005                CABIXI   NA            0
47 110003 2006                CABIXI   NA            0
48 110003 2007                CABIXI   NA            0
49 110003 2008                CABIXI   NA            0
50 110003 2009                CABIXI   NA            0

Can you help me? Thank you!

  • Do you want to create a new variable based on another? For example, if pmat is 0 the year is 2004. That would be it?

  • Yes, based on another, but not that way. Take for example the first city: pmat is always 0, so the ano_contrat1 variable is 0 for that city. Now take for example the city ARIQUEMES, pmat is 1 from 2004, then the new variable will be 2004 for it (is constant in time)

1 answer

5


The following solves the question with lapply/split.
Divide the table dados for cod with the function split and an anonymous function is applied to each sub-dataframe.

res <- lapply(split(dados, dados$cod), function(DF){
  DF[["ano_contrat1"]] <- 0
  i <- DF[["pmat"]] == 1
  if(any(!is.na(i)) && any(i)) i <- min(which(i)) else i <- Inf
  if(is.finite(i)){
    DF[["ano_contrat1"]] <- DF[["ano"]][i]
  }
  DF
})

res <- do.call(rbind, res)
row.names(res) <- NULL
res

DICE format dput.

dados <-
structure(list(cod = c(110001L, 110001L, 110001L, 110001L, 110001L, 
110001L, 110001L, 110001L, 110001L, 110001L, 110001L, 110001L, 
110001L, 110001L, 110001L, 110001L, 110001L, 110001L, 110001L, 
110002L, 110002L, 110002L, 110002L, 110002L, 110002L, 110002L, 
110002L, 110002L, 110002L, 110002L, 110002L, 110002L, 110002L, 
110002L, 110002L, 110002L, 110002L, 110002L, 110003L, 110003L, 
110003L, 110003L, 110003L, 110003L, 110003L, 110003L, 110003L, 
110003L, 110003L, 110003L), ano = c(1998L, 1999L, 2000L, 2001L, 
2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 
2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 1998L, 1999L, 2000L, 
2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 
2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 1998L, 1999L, 
2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 
2009L), MUNICIPIO = c("FLORESTA D OESTE", "FLORESTA D OESTE", 
"FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", 
"FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", 
"FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", 
"FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", "FLORESTA D OESTE", 
"FLORESTA D OESTE", "ARIQUEMES", "ARIQUEMES", "ARIQUEMES", "ARIQUEMES", 
"ARIQUEMES", "ARIQUEMES", "ARIQUEMES", "ARIQUEMES", "ARIQUEMES", 
"ARIQUEMES", "ARIQUEMES", "ARIQUEMES", "ARIQUEMES", "ARIQUEMES", 
"ARIQUEMES", "ARIQUEMES", NA, "ARIQUEMES", "ARIQUEMES", "CABIXI", 
"CABIXI", "CABIXI", "CABIXI", "CABIXI", "CABIXI", "CABIXI", "CABIXI", 
"CABIXI", "CABIXI", "CABIXI", "CABIXI"), pmat = c(NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), row.names = c(NA, 
-50L), spec = structure(list(cols = list(X1 = structure(list(), class = c("collector_integer", 
"collector")), cod = structure(list(), class = c("collector_integer", 
"collector")), ano = structure(list(), class = c("collector_integer", 
"collector")), X4 = structure(list(), class = c("collector_character", 
"collector")), MUNICIPIO = structure(list(), class = c("collector_character", 
"collector")), pmat = structure(list(), class = c("collector_integer", 
"collector"))), default = structure(list(), class = c("collector_guess", 
"collector"))), class = "col_spec"), class = c("tbl_df", "tbl", 
"data.frame"))

Browser other questions tagged

You are not signed in. Login or sign up in order to post.