Classify a column of a data frame in R

2

I have a data frame with 89000 rows, and one column holds the degree of kinship with the employee. I need to divide it into 4 classes, namely:

  • Class 1 - Spouse/Children
  • Class 2 - Mother/Father
  • Class 3 - Siblings
  • OTHERS - Other Kinship

I need to create a column in the data frame that holds the kinship class (I need to keep the original kinship in the data). I did it with a series of nested ifelse calls, but I wonder if there is a more "elegant" solution.

ifelse(base.dados$Parentesco %in% classe1, base.dados$CLASSE <- "CLASSE 1",
                              ifelse(base.dados$Parentesco %in% classe2, base.dados$CLASSE <- "CLASSE 2",
                                     ifelse(base.dados$Parentesco %in% classe3, base.dados$CLASSE <- "CLASSE 3", "OUTRAS")))
  • Please add tags to identify the language, and try to be clearer; I couldn’t quite understand your question.

  • See if it’s clearer now, Samuel.

  • Instead of base.dados$CLASSE <- "CLASSE 1", write just "CLASSE 1", and the same for the others. More precisely, assign the result once: base.dados$CLASSE <- ifelse(...etc...).
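
The correction suggested in the last comment can be sketched as follows. The sample data here is hypothetical (the question does not show base.dados), and the class vectors are assumed to hold the kinship labels of each class; only the single outer assignment is the point.

```r
# Hypothetical class vectors and sample data, for illustration only
classe1 <- c("Conjuge", "Filho", "Filha")
classe2 <- c("Mãe", "Pai")
classe3 <- c("Irmão", "Irmã")

base.dados <- data.frame(
    Parentesco = c("Filho", "Pai", "Irmã", "Tio", "Conjuge"),
    stringsAsFactors = FALSE
)

# Each branch returns a plain string; the whole result is assigned once
base.dados$CLASSE <- ifelse(base.dados$Parentesco %in% classe1, "CLASSE 1",
                     ifelse(base.dados$Parentesco %in% classe2, "CLASSE 2",
                     ifelse(base.dados$Parentesco %in% classe3, "CLASSE 3",
                            "OUTRAS")))
base.dados
```

Because ifelse is vectorized, every row is classified in a single pass, and rows that match none of the three vectors fall through to "OUTRAS".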

2 answers

3

Since we don’t have an example of base.dados, I created a data.frame. If you want to avoid so many ifelse calls, you can do something like this.

set.seed(6399)  # Makes the code reproducible

classe1 <- c("Conjugue", "Filho", "Filha")
classe2 <- c("Mãe", "Pai")
classe3 <- c("Irmão", "Irmã")
classe4 <- c("Tio", "Tia", "Avô", "Avó")

base.dados <- data.frame(
    ID = 1:20,
    Parentesco = sample(c(classe1, classe2, classe3, classe4), 20, TRUE)
)
base.dados

base.dados$CLASSE <- "OUTRAS"
base.dados$CLASSE[base.dados$Parentesco %in% classe1] <- "CLASSE 1"
base.dados$CLASSE[base.dados$Parentesco %in% classe2] <- "CLASSE 2"
base.dados$CLASSE[base.dados$Parentesco %in% classe3] <- "CLASSE 3"

If the data contains NA values, wrap the logical index in which(). The first line stays the same; only the others change.

base.dados$CLASSE <- "OUTRAS"
base.dados$CLASSE[which(base.dados$Parentesco %in% classe1)] <- "CLASSE 1"

And the same for other classes.
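
A small sketch of how NA behaves in these indexes may help in deciding whether which() is needed. The vector parentesco below is hypothetical. As far as I know, %in% returns FALSE rather than NA for a missing value, so the NA concern applies mainly to indexes built with comparisons such as ==.

```r
# Hypothetical kinship vector with a missing value
parentesco <- c("Pai", NA, "Tio")
classe2 <- c("Mãe", "Pai")

in_idx <- parentesco %in% classe2   # %in% yields FALSE for the NA element
eq_idx <- parentesco == "Pai"       # == propagates the NA

in_idx
eq_idx
which(eq_idx)                       # which() keeps only the TRUE positions
```

With which(), the index becomes a vector of integer positions, so missing values can never reach the assignment.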

2


For me, the most elegant way would be to write a function that replaces the chain of ifelse calls and can be generalized to other situations. Example:

classes_parentescos <- list("CLASSE 1"=c("conjuge", "filho"), 
                "CLASSE 2"=c("mae", "pai"), 
                "CLASSE 3"=c("outros")
                )

get_class_name <- function(x, classes=classes_parentescos){
        pos <- grep(x, classes)
        names(classes[pos])
}

base.dados$CLASSE <- sapply(base.dados$Parentesco, get_class_name)
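
One caveat with the function above: grep does regular-expression matching against each list element coerced to a string, so a partial match can be accepted, and a value with no match returns character(0), which makes sapply return a list. A possible variant with exact matching, sketched under the assumption that a named lookup vector is acceptable (the names lookup, parentesco and classe are illustrative, not from the answer):

```r
classes_parentescos <- list("CLASSE 1" = c("conjuge", "filho"),
                            "CLASSE 2" = c("mae", "pai"))

# Named vector mapping each kinship label to its class name
lookup <- setNames(
    rep(names(classes_parentescos), lengths(classes_parentescos)),
    unlist(classes_parentescos, use.names = FALSE)
)

# Hypothetical input, including an unlisted value and a missing value
parentesco <- c("filho", "pai", "tio", NA)

classe <- unname(lookup[parentesco])   # exact match by name; NA if not found
classe[is.na(classe)] <- "OUTRAS"      # everything unmatched goes to the default
classe
```

The lookup is vectorized, so no sapply is needed, and adding a class only requires a new entry in the list.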
  • It’s giving an error: promise already under evaluation: recursive default argument reference or earlier problems?. Could it be because of classes=classes?

  • Strange, on my machine the code runs without any warning, and this way of writing arguments usually gives me no problem either. Did you try removing the classes argument from the function to see if that eliminates the problem?

  • Yes, when I removed it, the code ran fine. Then I changed the name of the argument, keeping the default value, to class = classes, and there was no problem either. (Inside grep it became class as well, of course.)

  • The following code, which is simpler and has nothing to do with the OP’s issue, reproduces the problem: f<-function(y,x=x)2*y+x;x<-1:5;sapply(6:10,f).

  • Indeed, now I get this error too... Judging by this answer here, it seems to be an environment conflict. It isn’t clear to me exactly what happens, but it’s good to know that R does not handle these situations well.

  • I edited the answer so other people don’t go through this problem.
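
The error discussed in these comments can be reproduced in a few lines. Default arguments in R are evaluated lazily inside the function's own environment, so a default such as x = x refers to the argument itself rather than to a global x. The function names f_bad and f_ok below are illustrative only.

```r
x <- 1:5

# The default `x = x` refers to the argument itself, not the global x
f_bad <- function(y, x = x) 2 * y + x
res <- tryCatch(f_bad(6), error = function(e) conditionMessage(e))
res   # holds the error message instead of a numeric result

# Giving the argument a different name avoids the self-reference
f_ok <- function(y, offset = x) 2 * y + offset
f_ok(6)
```

This matches the fix reported in the thread: renaming either the argument or the global object removes the recursion.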
