How to check a data frame against a vector of valid values?

I have a vector of all the valid values that can appear in a set of columns of a data frame. The length of the vector differs from the number of observations in the data frame.

My goal is to identify the invalid observations in the data frame. The solution below works, but I would like to avoid the for loop.

DF is a data frame with 4 columns named da1 to da4 and 9 rows of observations. Both DF and "check" were imported from an Excel spreadsheet; "check" is a table with a single column.

str(DF)
DF1 <- as.data.frame(DF) # convert DF to a plain data frame
str(DF1)

result is a logical vector whose length equals the number of rows of DF1.

result1 is a logical data frame created from result.

result <- logical(length=9) 
result1 <- as.data.frame(result)

The loop below checks, for each column of DF1, which of its elements are in "check". result1 is the resulting logical data frame.

for (col in 1:4) {
  diag <- DF1[, col]
  result1[, col] <- is.element(diag, check)
}
result1
  • result1 <- sapply(DF1, function(x) x %in% check).

1 answer

No explicit for loop is required; an *apply function is simpler.

result1 <- sapply(DF1, function(x) x %in% check)
result1 <- as.data.frame(result1)

result1
#     V1    V2    V3    V4
#1 FALSE FALSE FALSE  TRUE
#2  TRUE FALSE FALSE  TRUE
#3 FALSE  TRUE  TRUE  TRUE
#4 FALSE  TRUE FALSE  TRUE
#5 FALSE  TRUE FALSE FALSE
#6 FALSE  TRUE  TRUE  TRUE
#7  TRUE  TRUE FALSE FALSE
#8  TRUE  TRUE  TRUE  TRUE
#9 FALSE FALSE FALSE  TRUE
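
As a variation on the sapply call above, here is a minimal sketch that uses lapply, so the result never passes through an intermediate matrix and the original column names are kept. The data below are hypothetical stand-ins, not the asker's spreadsheet.

```r
# Hypothetical example data (not the data from the question)
DF1 <- data.frame(da1 = c(1, 2, 3), da2 = c(4, 5, 6))
check <- c(1, 2, 4, 6)

# lapply() returns a list with one logical vector per column;
# as.data.frame() turns that list into a data frame with the same column names
result1 <- as.data.frame(lapply(DF1, function(x) x %in% check))
result1
```

Whether this is preferable to sapply followed by as.data.frame is mostly a matter of taste; for a small table like this one both should behave the same.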

Since the question uses is.element, note that it gives identical results.

identical(
  sapply(DF1, is.element, check), 
  sapply(DF1, function(x) x %in% check)
)
#[1] TRUE
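
The two agree because, as far as the base R documentation describes them, both is.element and %in% are thin wrappers around match(). A small sketch with hypothetical values illustrates this:

```r
x <- c(2, 7, 11)        # values to test (hypothetical)
valid <- c(1, 2, 3, 7)  # set of valid values (hypothetical)

# All three expressions return one logical per element of x
a <- is.element(x, valid)
b <- x %in% valid
d <- match(x, valid, nomatch = 0) > 0

a
#[1]  TRUE  TRUE FALSE
identical(a, b) && identical(a, d)
#[1] TRUE
```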

Data

set.seed(2020)
n <- 9
DF1 <- replicate(4, sample(10, n, TRUE))
DF1 <- as.data.frame(DF1)
check <- sample(10, 6)
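
The question asks for the invalid observations, so the logical table still has to be reduced to rows. A possible sketch, assuming that a row counts as invalid when at least one of its values is not in check (the question does not state this rule explicitly); the names valid and invalid_rows are introduced here only for illustration:

```r
# Same simulated data as above
set.seed(2020)
n <- 9
DF1 <- as.data.frame(replicate(4, sample(10, n, TRUE)))
check <- sample(10, 6)

# Logical matrix: TRUE where the value is one of the valid values
valid <- sapply(DF1, function(x) x %in% check)

# Assumption: a row is invalid if it has at least one value not in check
invalid_rows <- which(rowSums(!valid) > 0)

# Rows of DF1 that contain at least one invalid value
DF1[invalid_rows, ]

# Row and column positions of every invalid cell
which(!valid, arr.ind = TRUE)
```

If a row should instead be flagged only when all of its values are invalid, the condition would become rowSums(valid) == 0.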
  • 1

    What does is.element check? Whether something is an element? Is "element" some kind of R data type?

  • 1

    In R, a vector can be what we usually think of as a vector (in the mathematical sense), but it can also be an object of class "list". "Element" is not a data type; it is a generic term, the one preferred for vectors, while for lists "member" is preferred. For the function is.element, see help('is.element'). Good question, by the way.
