How to optimize the removal of rows from a matrix?

I have a matrix of dimensions:

> dim(filtro1)
[1] 2806519      31

I need to remove the rows of this matrix that meet a condition. So far so good. However, the loop I use for this is computationally very expensive (it takes 8 to 10 hours). I've tried rbind, but that is even slower than the solution below:

for(i in 1:length(filtro1[,1])){
  if(filtro1[i,31] == 0){
    filtro1 <- filtro1[-i,]
    print(i)
  }
}
  • print(i) is only there so I can follow the progress of the loop

I tried to run the code in parallel with foreach and %dopar%, but apparently that does not work, because each iteration of the loop above depends on the row index.

Does anyone know a faster and more efficient way to remove rows from a matrix?

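  • Besides being time-consuming, this code is not correct. Once you remove a row, filtro1 no longer has 2806519 rows, so the remaining rows shift up and the loop index no longer points at the row you expect.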
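
To illustrate the point in the comment above, here is a minimal sketch on a tiny matrix. The matrix `m`, its column names, and the `break` guard are hypothetical and only added for the demonstration; without the guard the loop would stop with a "subscript out of bounds" error once `i` exceeds the shrinking row count.

```r
# Hypothetical 5-row stand-in for filtro1; "flag" plays the role of column 31.
m <- cbind(id = 1:5, flag = c(0, 0, 1, 1, 1))

looped <- m
for (i in 1:nrow(m)) {
  # Guard (not in the original loop): the matrix shrinks while i keeps growing.
  if (i > nrow(looped)) break
  if (looped[i, "flag"] == 0) {
    # Removing row i shifts every later row up by one,
    # so the row that moves into position i is never tested.
    looped <- looped[-i, , drop = FALSE]
  }
}

looped[, "id"]  # row 2 (flag == 0) survives because it was skipped

# Vectorized filter for comparison: tests every row exactly once.
vectorized <- m[m[, "flag"] != 0, , drop = FALSE]
vectorized[, "id"]
```

Each `looped[-i, ]` also copies the whole matrix, which is a likely reason the original loop takes hours on 2.8 million rows.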

1 answer

R is a vectorized language, so the idiomatic way to do this is to build a logical index over column 31 and subset the matrix once, with no loop:

filtro1 <- filtro1[filtro1[,31] != 0, ]

I think the best place to learn about vectorization is Chapter 3 of The R Inferno.
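
As a small sketch of how the one-liner in this answer behaves, the example below runs it on a toy matrix. The data are made up, and two details are additions that the answer does not mention: `drop = FALSE`, which keeps the result a matrix even if only one row survives, and an explicit `is.na()` check, which assumes that rows with a missing value in column 31 should be dropped (without it, an `NA` in the logical index produces a row of `NA`s).

```r
# Hypothetical stand-in for filtro1: 6 rows, 31 columns.
set.seed(1)
m <- matrix(rnorm(6 * 31), nrow = 6, ncol = 31)
m[, 31] <- c(0, 2, 0, NA, 5, 0)

# Build the logical mask once over the whole column.
# Assumption: NA in column 31 means "drop the row".
keep <- !is.na(m[, 31]) & m[, 31] != 0

# Single subsetting step; drop = FALSE preserves the matrix shape.
filtered <- m[keep, , drop = FALSE]

dim(filtered)
```

If column 31 is known to contain no missing values, the plain `filtro1[filtro1[,31] != 0, ]` form is enough.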
