Counting increments and decreases in a data frame

I have a data frame in which each column holds the number of events in a period. For each row, I need to count how many times the value increased and how many times it decreased from one period to the next, and I could not work out how to do it:

p1<- rep(2:11)
p2<- rep(3:12)
p3<- rep(1:10)
p4<- rep(4:13)

base<- cbind(p1,p2,p3,p4)

base
      p1 p2 p3 p4
 [1,]  2  3  1  4
 [2,]  3  4  2  5
 [3,]  4  5  3  6
 [4,]  5  6  4  7
 [5,]  6  7  5  8
 [6,]  7  8  6  9
 [7,]  8  9  7 10
 [8,]  9 10  8 11
 [9,] 10 11  9 12
[10,] 11 12 10 13

expected:

       p1 p2 p3 p4 in dc
  [1,]  2  3  1  4  2  1
  [2,]  3  4  2  5  2  1
  [3,]  4  5  3  6  2  1
  [4,]  5  6  4  7  2  1
  [5,]  6  7  5  8  2  1
  [6,]  7  8  6  9  2  1
  [7,]  8  9  7 10  2  1
  [8,]  9 10  8 11  2  1
  [9,] 10 11  9 12  2  1
 [10,] 11 12 10 13  2  1

That is, taking row 9 as an example:

[9,] 10 11  9 12  2  1

Comparing p2 with p1 there was an increment (10 to 11), for period 3 there was a decrease (11 to 9), and for period 4 there was another increment (9 to 12), totaling 2 increments and 1 decrease.

The idea is to run this on a set of 500 variables observed over 10 periods.

  • Could you explain more clearly what the increment and the decrease in question would be?

  • I agree that it was not clear, thanks for the feedback. I have edited the question; let me know if it is clearer now.

  • What should happen if there is no increment? For example, if row 9 were originally 10 11 11 12, what should the result be? 10 11 11 12 2 0? Or something else?

  • Correct. Since there was no change, it counts as 0; I only need to count the increments within each row of the data frame.

1 answer

First I'll use the function diff to calculate the difference between each pair of consecutive columns. Because apply returns each row's result as a column, the result must be transposed to have the same layout as base:

diferencas <- t(apply(base, 1, diff))
diferencas
      p2 p3 p4
 [1,]  1 -2  3
 [2,]  1 -2  3
 [3,]  1 -2  3
 [4,]  1 -2  3
 [5,]  1 -2  3
 [6,]  1 -2  3
 [7,]  1 -2  3
 [8,]  1 -2  3
 [9,]  1 -2  3
[10,]  1 -2  3
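
For larger inputs, the same differences can probably be obtained without apply by subtracting two shifted blocks of columns. This is only a sketch, assuming base is a numeric matrix like the one in the question; the name diferencas_vec is hypothetical and not part of the answer:

```r
# Rebuild the example matrix from the question
base <- cbind(p1 = 2:11, p2 = 3:12, p3 = 1:10, p4 = 4:13)

# Columns 2..n minus columns 1..(n-1), done in one vectorized subtraction
diferencas_vec <- base[, -1, drop = FALSE] - base[, -ncol(base), drop = FALSE]

# Compare against the apply/diff version used in the answer
all(diferencas_vec == t(apply(base, 1, diff)))
```

The drop = FALSE arguments keep the result a matrix even when base has a single row or only two columns.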

With that done, I count how many increments there are per row. That is, I count how many values of diferencas are greater than 0:

# `in` is a reserved word in R and cannot be assigned to, so the vector is named inc
inc <- apply(t(apply(diferencas, 1, function(x) x > 0)), 1, sum)

Analogously, I count how many decreases there are per row. That is, I count how many values of diferencas are less than 0:

dc <- apply(t(apply(diferencas, 1, function(x) x < 0)), 1, sum)

Note that this works because R treats TRUE as 1 (and FALSE as 0) when summing. Now I just need to join the results, naming the first new column "in" as in the expected output:

cbind(base, "in" = inc, dc)
      p1 p2 p3 p4 in dc
 [1,]  2  3  1  4  2  1
 [2,]  3  4  2  5  2  1
 [3,]  4  5  3  6  2  1
 [4,]  5  6  4  7  2  1
 [5,]  6  7  5  8  2  1
 [6,]  7  8  6  9  2  1
 [7,]  8  9  7 10  2  1
 [8,]  9 10  8 11  2  1
 [9,] 10 11  9 12  2  1
[10,] 11 12 10 13  2  1
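
The two counting steps can also be written with rowSums, which sums a logical matrix directly. The sketch below wraps this in a helper named contar_variacoes (a hypothetical name, not from the answer) and checks the tie case raised in the comments, where 10 11 11 12 should give 2 increments and 0 decreases:

```r
# Hypothetical helper: appends increment and decrease counts to a numeric matrix
contar_variacoes <- function(m) {
  # Differences between consecutive columns
  d <- m[, -1, drop = FALSE] - m[, -ncol(m), drop = FALSE]
  inc <- rowSums(d > 0)  # TRUE counts as 1
  dc  <- rowSums(d < 0)  # zero differences fall in neither count
  cbind(m, "in" = inc, dc)
}

base <- cbind(p1 = 2:11, p2 = 3:12, p3 = 1:10, p4 = 4:13)
resultado <- contar_variacoes(base)

# Tie case from the comments: a repeated value is neither an increment nor a decrease
empate <- contar_variacoes(rbind(c(10, 11, 11, 12)))
```

Here resultado[, "in"] and resultado[, "dc"] hold the per-row counts, and empate holds the counts for the single tie row.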
