Cumulative sum across the columns of a matrix in R

7

I have the following situation.

Given a numeric matrix such as:

Temp <- matrix(runif(25, 10, 100),nrow=5,ncol=5)

V1    V2    V3    V4    V5
11    34    45    54    55
16    21    45    75    61
88    49    85    21    22
12    13    12    11    10
69    45    75    78    89

How can I transform this matrix into one in which each column holds the cumulative sum of the columns up to and including it, computed row by row? The result would be the following:

V1    V2    V3    V4    V5
11    45    90    144   199
16    37    82    157   218
88    137   222   243   265
12    25    37    48    58
69    114   189   267   356

I achieved the goal using a for loop, but I believe there should be a more efficient way to do it, since I am working with a matrix of 2580 rows by 253 columns and the loop takes a while to produce the result:

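Temp <- matrix(runif(25, 10, 100),nrow=5,ncol=5)
Temp <- round(Temp,0)
sum_matrix <- matrix(0,nrow=nrow(Temp),ncol=ncol(Temp))
sum_matrix[,1] <- Temp[,1]
# The loop walks over columns, so the upper bound is ncol(Temp), not nrow(Temp);
# the two only coincide for a square matrix such as this 5 x 5 example
for (n in 2:ncol(Temp)) {
    sum_matrix[,n] <- sum_matrix[,n-1] + Temp[,n]
}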

2 answers

6


You can use the cumsum function to obtain the cumulative sum of the elements of your matrix. Combining apply with t (transpose) gives the result you need:

temp <- t(matrix(c(11, 34, 45, 54, 55,
                   16, 21, 45, 75, 61,
                   88, 49, 85, 21, 22,
                   12, 13, 12, 11, 10,
                   69, 45, 75, 78, 89)
    ,nrow=5,ncol=5))
temp2 <- t(apply(temp, 1, cumsum))
temp2

     [,1] [,2] [,3] [,4] [,5]
[1,]   11   45   90  144  199
[2,]   16   37   82  157  218
[3,]   88  137  222  243  265
[4,]   12   25   37   48   58
[5,]   69  114  189  267  356
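The t() is needed because apply(temp, 1, cumsum) returns each row's result as a column. As a rough check at the size mentioned in the question, here is a minimal sketch comparing the loop with the apply version. The name big, the seed, and the random data are assumptions made only for illustration, not the asker's actual data:

```r
# Stand-in for the 2580 x 253 matrix from the question (random data, assumed)
set.seed(42)  # arbitrary seed, only for reproducibility
big <- matrix(round(runif(2580 * 253, 10, 100)), nrow = 2580, ncol = 253)

# Loop version: accumulate column by column (bounded by ncol, not nrow)
loop_cumsum <- big
for (n in 2:ncol(big)) {
  loop_cumsum[, n] <- loop_cumsum[, n - 1] + big[, n]
}

# apply() version: cumsum over each row; apply() hands back each row's
# result as a column, so t() restores the original orientation
apply_cumsum <- t(apply(big, 1, cumsum))

# Both approaches should agree in shape and in values
stopifnot(identical(dim(apply_cumsum), dim(big)))
stopifnot(isTRUE(all.equal(loop_cumsum, apply_cumsum)))
```

To compare run times on your own machine, each version can be wrapped in system.time(); no timings are given here because they depend on the hardware and the R version.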

1

A way using the packages dplyr and tidyr would be the following:

> library(tidyr)
> library(dplyr)
> 
> temp <- matrix(1:25, 5, 5)
> temp <- data.frame(id = 1:5, temp)
> temp
  id X1 X2 X3 X4 X5
1  1  1  6 11 16 21
2  2  2  7 12 17 22
3  3  3  8 13 18 23
4  4  4  9 14 19 24
5  5  5 10 15 20 25
> temp %>%
+   gather(variavel, valor, -id) %>%
+   group_by(id) %>%
+   arrange(variavel) %>%
+   mutate(valor = cumsum(valor)) %>%
+   spread(variavel, valor)
Source: local data frame [5 x 6]

     id    X1    X2    X3    X4    X5
  (int) (int) (int) (int) (int) (int)
1     1     1     7    18    34    55
2     2     2     9    21    38    60
3     3     3    11    24    42    65
4     4     4    13    27    46    70
5     5     5    15    30    50    75

To use this approach, your data must be stored in a data.frame.

Note that if this kind of manipulation makes sense for your dataset, the data is probably not in tidy format. To understand why the tidy format is a good fit for R, it is well worth reading this article by Hadley Wickham.
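If the data is already in a data.frame and you would rather not reshape it, a base R alternative is possible. The following is a minimal sketch using the same toy data as above; the names df, value_cols, running and result are introduced here only for illustration:

```r
# Same toy data as in the answer: an id column plus X1 ... X5
df <- data.frame(id = 1:5, matrix(1:25, 5, 5))

# Columns to accumulate (everything except the id)
value_cols <- setdiff(names(df), "id")

# Reduce(..., accumulate = TRUE) returns the running column sums as a list:
# X1, X1 + X2, X1 + X2 + X3, and so on
running <- Reduce(`+`, df[value_cols], accumulate = TRUE)

# Write the running sums back into a copy of the data.frame
result <- df
result[value_cols] <- running
result
```

For this toy data the result matches the dplyr/tidyr output shown above (for example, the first row becomes 1, 7, 18, 34, 55), without the round trip through long format.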
