How to sum the values of selected columns of each observation (row) in R?

4

I have a data frame with 10,000 observations (individuals/rows).

I want to add the values of three variables (columns) for each individual (row).

Ex.:

   x1  x2   x3    x4   x5   x6
1   0   1    0    40   45   76
2   1   1    0    31   86   76
3   0   0    1    38   79   82
4   1   0    1    42   81   74

...

For each individual, I want to add up the values of columns x1, x2 and x3. So the sum for each individual should be (y):

   x1  x2   x3    x4   x5   x6   y
1   0   1    0    40   45   76   1
2   1   1    0    31   86   76   2
3   0   0    1    38   79   82   1
4   1   0    1    42   81   74   2

...

I’ve tried the functions colSums, sum and apply, but they seem to add up all the columns, because the resulting values are wrong.

2 answers

6


Use the function apply, restricted to the columns that matter in your data frame. For example, using the built-in USArrests dataset, the command

USArrests[, 1:3]

will display only the first three columns of this dataset. The command

apply(USArrests[, 1:3], 1, sum)

will add the values found in columns 1 to 3 for each row. The command

apply(USArrests[, 1:3], 2, sum)

will do something similar, but the sum will be calculated per column: the second argument of apply selects rows (1) or columns (2).
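Applied to the data in the question, the same idea could look like the sketch below. The data frame name `df` is an assumption (the question does not name its object), and only the four rows shown in the first table of the question are reproduced.

```r
# Hypothetical data frame holding the four rows shown in the question
df <- data.frame(
  x1 = c(0, 1, 0, 1),
  x2 = c(1, 1, 0, 0),
  x3 = c(0, 0, 1, 1),
  x4 = c(40, 31, 38, 42),
  x5 = c(45, 86, 79, 81),
  x6 = c(76, 76, 82, 74)
)

# MARGIN = 1 applies sum() to each row of the selected columns only;
# x4, x5 and x6 are left out of the sum
df$y <- apply(df[, c("x1", "x2", "x3")], 1, sum)

df
```

Selecting the columns by name rather than by position keeps the code working if the column order changes.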

  • Hi Marcus, thank you for your reply. However, when I applied the function you suggested, the column selection raised an error: banco3$ali <- apply(Usarrests[, 43:57], 1, sum) Error in `[.data.frame`(Usarrests, , 43:57) : undefined columns selected

  • Replacing Usarrests with my own data.frame worked! Thank you very much, Marcus!
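The error in the comments most likely arises because USArrests has only four columns, so the range 43:57 asks for columns that do not exist in it. A minimal sketch of how one might check a selection before using it follows; the variable names `wanted` and `cols` are illustrative.

```r
# USArrests ships with R and has 4 columns, so 43:57 is out of range
ncol(USArrests)

wanted <- 43:57
# TRUE only if every requested column index exists in the data frame
all(wanted <= ncol(USArrests))

# Selecting by name avoids depending on column positions
cols <- c("Murder", "Assault", "UrbanPop")
stopifnot(all(cols %in% names(USArrests)))
head(apply(USArrests[, cols], 1, sum))
```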

6

Since you are computing sums, you can also use rowSums directly, which is usually a little faster than apply:

rowSums(USArrests[,1:3])
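For the columns in the question, this could be written as in the sketch below. The name `df` is hypothetical, and only the x1, x2 and x3 values of the four rows shown in the question are reproduced.

```r
# Hypothetical data frame with the first three columns from the question
df <- data.frame(
  x1 = c(0, 1, 0, 1),
  x2 = c(1, 1, 0, 0),
  x3 = c(0, 0, 1, 1)
)

# Row-wise sum of the selected columns, stored as a new column y
df$y <- rowSums(df[, c("x1", "x2", "x3")])

# Compare with the apply() version on the same columns
all(df$y == apply(df[, c("x1", "x2", "x3")], 1, sum))

# na.rm = TRUE skips missing values instead of returning NA for that row
rowSums(df[, c("x1", "x2", "x3")], na.rm = TRUE)
```

Whether `na.rm = TRUE` is appropriate depends on how missing values should be treated in the data.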
