How to sum the values of selected columns of each observation (row) in R?


I have a data set with 10,000 observations (individuals/rows).

I want to add the values of three variables (columns) for each individual (row).


   x1  x2   x3    x4   x5   x6
1   0   1    0    40   45   76
2   1   1    0    31   86   76
3   0   0    1    38   79   82
4   1   0    1    42   81   74


For each individual, I want to add up the values of columns x1, x2 and x3. The sum for each individual (y) should be:

   x1  x2   x3    x4   x5   x6   y
1   0   1    0    40   45   76   1
2   1   1    0    31   86   76   2
3   0   0    1    38   79   82   1
4   1   0    1    42   81   74   2


I’ve tried the functions colSums, sum and apply, but the values come out wrong; they seem to be adding up all the columns.

Use apply, restricted to the columns that matter in your data. Taking the built-in USArrests data set as an example, the command

USArrests[, 1:3]

will display only the first three columns of this dataset. The command

apply(USArrests[, 1:3], 1, sum)

will add the values in columns 1 to 3 for each row. The command

apply(USArrests[, 1:3], 2, sum)

will do something similar, but the sums will be calculated per column.
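To store the row totals back in the data, as the question asks, the same call can be assigned to a new column. Below is a minimal sketch, assuming a data frame rebuilt from the first table in the question; the name `df` is a placeholder, not something from the original post.

```r
# Rebuild the example data (values copied from the first table in the question)
df <- data.frame(
  x1 = c(0, 1, 0, 1),
  x2 = c(1, 1, 0, 0),
  x3 = c(0, 0, 1, 1),
  x4 = c(40, 31, 38, 42),
  x5 = c(45, 86, 79, 81),
  x6 = c(76, 76, 82, 74)
)

# Select the three columns by name, then sum across each row (MARGIN = 1)
df$y <- apply(df[, c("x1", "x2", "x3")], 1, sum)

print(df)
```

Selecting columns by name rather than by position makes it less likely that the sum picks up the wrong columns if the column order changes.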

  • Hi Marcus, thank you for your reply. However, when I applied the function you suggested, the column selection raised an error: banco3$ali <- apply(Usarrests[, 43:57], 1, sum) Error in [.data.frame(Usarrests, 43:57): undefined columns selected

  • Replacing Usarrests with my own data.frame worked! Thank you very much, Marcus!


Since you are only summing, you can also use rowSums directly, for example rowSums(USArrests[, 1:3]); it is usually a little faster than apply.
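A minimal sketch of the rowSums approach, using the built-in USArrests data from the answer above. The variable names are illustrative only, and the `na.rm` line is an optional extra in case your own data has missing values.

```r
# Row totals over the first three columns of the built-in USArrests data
totals_rowsums <- rowSums(USArrests[, 1:3])

# The same totals computed with apply, for comparison
totals_apply <- apply(USArrests[, 1:3], 1, sum)

# Check that both approaches agree
print(all.equal(totals_rowsums, totals_apply))

# Selecting by name avoids errors from column positions that do not exist
totals_named <- rowSums(USArrests[, c("Murder", "Assault", "UrbanPop")])

# na.rm = TRUE ignores missing values instead of returning NA for that row
totals_no_na <- rowSums(USArrests[, 1:3], na.rm = TRUE)
```

With your own data, replace USArrests and the column selection with your data frame and the columns you want to add.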


