Unique numbering in data frame

I have a dataset with consumption per animal and per day. However, the animals entered the experiment on different days. First, I need every animal's day count in the experiment to start at 1. Part of the data follows:

Animal  Dia Consumo
5       9   2533.96
5       10  2329.06
5       11  2943.79
5       12  3361.62
5       13  2890.82
5       14  2538.98
5       15  2978.81
5       16  3038.76
5       17  3038.76
6       10  2314.82
6       11  2434.75
6       12  2643.99
6       13  2320.58
6       14  2439.56
6       15  2139.6
6       16  2459.54
6       17  2339.59

After renumbering the days this way for all animals, I need to calculate the mean and standard deviation of consumption across all animals for each day.

  • Have you tried anything? What? Put your code here

  • Let me get this straight: animal 5 starts the experiment on day 9, animal 6 on day 10. So does that day become day 1 of the count? If so, do you want the mean and standard deviation for day 1 of each animal's count, then day 2 of the count, and so on? The mean for day 1 would then be (2533.96 + 2314.82)/2 == 2424.390. Is that right?

  • Yes, that's right.

  • I already know how to calculate the mean and standard deviation; I only need to make each animal's count always start at 1. For example, animal 5 would start at 1 instead of 9.

1 answer

To solve the problem I will apply the split-apply-combine strategy several times, one instruction at a time.

First, we create the column Contagem with the function ave.

dados$Contagem <- ave(dados$Dia, dados$Animal, FUN = function(x) x - x[1] + 1)
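
Note that x[1] is the first row of each animal, so the line above relies on the rows being sorted by Dia within each animal, as they are in the posted data. If that cannot be guaranteed, a possible variant (a sketch, not part of the original answer) subtracts the group minimum instead. The small data frame below is a deliberately shuffled subset of the question's data, built only for illustration.

```r
# Illustrative subset of the question's data, with rows out of order on purpose
dados <- data.frame(
  Animal  = c(5L, 5L, 5L, 6L, 6L),
  Dia     = c(10L, 9L, 11L, 11L, 10L),
  Consumo = c(2329.06, 2533.96, 2943.79, 2434.75, 2314.82)
)

# Subtract each animal's earliest day, so the count starts at 1
# regardless of row order
dados$Contagem <- ave(dados$Dia, dados$Animal,
                      FUN = function(x) x - min(x) + 1L)

dados$Contagem
```

Both versions give the same result whenever the data are already sorted by day within each animal.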

Then we compute the mean and standard deviation with tapply.

tapply(dados$Consumo, dados$Contagem, mean)
tapply(dados$Consumo, dados$Contagem, sd)
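
If a single table with one row per day of the count is more convenient, the two tapply results can be combined into a data frame. This is only a sketch: the name resumo and the reduced data set (the first days of each animal from the question) are chosen here for illustration.

```r
# First days of each animal, taken from the question's data
dados <- data.frame(
  Animal  = c(5L, 5L, 5L, 6L, 6L),
  Dia     = c(9L, 10L, 11L, 10L, 11L),
  Consumo = c(2533.96, 2329.06, 2943.79, 2314.82, 2434.75)
)

# Day count starting at 1 for each animal (rows sorted by Dia within Animal)
dados$Contagem <- ave(dados$Dia, dados$Animal,
                      FUN = function(x) x - x[1] + 1)

# One value per day of the count
media  <- tapply(dados$Consumo, dados$Contagem, mean)
desvio <- tapply(dados$Consumo, dados$Contagem, sd)

# Combine both statistics into a single summary table
resumo <- data.frame(
  Contagem     = as.integer(names(media)),
  Media        = as.vector(media),
  DesvioPadrao = as.vector(desvio)
)

resumo
```

The mean for day 1 of the count is 2424.39, matching the value worked out in the comments. When a day of the count has only one animal, sd returns NA; in the full posted data this happens for day 9 of the count, which only animal 5 reaches.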

New columns with these statistics can also be added to the data.frame; for this we use the function ave once again. Although both ave and tapply apply the same function to each group, tapply returns only one value per group, while ave returns one value per row of the data. (The values are the same; with ave they are repeated on every row of the same group.)

dados$Media <- ave(dados$Consumo, dados$Contagem, FUN = mean)
dados$DesvioPadrao <- ave(dados$Consumo, dados$Contagem, FUN = sd)

Data.

dados <-
structure(list(Animal = c(5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L), Dia = c(9L, 10L, 11L, 12L, 13L, 
14L, 15L, 16L, 17L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L), 
    Consumo = c(2533.96, 2329.06, 2943.79, 3361.62, 2890.82, 
    2538.98, 2978.81, 3038.76, 3038.76, 2314.82, 2434.75, 2643.99, 
    2320.58, 2439.56, 2139.6, 2459.54, 2339.59)), .Names = c("Animal", 
"Dia", "Consumo"), class = "data.frame", row.names = c(NA, -17L
))
