Let X_1, X_2, ..., X_n be a sequence of numbers, and let X = X_1 + X_2 + ... + X_n. If I divide each X_i by X, the sum X_1/X + X_2/X + ... + X_n/X always equals 1 (provided X is not zero). This is a kind of normalization. If I multiply both sides of this equality by 200, the sum becomes 200.
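Written out as a worked equation (assuming X is not zero), the identity and its version scaled by 200 are:

```latex
\[
\frac{X_1}{X} + \frac{X_2}{X} + \dots + \frac{X_n}{X}
  = \frac{X_1 + X_2 + \dots + X_n}{X} = \frac{X}{X} = 1
\quad\Longrightarrow\quad
200\,\frac{X_1}{X} + 200\,\frac{X_2}{X} + \dots + 200\,\frac{X_n}{X} = 200 .
\]
```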
So just apply this idea in R to get the desired result. I created a function called amostra that does this.
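amostra <- function(x=1:20, size=20, replace=TRUE, limit=200){
  # draw the raw sample from the allowed values
  estat <- sample(x, size, replace=replace)
  # rescale so the values add up to about limit, then round to integers
  estat <- round(estat/sum(estat)*limit)
  if (sum(estat) == limit){
    return(estat)
  } else {
    # rounding shifted the total: recompute the last element as the remainder
    return(c(estat[1:(size-1)], limit-sum(estat[1:(size-1)])))
  }
}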
x <- amostra(1:20, 20, limit=200)
x
[1] 4 12 12 13 12 13 2 12 11 2 14 7 12 17 12 3 11 5 11 15
sum(x)
[1] 200
This function has 4 arguments:
x: the possible values the sample can take (the default is the integers 1 to 20)
size: the size of the sample to be created (the default is 20)
replace: indicates whether the sampling is done with replacement (the default is with replacement)
limit: the total the sample must add up to (the default is 200)
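As an illustration of the arguments, here is a small sketch that calls the function with non-default values. The function body is repeated so the snippet runs on its own. The seed and the specific argument values are arbitrary choices for this example, not part of the original answer.

```r
# same function as above, repeated so this snippet is self-contained
amostra <- function(x = 1:20, size = 20, replace = TRUE, limit = 200) {
  estat <- sample(x, size, replace = replace)
  estat <- round(estat / sum(estat) * limit)
  if (sum(estat) == limit) {
    return(estat)
  } else {
    return(c(estat[1:(size - 1)], limit - sum(estat[1:(size - 1)])))
  }
}

set.seed(42)  # arbitrary seed, only to make the draw reproducible

# 5 values drawn from 1 to 10 (with replacement), rescaled to add up to 50
y <- amostra(x = 1:10, size = 5, limit = 50)
length(y)  # 5
sum(y)     # 50

# 10 values drawn from 1 to 30 without replacement, rescaled to add up to 100
z <- amostra(x = 1:30, size = 10, replace = FALSE, limit = 100)
sum(z)     # 100
```

Note that x only controls the raw draw: because the values are rescaled afterwards, the numbers returned are not guaranteed to lie inside x.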
Due to rounding problems, I used a small trick in the algorithm. It draws n elements, rescales and rounds them, and tests whether their sum is equal to limit. If it is, it returns that sample. If not, the last element is replaced by limit-sum(estat[1:(size-1)]), which is the difference between the target sum and the sum of the first n-1 elements of the sample. Without this step, there would be no guarantee that the final sum of the elements would be equal to limit.
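A minimal sketch of the rounding problem, using three hand-picked values instead of a random draw (the names raw, scaled and fixed are only for this illustration):

```r
limit <- 200
raw <- c(1, 1, 1)                        # hand-picked values, not a random draw

# each value becomes 200/3 = 66.67, which rounds up to 67
scaled <- round(raw / sum(raw) * limit)
sum(scaled)                              # 201, not 200

# same correction as in amostra: recompute the last element as the remainder
n <- length(scaled)
fixed <- c(scaled[1:(n - 1)], limit - sum(scaled[1:(n - 1)]))
fixed                                    # 67 67 66
sum(fixed)                               # 200
```

One consequence of this design is that the last element is not the result of the random draw whenever the correction is applied, so it can differ from what its rounded value would have been.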
The table command lists the distinct values in order, together with their respective frequencies:
table(x)
x
 2  3  4  5  7 11 12 13 14 15 17
 2  1  1  1  1  3  6  2  1  1  1
From this it is finally possible to calculate the desired statistics, by creating a data frame with the results:
as.data.frame(table(x))
    x Freq
1   2    2
2   3    1
3   4    1
4   5    1
5   7    1
6  11    3
7  12    6
8  13    2
9  14    1
10 15    1
11 17    1
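As a sketch of how statistics could be computed from this data frame, the example below rebuilds it from the sample shown earlier and adds columns one at a time. The column names xi and fi follow the notation used in the comments. The extra columns fr and xifi and the variable mean_x are names chosen only for this illustration.

```r
# the sample printed earlier in this answer
x <- c(4, 12, 12, 13, 12, 13, 2, 12, 11, 2,
       14, 7, 12, 17, 12, 3, 11, 5, 11, 15)

# frequency table as a data frame; keep the values as text, not factors
freq <- as.data.frame(table(x), stringsAsFactors = FALSE)
names(freq) <- c("xi", "fi")
freq$xi <- as.numeric(freq$xi)        # convert the values back to numbers

# build further columns step by step
freq$fr   <- freq$fi / sum(freq$fi)   # relative frequency
freq$xifi <- freq$xi * freq$fi        # value times frequency

# mean computed from the frequency table: 200 / 20
mean_x <- sum(freq$xifi) / sum(freq$fi)
mean_x                                # 10
freq
```

Converting xi to numeric matters here: by default as.data.frame(table(x)) stores the values as a factor, and arithmetic on a factor does not use the original numbers.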
Marcus, thank you very much! Very good structure and explanation. I will keep the command for later use. My idea was to use the 'limit' only to make the calculation of the mean and the other statistics easier. However, I need to get the frequency data (as shown in the 2nd line of the table(x) output) as a new column. How can I capture these results and put them into a data.frame as a column? For example:
xi fi
2  2
3  1
– Rodney Dieguez
See the edit I made.
– Marcus Nunes
Marcus, perfect! I want to build a complete statistical table for students, step by step, to keep a record of the calculations performed, assembling it column by column, given the values of x and their frequencies.
– Rodney Dieguez