Reduce Sample


I want to train an artificial neural network (ANN). An ANN needs inputs of equal size in the input layer. I have data from several plots, but each plot has a different number of observations. Is there a way to reduce the data from the larger plots so that they match the plot with the fewest observations? Below is an example with three simulated plots.

a <- rnorm(n = 10,mean = 20,sd = 2)
b <- rnorm(n = 15,mean = 23,sd = 2.5)
c <- rbinom(n = 13,prob = 0.8, size = 20)

Result for plots a, b and c.

a [1] 15.05996 21.09127 16.79843 21.89805 19.20879 18.50987 19.13300 19.31934 19.02064 21.08707

b [1] 18.04905 22.54312 26.81323 24.89401 26.89529 21.52093 27.50467 21.75318 21.08223 26.07979 22.29155
[12] 20.54107 24.84216 23.86484 23.06512

c [1] 11 14 18 17 18 18 19 18 17 17 17 17 17

Plot "a" has the fewest observations (n = 10). Is there any way to reduce the number of observations (n) in plots "b" and "c" to match plot "a" (10) without losing how well each sample represents its plot?

For example, my plot "c" has two values below 15 and eleven above. A sample of this plot with three values removed should remain comparable to the original sample.

  • You can reduce randomly with sample(b, 10) and sample(c, 10), which is the best thing to do, but there is no guarantee that the result will have, in the case of c, 2 values below 15. This does not mean that c and c.reduzido are not comparable.

  • Is there a sampling function that shows me how well the random sample represents the actual data? Because if I stratify the sample, I get higher values in one stratum than in another.

  • That is a separate problem. But if you just want to reduce the sample, the subsample drawn from the data you have preserves statistics such as the mean and variance. They will not be exactly equal, but close enough that hypothesis tests do not reject the null hypothesis of equal means and variances.
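The suggestion in the comments above can be sketched roughly as follows: draw a simple random subsample with sample() and then compare it with the full plot using t.test() and var.test() from base R. The seed, the helper check_subsample() and the names b_red and c_red are assumptions made for illustration and are not part of the original question. Because the subsample is drawn from the same data it is compared against, and because "c" is binomial rather than normal, the p-values are only a rough sanity check, not a formal guarantee.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

# Simulated plots, as in the question
b <- rnorm(n = 15, mean = 23, sd = 2.5)
c <- rbinom(n = 13, prob = 0.8, size = 20)

n_min <- 10  # number of observations in the smallest plot, "a"

# Simple random subsamples without replacement
b_red <- sample(b, n_min)
c_red <- sample(c, n_min)

# Hypothetical helper: compare a reduced sample with the full one
check_subsample <- function(full, reduced) {
  list(
    p_mean = t.test(full, reduced)$p.value,   # Welch two-sample t-test on the means
    p_var  = var.test(full, reduced)$p.value  # F test on the ratio of variances
  )
}

res_b <- check_subsample(b, b_red)
res_c <- check_subsample(c, c_red)

print(res_b)
print(res_c)
```

Large p-values mean the tests found no evidence that the mean or variance changed after the reduction. If a draw looks unrepresentative, one option is simply to call sample() again and repeat the check.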

No answers
