calculate a probability function in R

Asked

Viewed 121 times

4

Good afternoon.

It’s my first intervention in the stack, I’m a beginner in the R, and my doubts are quite basic.

I need to generate a sample of 1000 observations of a distribution function W.
W is a discrete random variable, which assumes the values of 1 to 6, represented by the sides of a data, given that it is addicted and whose probability function is p1=0.25, p2=0.16, p3=0.18, p4=0.17, p5=0.14, p6=0.10.

How can I write this function in R?

Thank you

2 answers

4

To sample a discrete variable that takes a finite number of values, one can use the base function sample.
Before running the function sample or another function that generates pseudo-random numbers, it is always better to call set.seed.

set.seed(7228)

W <- 1:6
p <- c(0.25, 0.16, 0.18, 0.17, 0.14, 0.10)
w <- sample(W, 1000, replace = TRUE, prob = p)

head(w, n = 20)
#[1] 4 4 1 3 6 3 5 4 1 1 2 4 4 4 4 2 6 6 5 6

See if result proportions are similar to given probabilities.

tw <- table(w)
print(tw/sum(tw), digits = 2)
#w
#   1    2    3    4    5    6 
#0.23 0.16 0.16 0.19 0.14 0.12

They do not seem to be very different. If necessary, one can always run a Kolmogorov-Smirnov test, since both p how the sample proportions come from a continuous distribution.

ks.test(p, tw/sum(tw))

With a p-value = 0.8928 Yes, the distributions are not significantly different.

2


The way you solve this problem is by using the inverse cumulative distribution function of this variable.

You know the probabilities of the variable X assume each of the possible values (of 1 to 6), then it is possible to construct the cumulative distribution function (F(x) = P(X <= x)):

F(x) =

  • 0,25 if X <= 1;
  • 0,41 if X <= 2;
  • 0,59 if X <= 3;
  • 0,76 if X <= 4;
  • 0,90 if X <= 5;
  • 1,00 if X <= 6

Then, using the concept of inverse function F(x)^(-1), generates a probability and has the variable value.

F(x)^(-1):

  • 1 if 0 < p < 0,25;
  • 2 if 0,25 < p < 0,41;
  • 3 if 0,41 < p < 0,59;
  • 4 if 0,59 < p < 0,76;
  • 5 if 0,76 < p < 0,90;
  • 6 if 0,90 < p < 1,00

In the R you generate "a probability" with the function runif(). Thus, it follows code that generates values of its random variable:

gerar.VA <- function(n = 1){
  #P(X=1) = 0,25; P(X=2) = 0,16; P(X=3) = 0,18 ;P(X=4) = 0,17; P(X=5) = 0,14; P(X=6) = 0,10
  resultado <- NULL
  aux <- runif(n)
  for(i in 1:n){
    if(aux[i] < 0.25){
      resultado[i] <- 1
    } else{
      if(aux[i] < 0.25 + 0.16){
        resultado[i] <- 2
      } else{
        if(aux[i] < 0.25 + 0.16 + 0.18){
          resultado[i] <- 3
        } else{
          if(aux[i] < 0.25 + 0.16 + 0.18 + 0.17){
            resultado[i] <- 4
          } else{
            if(aux[i] < 0.25 + 0.16 + 0.18 + 0.17 + 0.14){
              resultado[i] <- 5
            } else{
              resultado[i] <- 6
            }
          }
        }
      }
    }
  }
  return(resultado)
}
gerar.VA(n = 1000)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.