5
I am aware that, for example runif(1000,0,3)
generates 1000 random values for a uniform distribution for x between 0 and 3. But how to do this for any probability density function? Every clue is grateful!
5
I am aware that, for example runif(1000,0,3)
generates 1000 random values for a uniform distribution for x between 0 and 3. But how to do this for any probability density function? Every clue is grateful!
5
Every statistical distribution can be defined by a cumulative distribution function F(x).
A well-known result states that if you have a random variable U with uniform distribution in the range (0.1). Then follows the distribution defined by F.
The proof of this result is simple:
So if you can generate random numbers with the uniform distribution and know the cumulative distribution function, you can also generate random numbers according to that distribution.
In the R
, this is already programmed for several distributions: see list that Ailton posted. But, it’s not too complicated to program for another distribution if you can reverse your F(x).
If you only know the cumulative distribution function, you can write as an optimization problem. Define the cumulative distribution ( Here is the cumulative exponential distribution):
dF <- function(x, lambda = 0.5){
1 - exp(-lambda*x)
}
Draw a random number between 0 and 1 with uniform distribution. In my case I obtained:
n <- 0.917915
Then you have to find x
of distribution dF
which is closer to n. This can be done as follows:
op <- optim(runif(1), function(x){
abs(dF(x, 0.5) - n)
}, method = "BFGS")
See that:
> op$par
[1] 5
And that :
> dF(5, 0.5)
[1] 0.917915
This wikipedia article explains more fully what I said here: https://en.wikipedia.org/wiki/Inverse_transform_sampling
This procedure can be repeated for, from a random sample of a random variable with uniform distribution we obtain a sample with cumulative distribution defined on the object dF
. In this case we use the exponential distribution:
dF <- function(x, lambda = 0.5){
1 - exp(-lambda*x)
}
amostra <- runif(1000)
To obtain the sample with exponential distribution it is necessary to find the inverse value of dF
for each random number generated. This can be obtained as follows:
inverter_distribuicao <- function(x){
m <- nlm(function(y){
abs(dF(y, 0.5) - x)
}, p = 1
)
return(m$estimate)
}
amostra_exp <- sapply(amostra, inverter_distribuicao)
See now the histogram of the generated sample:
Browser other questions tagged r mathematics
You are not signed in. Login or sign up in order to post.
I’m no expert, but a quick search seems to indicate that the problem is more mathematical/statistical than programming. Apparently the solution involves manipulating its function so that it can be fed random (uniform) values, and hence generate the values with such a distribution.
– Molx
According to this link, you can get an approximation of this simply by reversing the function and solving to values with uniform distribution (if the function is discrete). Ideally you would post the function so we can try to solve the problem.
– Molx
Each distribution has a specific function in R to obtain samples. For example, to obtain values from a binomial distribution, the function is
rbinom
, for the distribution of Poisson,rpois
. Both in the package Stats which is already installed with R. On the project website (in English) there is a list with various distributions and packages that support them. What is listed in Base Functionality is included in any updated version of R.– Ailton Andrade de Oliveira
If the density function is known, it is worth the comment of @Ailtonandradedeoliveira, because there is a function for each distribution. If it is unknown, but you have a sample, the question becomes complicated (eg. what is the domain of the function? Real? Non-negative Real? Defined range?).
– Bernardo