As of version 3.6, the random module has the function choices, which lets you define a weight for each value in a population.
from random import choices
population = [1, 7, 15]
weights = [40, 30, 30]
samples = choices(population, weights, k=100)
The above code will generate 100 random values following the defined weights.
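As a quick sanity check, the sketch below draws a much larger sample and compares the observed frequencies with the weights. The weights are relative, so they do not need to sum to 100. The sample size, the seed, and the names rng, frequencies, and expected are arbitrary choices made for illustration only.

```python
from collections import Counter
from random import Random

population = [1, 7, 15]
weights = [40, 30, 30]

# A seeded generator makes the run reproducible; the seed value is arbitrary.
rng = Random(42)
k = 100_000
samples = rng.choices(population, weights, k=k)

# Observed share of each value in the sample
counts = Counter(samples)
frequencies = {value: counts[value] / k for value in population}

# Expected share: each weight divided by the sum of all weights
total_weight = sum(weights)
expected = {value: weight / total_weight
            for value, weight in zip(population, weights)}

for value in population:
    print(value, round(frequencies[value], 3), expected[value])
```

With a sample this large, the observed shares should land close to 0.4, 0.3, and 0.3, although the exact figures depend on the seed.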
Based on: Random draw, but with different odds
The easiest way to implement this is to build a structure that already encodes these probabilities. Since the number 1 must have a probability of 40%, while the numbers 7 and 15 must each have a probability of 30%, you can build a list of 10 elements in total, repeating the number 1 four times and the numbers 7 and 15 three times each.
v = [1, 1, 1, 1, 7, 7, 7, 15, 15, 15]
So, when you call random.choice(v), the desired odds will be met.
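To see why this works, note that random.choice picks every position of the list with equal probability, so the probability of a value is simply its share of the list. A minimal sketch that checks those shares (the names shares and drawn are only illustrative):

```python
from collections import Counter
from random import choice

v = [1, 1, 1, 1, 7, 7, 7, 15, 15, 15]

# Each element of v is equally likely to be picked, so the probability
# of a value is the number of times it appears divided by len(v).
shares = {value: count / len(v) for value, count in Counter(v).items()}
print(shares)  # {1: 0.4, 7: 0.3, 15: 0.3}

# A single weighted draw
drawn = choice(v)
```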
To implement this in a generic and fairly simple way, you can define a function that builds this list according to your needs. For example:
from itertools import chain, repeat
from math import isclose
from random import choice
from typing import Any, List, Tuple

def sample_space(*values: Tuple[Any, float]) -> List:
    # Compare with a tolerance: a sum of floats is often not exactly 1.0
    if not isclose(sum(probability for _, probability in values), 1.0):
        raise ValueError('The sum of the probabilities is not 1.0')
    # Scale every probability by 10 until all of them are whole numbers
    while not all(probability.is_integer() for _, probability in values):
        # round() discards float noise such as 0.07 * 10 == 0.7000000000000001
        values = [(value, round(probability * 10, 10)) for value, probability in values]
    # Repeat each value as many times as its scaled probability
    return list(chain.from_iterable(repeat(value, int(probability)) for value, probability in values))
The function sample_space will generate a list that defines exactly the sample space you want; just pass it, as arguments, tuples containing each desired value and its probability. For the data presented in the question, it would be:
>>> space = sample_space((1, 0.4), (7, 0.3), (15, 0.3))
>>> print(space)
[1, 1, 1, 1, 7, 7, 7, 15, 15, 15]
If you select 100 numbers from this sample space and check how many times each one appears (you can use collections.Counter for this), you will see that the probabilities tend to be followed:
>>> from collections import Counter
>>> samples = [choice(space) for _ in range(100)]
>>> print(Counter(samples))
Counter({1: 40, 7: 32, 15: 28})
It will work for any probabilities, provided they sum to 1.0. For example, a sample space that is 99% True and 1% False would be:
>>> space = sample_space((True, 0.99), (False, 0.01))
However, this would generate a list of 100 values, of which 99 are True and 1 is False; therefore, given how simple this solution is, take care not to generate lists so large that they affect the application's memory. The more decimal places the probabilities have, the more elements the list will need.
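If memory becomes a concern, one alternative is to skip the expanded list altogether and pick a value by locating a random point on the cumulative probabilities. This is a different technique from the list-based approach above, shown only as a sketch; the function name weighted_choice is hypothetical. The choices function with its weights parameter also avoids building such a list.

```python
from bisect import bisect
from itertools import accumulate
from random import random

def weighted_choice(pairs):
    """Pick one value from (value, probability) pairs without expanding a list."""
    values = [value for value, _ in pairs]
    # Running totals, e.g. [0.99, 1.0] for probabilities 0.99 and 0.01
    cumulative = list(accumulate(probability for _, probability in pairs))
    # Scale by the final total so small rounding errors in the sum do not matter
    point = random() * cumulative[-1]
    # bisect returns the index of the first running total greater than point
    index = bisect(cumulative, point)
    # Guard against the rare case where rounding pushes point up to the total
    return values[min(index, len(values) - 1)]

pairs = [(True, 0.99), (False, 0.01)]
draws = [weighted_choice(pairs) for _ in range(1000)]
```

Memory use here grows with the number of distinct values, not with the number of decimal places in the probabilities.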
It would look great with an example of the code that generates the list with the desired distributions.
– jsbueno
@jsbueno indeed, I will provide one.
– Woss