Deep down, what you want is to "generate N random things without repeating" (in this case, several different arrays), so an alternative is to follow this idea:
- generate all the possibilities (assuming the total amount is X)
- choose N random indices between zero and X, and take the possibilities that are at those indices

For the first step you can use itertools.product, and for the second step you can use random.sample to generate the indices and itertools.islice to take the element at each index. Something like this:
import numpy as np
from itertools import product, islice
from random import sample

# numbers that can appear in the arrays
nums = range(0, 3)
# size of each array
tamanho_arrays = 16
# generate 20 different arrays
quantidade = 20
# total amount of possible arrays
total_arrays_possiveis = len(nums) ** tamanho_arrays

# sample takes 20 random indices between zero and the total of possible arrays
for indice in sample(range(total_arrays_possiveis), k=quantidade):
    # generate the possibilities (recreated each time, since islice consumes the iterator)
    todos_possiveis = product(nums, repeat=tamanho_arrays)
    # take only the one at the index position
    lista = next(islice(todos_possiveis, indice, indice + 1))
    array = np.array(lista)  # build the np.array
    print(array)
I used itertools because it would be very costly to generate all the possibilities and keep them in memory. In your specific case, there are three possible values (the numbers 0, 1 and 2) in an array with 16 elements, so the total number of possibilities is 3^16 (i.e., 43,046,721 - more than forty-three million possible arrays).
Using itertools, the elements are only produced when needed, saving memory (and also time, because generating everything up front would take a long while).
With sample, I guarantee that there will be no repeated indices, and therefore I guarantee that each array obtained will never be the same as the ones taken before.
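A quick way to check that property of sample (again, an illustrative snippet only):

from random import sample

# sample draws without replacement, so the 20 indices are guaranteed distinct;
# range itself is lazy, so range(3 ** 16) costs almost no memory
indices = sample(range(3 ** 16), k=20)
print(len(indices) == len(set(indices)))  # True: no repeated index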
Storing already-generated arrays in a list and checking whether each new one is already there (as suggested in another answer) is also an option, but it may not scale well if the number of arrays to generate is too large. For example, if you want to generate 10 thousand different arrays, at some point the list will have, say, 9 thousand arrays, and you will have to scan those 9 thousand to check for a repeat; then 9001, then 9002, and so on. It is a very inefficient algorithm (also jokingly called Shlemiel the Painter's Algorithm) - of course, for small values the difference will be tiny, but remember that "for small values, everything is fast". (See the sketch below for a variation that at least avoids the linear scan.)
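If you do go the "store and check" route anyway, a set makes each membership test O(1) instead of a scan of the whole list. A minimal sketch of that idea (the function name and signature are mine, not from the other answer):

import numpy as np
from random import choices

def gerar_diferentes(quantidade, tamanho, nums=(0, 1, 2)):
    # hypothetical helper: rejection sampling with O(1) duplicate checks
    vistos = set()
    while len(vistos) < quantidade:
        candidato = tuple(choices(nums, k=tamanho))  # random tuple, may repeat
        vistos.add(candidato)  # the set silently ignores duplicates
    return [np.array(t) for t in vistos]

for arr in gerar_diferentes(20, 16):
    print(arr)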
And remember that itertools.product generates all possible arrays (including those where all the numbers are equal). But this is not a "bug", since with randint this can also happen (it just has a smaller chance, but being "random", it is not impossible).
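To put a number on that chance: there are only 3 all-equal arrays (all zeros, all ones, all twos) among the 3 ** 16 possibilities:

p = 3 / 3 ** 16  # 3 all-equal arrays out of 43,046,721 possibilities
print(p)         # ~6.97e-08: very unlikely with randint, but not impossible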