Combination of a numerical list with limiter (Python)

Asked

Viewed 67 times

1

I am generating all possible combinations within a list, but I have not found any that would satisfy all my needs, being them:

  • Result in list format;
  • Lists with repeated items are valid;
  • Completely equal lists are not valid;
  • There should be a maximum size limiter for list generation;
  • The sum of the list items cannot be greater than tamax;
  • Code execution time is important;

I made the code below, but would like to know if there is a more refined/pythonic way of writing this same code.

OBS: I used limiters to decrease the number of generator iterations, because I had to make combinations with a list of 30 numbers, and without the limiters the code never finished running.

import numpy as np
from math import floor
from itertools import product

tamax = 12 #valor máximo da soma dos itens na lista
tamin = tamax * 0.95 #valor mínimo da soma dos itens na lista
tamanhos = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] = lista para combinações
tamanhos.sort()
limites = [] #armazena os limites para cada número de iterações
iteracoes = [] #armazena o número máximo de iterações que os elementos da lista vão gerar se combinados. É pareado com a lista limites
check = [] #usado como checagem para adicionar itens na lista limites
tamcomb = [] #armazena todas as combinações de tamanhos
comb = () #combinação do gerador aleatório


def sumall(tupla): #retorna o somatório do itens da tupla gerada
  tam = len(tupla)
  lista1 = list(tupla[0:tam])
  return np.sum(lista1)


def cresc(tupla): #transforma a tupla em uma lista em ordem crescente
  tam = len(tupla)
  lista = list(tupla[0:tam])
  lista.sort()
  return lista

for i in tamanhos:
  iteracao = floor(((tamax-i)/min(tamanhos))+1) #retorna o número máximo de iterações que podem ocorrer nas combinações entre os itens da lista
  if iteracao not in iteracoes: #adicionar o número à lista iteracoes
    iteracoes.append(iteracao)
  for j in range(0, (max(iteracoes)+1)): #adiciona na lista limites o tamanho referente ao número de iterações calculado
    if j == iteracao:
      if iteracao not in check:
        limites.append(i)
        check.append(iteracao)
limites.sort(reverse=True)
iteracoes.sort()

print("iteracoes = ", iteracoes)
print("limites = ", limites)

for i in range(1, (max(iteracoes)+1))):
  limite = 0
  for j in iteracoes:
    if i <= j:
      if limites[iteracoes.index(j)] > limite:
        limite = limites[iteracoes.index(j)]
  tamanhos_validos = [tamanho for tamanho in tamanhos if tamanho <= limite] # gera a lista com os tamanhos que serão utilizados de acordo com o número de iterações
  gerador = product(tamanhos_validos, repeat=i) # repeat é utilizado para especificar o numero de item que as combinações irão possuir
  for comb in gerador: #gera as combinações com base no tamax e tamin
    if tamax >= sumall(comb) >= tamin:
      comb = cresc(comb)
      if comb not in tamcomb: #se a combinação ainda não existe em tambcom, é adicionada na lista
        tamcomb.append(comb)
print("combinações validas = "tamcomb)

Output:

iteracoes = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
limites = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
combinações validas = [[2, 10], [3, 9], [4, 8], [5, 7], [6, 6], [1, 1, 10], [1, 2, 9], [1, 3, 8], [1, 4, 7], [1, 5, 6], [2, 2, 8], [2, 3, 7], [2, 4, 6], [2, 5, 5], [3, 3, 6], [3, 4, 5], [4, 4, 4], [1, 1, 1, 9], [1, 1, 2, 8], [1, 1, 3, 7], [1, 1, 4, 6], [1, 1, 5, 5], [1, 2, 2, 7], [1, 2, 3, 6], [1, 2, 4, 5], [1, 3, 3, 5], [1, 3, 4, 4], [2, 2, 2, 6], [2, 2, 3, 5], [2, 2, 4, 4], [2, 3, 3, 4], [3, 3, 3, 3], [1, 1, 1, 1, 8], [1, 1, 1, 2, 7], [1, 1, 1, 3, 6], [1, 1, 1, 4, 5], [1, 1, 2, 2, 6], [1, 1, 2, 3, 5], [1, 1, 2, 4, 4], [1, 1, 3, 3, 4], [1, 2, 2, 2, 5], [1, 2, 2, 3, 4], [1, 2, 3, 3, 3], [2, 2, 2, 2, 4], [2, 2, 2, 3, 3], [1, 1, 1, 1, 1, 7], [1, 1, 1, 1, 2, 6], [1, 1, 1, 1, 3, 5], [1, 1, 1, 1, 4, 4], [1, 1, 1, 2, 2, 5], [1, 1, 1, 2, 3, 4], [1, 1, 1, 3, 3, 3], [1, 1, 2, 2, 2, 4], [1, 1, 2, 2, 3, 3], [1, 2, 2, 2, 2, 3], [2, 2, 2, 2, 2, 2], [1, 1, 1, 1, 1, 1, 6], [1, 1, 1, 1, 1, 2, 5], [1, 1, 1, 1, 1, 3, 4], [1, 1, 1, 1, 2, 2, 4], [1, 1, 1, 1, 2, 3, 3], [1, 1, 1, 2, 2, 2, 3], [1, 1, 2, 2, 2, 2, 2], [1, 1, 1, 1, 1, 1, 1, 5], [1, 1, 1, 1, 1, 1, 2, 4], [1, 1, 1, 1, 1, 1, 3, 3], [1, 1, 1, 1, 1, 2, 2, 3], [1, 1, 1, 1, 2, 2, 2, 2], [1, 1, 1, 1, 1, 1, 1, 1, 4], [1, 1, 1, 1, 1, 1, 1, 2, 3], [1, 1, 1, 1, 1, 1, 2, 2, 2], [1, 1, 1, 1, 1, 1, 1, 1, 1, 3], [1, 1, 1, 1, 1, 1, 1, 1, 2, 2], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]

This code took about 9s to run on Google Colab.

  • I don’t know if I understood all the criteria exactly, but finally, a suggestion: https://ideone.com/0NXpxz

  • Thanks for the help, I updated to try to explain better, I hope I got it. I’ve already updated some parts according to what I saw in your code, thank you.

1 answer

1

You can simplify this code a lot.

From what I understand, the idea is to search all combinations, but the order of the elements doesn’t seem to matter. That is, if you already have [2, 10] in the results, so you don’t have to [10, 2]. Therefore, a better option is to exchange itertools.product for itertools.combinations_with_replacement.

This alone significantly reduces the number of possibilities to be tested. In this case, your list has 10 elements (the numbers from 1 to 10), then product(lista, repeat=n) results in 10n different elements to be tested. Now combinations_with_replacement(lista, n) results in (9 + n)! / n! / 9!. For example, for n equal to 6, product generates 1 million possibilities (including the repeated ones that are not necessary, as already mentioned), while combinations_with_replacement generates only 5005 combinations (that is, almost 200 times less possibilities). To n equal to 10, the difference is even greater: 10 billion product versus 92378 of combinations_with_replacement (about 100 thousand times less possibilities).

This buys you a lot of time. And since the values are already sorted, the combinations will also be (first it generates from 1 to 10, then (1, 1), (1, 2), etc., after (2, 1), (2, 2) and so on). So you don’t have to sort the tuples, which means more time.

Anyway, I’d be like this:

from itertools import combinations_with_replacement

valores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] # valores usados nas combinações
valores.sort() # não precisaria nesse caso, pois vc já criou a lista ordenada
tamax = 12 # valor máximo da soma dos itens na lista
tamin = tamax * 0.95 # valor mínimo da soma dos itens na lista

# calcula as iteracoes
iteracoes = []
etc... (confesso que não entendi essa parte, mas usei o seu código, e a mesma lista abaixo)


results = []
for i in range(1, max(iteracoes) + 1):
    for comb in combinations_with_replacement(valores, i):
        if tamin <= sum(comb) <= tamax:
            results.append(list(comb)) # transforma a tupla em lista e adiciona nos resultados

print(results)

Note that you don’t need a function to calculate the sum, because Python already has it ready: sum receives any iterable, including a tuple (do not need to turn it into a list, gaining more time).

And in the results, to convert the tuple into a list, just use list passing the straight tuple (create the Slices, as you did with [0:tam], In addition to spending more time, it’s unnecessary in this case where you want to take all the elements - it would make sense if you just wanted to take a piece of the tuple). And as now I no longer return the repeated combinations, nor need to check if it already exists in the results list.

Rodei no Colab and took about 0.3 seconds.


I also ran a test on module timeit:

def com_permutations():
    # seu código com permutations...

def com_combinations():
    # meu código usando combinations_with_replacement...

from timeit import timeit

# executa 1 vez cada teste
params = { 'number' : 1, 'globals': globals() }
# imprime os tempos em segundos
print(timeit('com_permutations()', **params))
print(timeit('com_combinations()', **params))

Times can vary from one machine to another, in mine were (in seconds):

5.775390355000127
0.13096024300102727

As you can see, a significant gain (from 5.7 to 0.13 seconds, about 40 times faster).

Browser other questions tagged

You are not signed in. Login or sign up in order to post.