How to improve the time cost of the algorithm - Python

Asked

Viewed 99 times

0

Good night, you guys! I have the following problem in this algorithm, its cost is still high and I would like to know if there is a way to improve its cost even more, because this is a matter of spelling that I have to deliver online and unfortunately a test gave a high time cost about 3s, if you have any idea how you could improve would be very grateful, or if you know some faster Levenshtein with a better cost of time, I would also be very grateful, I will leave the link of the question if you want to make some tests https://olimpiada.ic.unicamp.br/pratique/p2/2008/f2/ortografia/

n, m = [int(c) for c in input().split()]
dicionario = [0] * n
palavras = [0] * m
for i in range(n):
    n = input()
    dicionario[i] = n.lower()
for i in range(m):
    m = input()
    palavras[i] = m.lower()

def levenshteinDistance(s1, s2):
    if len(s1) > len(s2):
        s1, s2 = s2, s1
    distances = range(len(s1) + 1)
    for i2, c2 in enumerate(s2):
        distances_ = [i2+1]
        for i1, c1 in enumerate(s1):
            if c1 == c2:
                distances_ += [distances[i1]]
            else:
                distances_ += [1 + min((distances[i1], distances[i1 + 1], distances_[-1]))]
        distances = distances_
    return distances[-1]

for i in range(len(palavras)):
    lis = []
    for n in dicionario:
        if len(n) <= len(palavras[i]) + 2 or len(n) < len(palavras[i]) + 2:
            if levenshteinDistance(palavras[i], n) <= 2:
                lis += [n]
        else:
            pass
    palavras[i] = lis
    print(' '.join(palavras[i]), end="")
    if palavras[i] != palavras[-1]:
        print(" ", end='')
    print('')
  • the creation of the two initial lists can be worked out to integrate the two loops for which comes in sequence

  • The two initial lists are of the entries, how could you do this ??

1 answer

0

following the entry and exit of the exercise, the initialization of the initial lists becomes costly

dicionario = [0] * n
palavras = [0] * m

what happens is that a single position list is created several times, you can make a list comprehension already eliminating the loops in sequence

dicionario = [input().lower() for x in range(n)]
palavras = [input().lower() for x in range(m)]

you play the loop for inside the list, you won’t be saving on the loop, but on the creation of the starting lists, 6 lists for two

another observation is that you can withdraw the . Power(), the input of the exercise is already all in minuscule, and the words with capital letters and minuscula are different, this makes add one more operation

[input() for x in range(n)]

gets 6 function calls less

  • I think this would not make the algorithm out of the 3s kkkkkkkkkkkk, I’m still wondering what input is this that lasts 3s, this is not from God

  • truth, this change is only a slight optimization, maybe it would also be advantageous to look at the other 4 loop in the code, maybe I can reduce a

  • I tried to reduce but not be able to, I know if I can improve the algorithm of Levenshtein, I believe I can, because to hear some people saying that they have a solution of O(m*n) m is the line and n is the column, but not being able to implement !

  • another way to improve the code is to avoid the calls of Len, use variable to store the result of Len that will be used more than once, I think consulting a variable is faster than calling a function

Browser other questions tagged

You are not signed in. Login or sign up in order to post.