How to reduce the number of loops to calculate the smallest difference between all the numbers in a list?

Asked

Viewed 177 times

6

The program below returns the smallest possible difference between all the elements of a list:

def calculo(A):
    r = float("inf")
    n = len(A)
    for i in range(0, n-1):
        for j in range(1, n):
            if abs(A[i] - A[j]) < r:
                r = abs(A[i] - A[j])
    print(r)

But I’m tired of writing all the codes that need to compare pairs of elements from the same list, as if I were writing a Bubble Sort.

How do I reduce the number of loops in the above code? What’s the principle to start thinking differently and start writing more efficient code?

  • 1

    Maybe the second for should be: for j in range(i+1, n):?

1 answer

5


An alternative is to first sort the list, and then go through it only once, calculating the difference between each element and the next. As the list will be ordered, it is not necessary to calculate the difference between the first and the third, for example, because it will certainly be greater or equal to the difference between the first and the second.

def menor_diferenca(lista):
    r = float("inf")
    lista = sorted(lista)
    for i in range(len(lista) - 1):
        diff = abs(lista[i] - lista[i + 1])
        if diff < r:
            r = diff
    return r

Of course, sorting the list has its cost, but according to the documentation, the algorithm used is the Timsort, that ensures good performance (O(n logn) in the worst case), which is already better than two nested loops.

One detail is that your second loop is not right. For example, when i for 3, you make a for j in range(1, n), that is to say, j will also pass through the index 3 and then the difference will be zero (if the list does not have repeated values, the result will be - erroneously - zero). So actually this loop should start at i + 1.


Making a basic comparison using the module timeit to measure times:

def com_loop(lista):
    r = float("inf")
    n = len(lista)
    for i in range(0, n - 1):
        for j in range(i + 1, n):
            diff = abs(lista[i] - lista[j])
            if diff < r:
                r = diff
    return r

def com_sort(lista):
    r = float("inf")
    lista = sorted(lista)
    for i in range(len(lista) - 1):
        diff = abs(lista[i] - lista[i + 1])
        if diff < r:
            r = diff
    return r


import random
# gerando uma lista com 1000 números aleatórios
numeros = random.sample(range(1, 1000000), 1000)

import timeit

print(timeit.repeat('com_loop(numeros)', repeat=3, number=100, globals=globals()))
print(timeit.repeat('com_sort(numeros)', repeat=3, number=100, globals=globals()))

The exact running time varies from one machine to another, and even on the same machine can have variations between one execution or another, but just to get an idea, the result I got was:

[24.693826, 24.928060100000003, 25.2158304]
[0.13898210000000688, 0.14047560000000203, 0.10606690000000185]

That is, the loops nested took about 24 seconds, while the algorithm traversing the ordered list took about 0.13 seconds.

I also put the code in IDE’s online and the result was very similar (the version with the sorted list is much faster than both loops). See on Repl.it and in the Ideone.com (in the latter had to decrease the amount of executions not to exceed the time limit of the site).


You could also use lista.sort() to sort the list. The difference is lista.sort() modifies the list itself, while sorted returns another list.

  • 1

    Thank you so much for all the detail, and for the benchmark, impressive as n.log(n) grows so much smaller than exponential. I hadn’t really touched that with the ordered list I would only need to iterate once, and I didn’t know about timsort, for me, quicksort was the best option available

Browser other questions tagged

You are not signed in. Login or sign up in order to post.