Find larger and smaller unique values

Asked

Viewed 11,403 times

1

I’m trying to find the smallest and largest single value of a Python array, but something is still missing.

def pegaExtremosUnicos(v):
    i = 1
    for i in range(len(v)):
        atual = v[i]
        j = i - 1;
        while (j >=0) and (atual < v[j]):
            v[j +1] = v[j]
            j = j -1
        v[j + 1] = atual
    i = 1
    while i < len(v):
        if v[i] == v[i-1]:
            del v[i]
            del v[i-1]
        i = i +1
    d = [v[0],v[len(v)-1]]
    return tuple(d)

What was my mistake in this algorithm?

  • Why don’t you use the max and min?

  • Because the max and min should be unique

  • What’s wrong? In this test it seems to work: https://ideone.com/aFBwtd

  • Performing some tests I saw that in some cases, it returns numbers that repeated.

  • Matheus but don’t really want to use 'min/max'? Or is it just because they would return repeated values (in case there are equal max/min numbers in the list)? For this last case, I put a solution below

  • Your solution served Miguel. Thank you

  • You’re welcome Matheus

Show 2 more comments

3 answers

4


This can be done even using max/min, you have to define the set of values in which they will act, in this case we do not want to include the repeated elements (occurrences > 1) in this set:

vals = [123,4,5,4,6,7,8,300,200,150,300]
uniques = [i for i in vals if vals.count(i) < 2] # lista de todos os valores unicos de vals
max_value = max(uniques)
min_value = min(uniques)
print(max_value) # 200
print(min_value) # 5

In which vals.count(i) will count the number of occurrences of i on the list vals. ex: vals.count(300) in this case give us 2, that is not even on our new list uniques

PS: I tested your code you put in and as @zekk says in the comment, it also seems to be working well

2

An alternative to get unique elements is to use Counter to get the count of each element of the array, and then only check the largest and smallest among those whose amount is 1.

Another detail is to use min and then max Separately, although it works, it’s not very efficient because each of these functions goes through the entire list, so you’re going through all the elements twice (one to find the smallest, one to find the biggest). A better alternative would be to walk only once and check both in the same loop:

from collections import Counter

vals = [123, 4, 5, 4, 6, 7, 8, 300, 200, 150, 300]
c = Counter(vals)
menor, maior = float('inf'), float('-inf')
for n, qtd in c.items():
    if qtd == 1:
        if n > maior:
            maior = n
        if n < menor:
            menor = n

print(f'menor={menor}, maior={maior}')

Another detail is that the other answers are using count to check that the element does not repeat. The problem is that each call from count goes through the entire list to get the count of that element, and this is done for each element in the list. That is, when you do this:

[i for i in vals if vals.count(i) < 2]

It’s going through all the elements of the list, and for each of them, count(i) Scroll through the list again to get the count. This is extremely redundant (especially when the element repeats several times, because we do the same count for it several times), in addition to inefficient well.

When you go through a list several times without need, you are creating a variation of the call Shlemiel the Painter’s Algorithm (which is actually a joke to designate a bad algorithm).

Of course for small lists it won’t make that much difference (after all, for few data, everything is fast), but for larger lists begins to give problem. See a quick test done with the module timeit, comparing the algorithms of the other answers:

from collections import Counter

def com_list_count(vals): # uma das respostas
    uniques = [i for i in vals if vals.count(i) < 2] # lista de todos os valores unicos de vals
    return max(uniques), min(uniques)

def com_list_count2(lista): # outra resposta
    listaSemRepetidos = []
    for i in range(len(lista)):
        if lista.count(lista[i]) == 1:
            listaSemRepetidos.append(lista[i])
    if listaSemRepetidos == []:
        return None, None
    else:
        return max(listaSemRepetidos), min(listaSemRepetidos)

def com_counter(vals): # solução sugerida com Counter e um único loop em vez de usar min e max
    c = Counter(vals)
    menor, maior = float('inf'), float('-inf')
    for n, qtd in c.items():
        if qtd == 1:
            if n > maior:
                maior = n
            if n < menor:
                menor = n
    return maior, menor

from random import choices
# gerar uma lista com 10 mil números
# usar intervalo entre 10 e 999 para ter vários números repetidos
vals = choices(range(10, 1000), k=10000)
# adicionar números únicos (que não se repetem)
vals.extend([8, 2500, 3230, 4])

from timeit import timeit

# executa 5 vezes cada teste
params = { 'number' : 5, 'globals': globals() }

print(timeit('com_list_count(vals)', **params))
print(timeit('com_list_count2(vals)', **params))
print(timeit('com_counter(vals)', **params))

I created a list with 10,000 numbers between 10 and 999 (that is, with many repeated values). Then I added some unique values (just use values outside this range).

Times may vary depending on the hardware and other factors, but anyway, in my machine the times were (in seconds):

7.6565183
7.7807781
0.0035252999999997314

That is, the options of the other answers (that use count to know if the element repeats, and to call min and max separately) took about 2100 times longer than the solution with Counter. I also checked in IDE’s online and the results were similar:

  • Ideone.com (I had to run only 1 time instead of 5 to not exceed the time limit): Counter was about 1400 times faster
  • Repl.it: Counter was over 7000 times faster

This is because, as already explained, each call from count you need to go through the entire list again to get that element counted. And as this is done for all the numbers on the list, it makes the algorithm extremely inefficient (and still has the "bonus" to call min and max separately, which causes the list to be covered 2 more times - but what does "damage" are the calls to count within the loop).

On the other hand the Counter you only need to scroll through the list once and already count the quantities of each element. Then we only do another loop to find the largest and smallest. By not going through the list several times, this makes the algorithm much faster.

As already said, for small lists the difference will be insignificant. But for larger lists, it is worth being aware of these details.

  • This wins hkotsubo’s

2

can be done so tbm: import

def valoresUnicos(lista):
    listaSemRepetidos = []
    for i in range(len(lista)):
        if lista.count(lista[i]) == 1:
            listaSemRepetidos.append(lista[i])
    if listaSemRepetidos == []:
        return None, None
    else:
        return min(listaSemRepetidos), max(listaSemRepetidos)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.