Validate an expression based on which parentheses should close

Question

Validate an expression based on which parentheses should close

Asked 3 years, 11 months ago

Viewed 342 times

2

I’m solving a Python exercise where I had to validate an expression based on which the parentheses should close right:

Create a program where the user type an expression that uses parentheses. Your application should analyze whether the expression passed is in open and closed parentheses in the correct order.

Ex:

((a+b)*c)(x+y(3+2/3)*z) #é uma expressão válida
))(( #não é válida
6+)+85( #não é válida

But my code always validates the expression:

n1 = str(input('Digite uma expressão: ')).split()
for n2 in range(0, len(n1)):
    if n1[n2] == '(':
        for n3 in range(n2 + 1, len(n1)):
            if n1[n3] == ')':
                del n1[n2]
                del n1[n3]
                break
print(n1)
if '(' in n1:
    print('Expressão inválida')
elif ')' in n1:
    print('Expressão inválida')
else:
    print('expressão válida')

2 answers

2

If the idea is only check the parentheses (without taking into account if the rest is a valid arithmetic expression), you only need one counter, and only one loop running through the string characters and updating this counter.

If you find a (, increment the counter by 1. If you find a ), decrease the counter by 1. If at any time the counter is negative, it means that a ) who doesn’t have the ( corresponding (and then we don’t even need to check the rest).

At the end of loop, if the counter is zero, it is because each ( has his ) and the expression is valid:

expressao = input('Digite uma expressão: ')
cont = 0
for c in expressao:
    if c == '(':
        cont += 1
    elif c == ')':
        cont -= 1
    # se o contador é negativo, foi encontrado um ")" sem "(" correspondente
    if cont < 0:
        break # sai do for

if cont == 0:
    print('Expressão válida')
else:
    print('expressão inválida')

In the for i go through the string characters (note also that you do not need to split, just go through the whole string at once). And for each character, I update the counter as the logic already explained above.

If in the middle of loop the counter turn negative, I already know that the expression is invalid (it has a ) without their () and it’s no use checking the rest, so I interrupt the loop with break.

Another detail is that input already returns a string, so do str(input()) is redundant and unnecessary.

The other answers are unnecessarily complicating the algorithm (Obs: one of them has been deleted). Both use two loops nested (a for within each other, traversing the same string several times from the beginning) and multiple calls from replace, that create a new string.

That is, in addition to loops unnecessary, several strings are created during the process, which makes the algorithms very inefficient. In my example above, only one is made loop and no other string is created unnecessarily. Just because the code of the other answers "worked" doesn’t mean they are the best options.

One of them - the one that was erased - can be even more inefficient, because in addition to replace, still uses several times split (creating a list) and join (that joins the list into a string), and uses index (that traverses the string from the start, to find the element’s index - and use this in a loop often makes the algorithm very inefficient).

Of course, for a few small strings, the difference will be imperceptible (fractions of seconds), but run a few thousand times and the difference will be clearer. We can do a simple test with the module timeit:

def loops_aninhados(n1):
    for n2 in range(0, len(n1)):
        if n1[n2] == '(':
            for n3 in range(n2 + 1, len(n1)):
                if n1[n3] == ')':
                    n1 = n1.replace('(', 'o',1)
                    n1 = n1.replace(')', 'o', 1)
                    break
                    
    if '(' in n1:
        return False
    elif ')' in n1:
        return False
    else:
        return True

def loops_aninhados2(n1):
    for n2 in n1:
        if n2 == '(':
            for n3 in n1:
                if n1.index(n3) > n1.index(n2):
                    if n3 == ')':
                        n1 = ''.join(n1).replace(n2, '', 1).replace(n3, '', 1)
                        n1 = n1.replace('', ' ').strip().split(' ')
                        break
    if '(' in n1 or ')' in n1:
        return False
    else:
        return True

def loop_simples(n1):
    cont = 0
    for c in n1:
        if c == '(':
            cont += 1
        elif c == ')':
            cont -= 1
        # se o contador é negativo, foi encontrado um ")" sem "(" correspondente
        if cont < 0:
            return False

    return cont == 0

from timeit import timeit

# testando 100 mil vezes, com uma expressão válida e outra inválida
number = 100000
valida = '((a+b)*c)(x+y(3+2/3)*z)'
invalida = '6+)+85(x-y)+z(-13*2/23(r/4))'
print(timeit('loops_aninhados(valida)\nloops_aninhados(invalida)', number=number, globals=globals()))
print(timeit('loops_aninhados2(valida)\nloops_aninhados2(invalida)', number=number, globals=globals()))
print(timeit('loop_simples(valida)\nloop_simples(invalida)', number=number, globals=globals()))

timeit returns the running time in seconds. The exact time varies by various factors (such as itself hardware of the machine, if there were other things running at the same time, etc), but in general, the loops nested are slower (and the second, which uses split and join several times, it is slower still). On my machine times were:

1.3530971
6.647586800000001
0.32360029999999895

I ran a few times and the times - and especially the ratio between them - were similar: the first option was on average about 3 to 4 times slower (if compared to my solution), and the second was about 15 to 20 times slower. In Repl.it the times were different, but the relative difference between them was similar.

^{Anyway, it’s like one of our most famous users she says: "Working is different than being right".}

Finally, it is worth saying that to use for, split, join, replace and any other feature of language is not "wrong" by itself. But it’s important to know when and how to use them, and if you have a better, simpler and more efficient algorithm, and you don’t need them, it doesn’t make sense to keep using them just because "it worked".

And remembering once again that these solutions only check the balanced parentheses. If the idea were to verify that it is a valid arithmetic expression, then I would need to another solution.

Browser other questions tagged python string algorithm

You are not signed in. Login or sign up in order to post.

by magno.io • 1 point · Answer 1 · 2020-08-01T20:16:07+00:00

Hey there, Presado!! I want to be honest that your logic is very good!

I made some adjustments to your code to make it work! I’d like you to pay attention to the code!!

n1 = str(input('Digite uma expressão: '))

for n2 in range(0, len(n1)):
    if n1[n2] == '(':
        for n3 in range(n2 + 1, len(n1)):
            if n1[n3] == ')':
                n1 = n1.replace('(', 'o',1)
                n1 = n1.replace(')', 'o', 1)
                break
                
if '(' in n1:
    print('Expressão inválida')
elif ')' in n1:
    print('Expressão inválida')
else:
    print('expressão válida')

I hope I’ve helped!!