How to calculate the result of an arithmetic expression contained in a tuple respecting the operators' precedence?

Asked

Viewed 349 times

3

I have a tuple containing the tokens of an arithmetic expression: ('5', '+', '2', '*', '5').

I tried to calculate the result of the expression, but it is not respecting the precedence of the operators. In this case, it is not doing the multiplication first:

valores = ('5', '+', '2', '*', '5')
soma = 0
for indice, valor in enumerate(valores):
    if str(valor).isalnum():
        if indice == 0:
            soma = int(valor)
        else:
            soma = soma
    elif str(valor) == '+':
        soma += int(valores[indice + 1])
    elif str(valor) == '-':
        soma -= int(valores[indice + 1])
    elif str(valor) == '*':
        soma *= int(valores[indice + 1])

print(soma)
  • Could be a string "5 + 2 * 5" too

  • You just want to evaluate this specific string, or you want to make a full expression parser?

  • complete with (n) number of digits

  • 2

    There are at least two problems in the question. The title has nothing to do with the operator problem, and the accepted answer indicates whether it is a XY problem. In the next ones it is important to reduce the problem to a [mcve] and stick to a specific problem according to the site scope. To better understand how the site works, it would be good to read on Survival Guide. It would be interesting [Dit] to remove ambiguities.

2 answers

6


Do not use eval (or "use sparingly")

Yes, I know that "works", that the solution was very "easy", with a code "simple", as shown in another answer. I’m not saying she’s wrong, I just think it’s worth mentioning the problems of using eval indiscriminately. I agree that it really looks very good and is very tempting not to use. But eval, besides being controversial, hides various dangers, since he can execute any arbitrary code which is passed (the examples below have been removed of this answer):

# eval executa **qualquer coisa**, sem "pensar"

# comandos arbitrários que fazem coisas que você não quer
eval("__import__('os').remove('arquivo_importante')") # apaga um arquivo que não deveria ser apagado 

# cálculo demorado (consome CPU e memória)
eval("9**9**9**9**9**9**9**9", {'__builtins__': None})

Of course, if you "are sure" that you will only receive valid numeric expressions and "simple", you will have no problems using eval. But use it indiscriminately, without thinking, without at least validating the string that is passed to it, can cause several problems.

In the answer already linked formerly is shown how to make a parser that only accepts numeric expressions, and can limit the maximum value of each operation (thus you avoid cases of expressions that can generate very large numbers, before they burst the memory):

import ast
import operator as op
import functools

def limit(max_=None):
    """Decorator que limita o valor do resultado."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            ret = func(*args, **kwargs)
            try:
                mag = abs(ret)
            except TypeError:
                pass # not applicable
            else:
                if mag > max_:
                    raise ValueError(f'resultado acima do valor máximo permitido: {ret}')
            return ret
        return wrapper
    return decorator

# operações suportadas
operators = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul,
             ast.Div: op.truediv, ast.Pow: op.pow, ast.BitXor: op.xor,
             ast.USub: op.neg}

# você pode sobrescrever operações, definindo valores limites para os operandos
# eu poderia fazer algo similar para todas as operações acima
def pow_limited(a, b):
    if any(abs(n) > 100 for n in [a, b]):
        raise ValueError(f'operandos da exponenciação devem ser menores que 100: {(a,b)}')
    return op.pow(a, b)
operators[ast.Pow] = pow_limited

@limit(max_=10**100) # valor máximo do resultado limitado em 10**100 (se der mais que isso, lança um ValueError)
def eval_(node):
    if isinstance(node, ast.Num): # <number>
        return node.n
    elif isinstance(node, ast.BinOp): # <left> <operator> <right>
        return operators[type(node.op)](eval_(node.left), eval_(node.right))
    elif isinstance(node, ast.UnaryOp): # <operator> <operand> e.g., -1
        return operators[type(node.op)](eval_(node.operand))
    else:
        raise TypeError(f'Expressão inválida: {node}')

def eval_expr(expr):
    return eval_(ast.parse(expr, mode='eval').body)

Some examples of use:

print(eval_expr('5 + 2 * 5')) # 15

# ou, para o seu caso específico
valores = ('5', '+', '2', '*', '5')
print(eval_expr(''.join(valores))) # 15

try:
    print(eval_expr("9**9**9**9**9**9**9**9")) # lança um ValueError
except ValueError as e:
    print(e) # operandos da exponenciação devem ser menores que 100: (9, 387420489)

try:
    print(eval_expr("10 ** 100 + 1")) # lança um ValueError
except ValueError as e:
    print(e) # resultado acima do valor máximo permitido: 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001

try:
    # lança um TypeError
    print(eval_expr("__import__('os').remove('arquivo_importante')"))
except TypeError as e:
    print(e) # Expressão inválida: <_ast.Call object at 0x00F4C790>

try:
    # lança um TypeError
    print(eval_expr("a + 1"))
except TypeError as e:
    print(e) # Expressão inválida: <_ast.Name object at 0x02DAC7F0>

Unfortunately, according to module documentation ast, the class ast.Num used in the code above will be deprecated from Python 3.8 and can be removed in future versions (I tested the code in Python 3.7 and it still works).

In this case, you can still make your own parser, or install existing ones. A good example is numexpr:

import numexpr as ne

print(ne.evaluate('5 + 2 * 5')) # 15
print(ne.evaluate('__import__('os').remove('arquivo_importante')')) # SyntaxError

See the documentation of numexpr for more details.

It may seem exaggerated to install a "just" package for this, but the simple fact of not accepting anything (such as the eval) is already a great advantage (although it still has the problem of not being able to limit the maximum value of intermediate operations, as done in the previous code with pow_limited).


Of course you can also validate the expression, and just call eval if it is valid. A very "naive" way would be to check whether all elements of the tuple are numbers or mathematical operations:

def is_valid(expr):
    def _valid(s):
        if s in ('+', '-', '*', '/', '**'): # verifica se é uma operação válida
            return True
        try: # tenta transformar em número
            int(s) # ou float(s), se quiser aceitar números com casas decimais
            return True
        except ValueError:
            return False
    # verifica se todos os elementos são válidos (números ou operações)
    return all(_valid(s) for s in expr)

def evaluate(valores):
    if is_valid(valores):
        return eval(''.join(valores))
    raise ValueError('Não é uma expressão válida')

valores = ('5', '+', '2', '*', '5')
print(evaluate(valores)) # 15
print(evaluate(("__import__('os')", ".remove('arquivo_importante')"))) # ValueError

As I said, this is a very naive way, because it validates only the parts and not the whole expression itself. So if the tuple only has operations, or numbers and operations out of order, etc., it is erroneously considered valid. If you only have numbers, for example ('4', '2'), the join and the result will be 42. But if you only have operations (like ('*', '/')) or incomplete/invalid expressions (such as ('4', '*')), there will be a SyntaxError (in this case you could capture it with a block try/except, to find out if there was a mistake, for example).

Despite being a simplistic solution, it already eliminates many cases of arbitrary and malicious code. And it would still be possible to limit the values of operands within the function _valid, for example.

Obviously a parser more complete than accepting only arithmetic expressions would be a little more complex than that, but perhaps the most appropriate solution.

And as pointed out by @jsbueno in the comments, maybe you don’t even need to join to generate the string. If the tuple is already properly tokenized (that is, each element of it is already a part of the expression), it would be enough to transform it into a tree and calculate the result (as an example, it was mentioned this video).

A version simplistic (and much less complete than the above mentioned links) would be:

from operator import add, sub, mul, truediv

operators = { '+': add, '-': sub, '*': mul, '/': truediv }

class Node:
    def __init__(self, value, left=None, right=None):
        if value in operators:
            self.value = value
        else:
            self.value = float(value)
        self.left = left
        self.right = right

    def is_op(self):
        return self.value in operators

    def is_leaf(self):
        return self.right is None and self.left is None

    def calc(self):
        if self is None:
            return 0
        if self.is_leaf():
            return self.value
        return operators[self.value](self.left.calc(), self.right.calc())

    def right_most(self):
        r, parent = self, None
        while r.right is not None:
            parent, r = r, r.right
        return r, parent

valores = ('5', '+', '2', '*', '5')

def create_tree(valores):
    root = Node(valores[0])
    if root.is_op():
        raise ValueError('expressão inválida')

    for v in valores[1:]:
        node = Node(v)
        if node.is_op() and not root.is_op():
            node.left = root
            root = node
        elif root.is_op() and not node.is_op():
            if root.left is None:
                root.left = node
            elif root.right is None:
                root.right = node
            else:
                root.right_most()[0].right = node
        elif root.is_op() and node.is_op():
            if node.value in ('*', '/'):
                r, parent = root.right_most()
                node.left = r
                parent.right = node
            else:
                node.left = root
                root = node

    return root

root = create_tree(valores)
print(root.calc()) # 15.0

It may seem like a lot of work when "something is already ready", but at least it will only accept valid expressions (an invalid expression will give error or at the time of building the tree, or when calculating the value - as I said, it is quite simplistic, just to get an idea of how it would look, so be sure to follow the links indicated for a more complete version). If you don’t want to build on your own, you can use the numexpr for example.

This code is very simplistic because it only accepts the 4 basic operations, does not consider parentheses, etc (and I have not done extensive tests with very complex expressions, it may need some adjustments yet). Basically it takes each element of the tuple and goes riding a tree. In the case of ('5', '+', '2', '*', '5'), would look like this:

    +
   / \
  5   *
     / \
    2   5

The operations with greater precedence are being thrown down. And to calculate it, it resolves from bottom to top (first multiplies 2 with 5, and then sums up 5 with the result of multiplication).

The result is 15.0 because I turned the numbers into float. But if you only want to accept whole numbers, use int in place.


Finally, on the eval, read here, here and here. Not that it is totally wrong to use it, but it is important to understand the problems it can cause, so that you take proper care.

  • 1

    I know calling the other "wrong" would be a bit of an exaggeration maybe, but I find a bad example of a solution for a site that proposes itself as a reference. And to make matters worse, clearly if the answer fit, it was a XY problem of the author of the question, as he starts from a code reasonably ready and exchanges it for a reprehensible solution. In my view the only possible answer to the question would be to take what has already been used in the code and complete it by correcting (but then it becomes "do it for me", because the question does not specify the central problem clearly).

  • 1

    To be "right" then the good thing is not to use it. The tuple is already "tokenized" - which is half the problem solved to create a tree and have a dynamic hiccup. I commented on the other reply with the link to a really cool video.

  • 2

    @jsbueno Oh yes, it’s just that I was so concerned with alerting to the problems of eval I forgot to mention this. In fact, mounting a tree from the tuple is much better than creating a string just to do Parsing again. If you give me time I’ll do it (although one of the links at the end of the answer already has a well completed parser implemented), but my goal was more alert to the use of eval same. I found it kind of "inconsequential" (irresponsible even) to suggest its use without at least warning of the problems he has...

  • 1

    @jsbueno About the numexpr, I did some quick research and ended up finding it a good option for evaluating arithmetic expressions. But you know if there are other alternatives as good as or better?

  • 1

    "Eval" doesn’t even have that many problems - it’s not religion - if you know what you’re putting for it, it’s as dangerous as any query in any database (any query, after all, is uam string in the source language code, which "runs" (i.e., the same thing as an Eval) within the database.

  • 1

    @jsbueno Yes, I may have exaggerated the "alarmist tone". But just as an answer that suggests using code susceptible to SQL Injection is not a good answer, I understand the same of an answer that suggests using eval without even mentioning that it can give problem if not careful. Anyway, thanks for the comments always relevant :-)

  • @jsbueno I put a version well simplified tree, just to show the general idea, because the links indicated already have more complete versions (and also because I was too lazy to elaborate more...) Anyway, there is it :-)

Show 2 more comments

4

To transform a string list in Python, we can use the method join of str:

valores = ('5', '+', '2', '*', '5')
soma = ' '.join(valores)
print(soma)

With this, the values of your tuple will form a string, each value separated by a space: "5 + 2 * 5"

See online: https://repl.it/repls/TragicRotatingDecagon


If you want to have the value of this expression formed, you can use the function eval:

valores = ('5', '+', '2', '*', '5')
soma = eval(''.join(valores))
print(soma)

Note: Here I did not put the space, as another example.

The result here will be 15, because the expression will be evaluated, multiplying and summing the values.

See online: https://repl.it/repls/KhakiSpiffyCron


Documentations:

https://docs.python.org/3/library/stdtypes.html#str.Join

https://docs.python.org/3/library/functions.html#Eval

  • 2

    If you already have the tokens separated - put it all together in a string and do Val, vi work, plus, let’s say it’s "Buddy" isn’t? I even think it’s cool as a quick fix - but the cool one from the tuple. is to create a tree with operations and operands on the leaves - and solve the problem like this. The other day I saw a very didactic video of a teacher who was doing this -let me see if I can find.

  • 4

    I found - are 22 minutes only - gives a peck - is much simpler than we think at first: https://www.youtube.com/watch?v=7tCNu4CnjVc

Browser other questions tagged

You are not signed in. Login or sign up in order to post.