Running functions with and without creating local variables

5

Despite the simplicity of this question, I could not find an answer to it on the Internet.

Is there any structural/performance difference between the two functions defined below?

def F(x):
    Output = x + 1
    return Output

def F(x):
    return x + 1

4 answers

6

In general, I can say the difference is zero. The result must be the same; that cannot change. Performance is not expected to differ either, but nothing guarantees it. Whether one version is faster depends on the implementation, and there is more than one Python. All implementations must produce the same result, but the performance characteristics of each are its own concern.

That said, I am against creating variables without need. A variable should exist for one of two reasons:

  • You need to store an intermediate result for later use in more than one place, because fetching that value more than once would be costly or would produce wrong results; or the order in which the information is obtained and used matters, so keeping the value is important.
  • You need to better document something that is not obvious when reading the code, so a variable with a good name is used to indicate what that result is.
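As a rough illustration of those two reasons (the function names and the income threshold below are hypothetical, made up only for this sketch), a variable earns its place when the value must be read more than once or when its name explains a condition:

```python
import random

def describe_roll(rng):
    # Reason 1: the value is used in more than one place. Calling
    # rng.randint() twice would give two different rolls, so the
    # result has to be kept in a variable.
    roll = rng.randint(1, 6)
    parity = "even" if roll % 2 == 0 else "odd"
    return f"{roll} is {parity}"

def is_eligible(age, income):
    # Reason 2: the names document conditions that are not obvious
    # from the raw comparisons. The 50_000 cap is a made-up number.
    is_adult = age >= 18
    below_income_cap = income < 50_000
    return is_adult and below_income_cap

print(describe_roll(random.Random()))
print(is_eligible(20, 30_000))
```

Neither reason applies to the function in the question: the value is used once and the name Output says nothing the return statement does not already say.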

In general, I see programmers creating the variable because they do not know they can do without it. The programmer just follows a recipe and does not understand what a variable is. I have also seen people justify it as readability, yet the variable name they use is completely meaningless, which shows that the justification does not hold. Take this case, artificial as I know it is: the name means nothing. Does it indicate that the value is an output? Duh! That does not make the code more readable.

In this case, although both versions work and always give the same result, from a stylistic point of view (and this is partly a matter of taste) I consider the first code to be a mistake. In another context, in code close to this one where a variable would make sense as documentation, I would think differently.

fernandosavio's answer shows that the main Python implementation generates different bytecode when the variable is used, which I would call unfortunate since the variable serves no purpose. But it does not surprise me: Python is not a suitable language when the concern is performance. If you need performance, use another language.

There could well be an optimization for this, and one day there may be. hkotsubo's answer shows that this implementation is in fact slower when the variable is used, but that is a transitory situation.

I reinforce that you should look for the most readable option: if using the variable makes the code more readable, use it, which is not the case in the question's example. Python is primarily a scripting language, so worrying about micro-optimization in it makes no sense. The example should not use the variable because, besides being faster, returning the value right away is more readable; a variable that adds nothing is noise.

5


Just to complement Maniero's answer:

You can use the dis module (disassembler) to check the difference between the bytecode generated for each function.

Example:

from dis import dis

def exemplo_1(x):
    output = x + 1
    return output

def exemplo_2(x):
    return x + 1

print('>>> dis(exemplo_1)')
dis(exemplo_1)

print('-' * 60)

print('>>> dis(exemplo_2)')
dis(exemplo_2)

Repl.it with the code running

The output (in Python 3.7.1) is:

>>> dis(exemplo_1)
  4           0 LOAD_FAST                0 (x)
              2 LOAD_CONST               1 (1)
              4 BINARY_ADD
              6 STORE_FAST               1 (output)

  5           8 LOAD_FAST                1 (output)
             10 RETURN_VALUE
------------------------------------------------------------
>>> dis(exemplo_2)
  8           0 LOAD_FAST                0 (x)
              2 LOAD_CONST               1 (1)
              4 BINARY_ADD
              6 RETURN_VALUE

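It is a good tool for testing simple cases like this. Even if you do not fully understand the output, you can already see that exemplo_1 executes two extra instructions (STORE_FAST and LOAD_FAST) to store the result in the local variable output and then load it back.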
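If you would rather compare the two functions programmatically than read the listings by eye, a minimal sketch using dis.get_instructions could look like the one below. The helper name opnames is my own, and the exact instruction names depend on the interpreter version (for example, BINARY_ADD became BINARY_OP in Python 3.11), so treat the printed lists as specific to whatever Python you run it on.

```python
import dis

def exemplo_1(x):
    output = x + 1
    return output

def exemplo_2(x):
    return x + 1

def opnames(func):
    # Names of the bytecode instructions, in order, as compiled by
    # the interpreter that is running this script.
    return [instruction.opname for instruction in dis.get_instructions(func)]

ops_1 = opnames(exemplo_1)
ops_2 = opnames(exemplo_2)

print(ops_1)
print(ops_2)
print("extra instructions in exemplo_1:", len(ops_1) - len(ops_2))
```

On CPython, the first list should contain a store to the local variable that the second list does not have.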

5

Complementing the other answers, a simple way to compare performance is to use the timeit module.

import timeit

def exemplo_1(x):
    output = x + 1
    return output

def exemplo_2(x):
    return x + 1

n = 100000000
rep = 5
print(timeit.repeat("exemplo_1(1)", 'from __main__ import exemplo_1', number = n, repeat = rep))
print(timeit.repeat("exemplo_2(1)", 'from __main__ import exemplo_2', number = n, repeat = rep))

In the example above I call each function 100 million times and repeat that cycle 5 times. The return value is a list with the time, in seconds, of each of the 5 cycles:

[10.230189208560486, 10.76496154874702, 10.183613784861148, 9.914715879252743, 9.953630417515548]
[9.250180609985641, 9.20510965178623, 9.140656262847259, 9.346281658065251, 9.511226614071674]

The times can vary with each run, since they depend on several factors (your hardware, whether other processes were running on the machine, etc.), so you will not necessarily get the same results as I did. But note that the second version (without creating the variable output) is slightly faster (a difference of about 1 second, more or less).

But that was for 100 million executions. When I changed n to 1 million, the difference between the first and second versions fell to about one hundredth of a second. And for smaller programs, where the function is executed only a few times, it makes even less difference, to the point of being irrelevant (for n equal to 100, for example, I got differences on the order of 1 microsecond, the sixth decimal place of a second).
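To put those totals in perspective, it can help to look at the cost per call. The sketch below is only an approximation: the helper per_call_seconds and the smaller number of calls are my own choices so that it finishes quickly, and the lambda wrapper adds the same small overhead to both measurements.

```python
import timeit

def exemplo_1(x):
    output = x + 1
    return output

def exemplo_2(x):
    return x + 1

def per_call_seconds(func, number=200_000, repeat=5):
    # timeit.repeat returns one total time (in seconds) per cycle.
    # The minimum is the cycle least disturbed by other processes.
    timings = timeit.repeat(lambda: func(1), number=number, repeat=repeat)
    return min(timings) / number

cost_1 = per_call_seconds(exemplo_1)
cost_2 = per_call_seconds(exemplo_2)

print(f"exemplo_1: {cost_1 * 1e9:.1f} ns per call")
print(f"exemplo_2: {cost_2 * 1e9:.1f} ns per call")
print(f"difference: {(cost_1 - cost_2) * 1e9:.1f} ns per call")
```

The numbers will differ from machine to machine and from run to run, and on a noisy machine the difference may even come out negative.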

Honestly, unless your code really needs to run hundreds of millions of times in a row and performance is extremely critical, you should not worry about this. The main concern should be writing readable code and using variables where they make sense, as already well explained in Maniero's answer.

And if your system is experiencing performance problems, the cause certainly will not be in functions like these. In that case, you should run specific performance tests to find out where the bottlenecks are.
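One possible way to look for bottlenecks is the standard library's cProfile module. The sketch below is only an illustration: cheap_step, slow_step, and workload are hypothetical stand-ins for real application code.

```python
import cProfile
import io
import pstats

def cheap_step(x):
    # Stand-in for a trivial function like the one in the question.
    return x + 1

def slow_step(n):
    # Hypothetical hot spot that dominates the running time.
    return sum(i * i for i in range(n))

def workload():
    total = 0
    for i in range(1000):
        total += cheap_step(i)
    total += slow_step(200_000)
    return total

def profile_report(func, top=5):
    # Run func under the profiler and return the report as text,
    # sorted by cumulative time and limited to the top entries.
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()

report = profile_report(workload)
print(report)
```

In a report like this, the time spent in something like slow_step should stand out, while the thousand calls to cheap_step barely register.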

3

There is also the %time magic command that I use in Jupyter, but in this case the time difference was so tiny that it did not register. Here is another example, connecting to an API, where the time was recorded.

Your code with %time

These commands are part of the well-known IPython magic commands. There is another very interesting one where a difference did show up:

Besides that, here is the link where you can see all the magic commands:

IPython Magic

Note: sorry if the post is messy, I'm new around here; feel free to edit anything!
