What is the problem with returning a local variable?

5

What can happen if I return a local variable? I read on the internet that it is not a good idea to return a local variable.

Maybe because the variable is deleted when the function exits?

Example:

std::string StrLower(std::string str)
{
    std::transform(str.begin(), str.end(), str.begin(), tolower);
    return str;
}
  • I’ve never heard that; I’m even marking your question as a favorite in case someone answers it. I believe it may be about saving memory, for example in Arduino projects, where we have to save memory because the hardware is more limited: if you pass the variable by reference it occupies only one place in memory, whereas if the function returns another variable it uses one more place in memory.

  • I see no sense in this statement, and unless you give the context in which you read it, I believe it will be impossible to evaluate anything.

2 answers

3

You never return variables. A variable is an abstract concept that makes the code easier to understand; since it is not something concrete, it cannot be carried to another place. So the local variable exists only there and has no way out.

You return a value. OK, I understand what you meant. A locally created value (allocated on the stack) can only be copied, because it lives in an area that is guaranteed to be alive only while the function is executing; after that, for all intents and purposes, consider the area destroyed. If you try to access the value at its original location, you will potentially be accessing garbage, which you must not do. By copying, you carry the value to an area that is sure to be alive when you access it.

Large types have a lot of data to copy, which ends up being slow. That’s why most large types are handled by reference.

Some types have reference semantics: instead of returning the value of the object itself, you return a reference to it. You can think of the object as having been moved, but only the reference (pointer) is copied, and the reference points to the actual value. That’s where the problem lies: you returned a reference to a value that will be destroyed at the end of the function’s execution, which will obviously cause trouble, simply put. That’s what people are talking about.

In this case the solution is to allocate on the heap, or to pass the value by reference, which lets this function write directly to the place where the calling function will use the value. That guarantees the value is alive when you use it.
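As a minimal sketch of the distinction above (the function names are hypothetical, chosen only for illustration), the following contrasts the problematic pattern with three safe alternatives: returning a copy, allocating on the heap, and writing through a reference supplied by the caller. The broken version is kept as a comment because using its result is undefined behavior.

```cpp
#include <memory>
#include <string>

// PROBLEM (kept as a comment on purpose): the returned reference points to
// a local object that is destroyed when the function exits.
//
//   const std::string& Dangling() {
//       std::string local = "abc";
//       return local;  // dangling reference
//   }

// Safe: return by value. The caller receives its own object.
std::string ByValue() {
    std::string local = "abc";
    return local;
}

// Safe: allocate on the heap. The object outlives the function and is
// released automatically when the unique_ptr is destroyed.
std::unique_ptr<std::string> OnHeap() {
    return std::make_unique<std::string>("abc");
}

// Safe: the caller owns the storage and passes it in by reference.
void ByReference(std::string& out) {
    out = "abc";
}
```

A caller could write `std::string s; ByReference(s);` or simply `std::string s = ByValue();`. In all three cases the string is still alive when the caller reads it.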

Maybe because the variable is deleted when the function exits?

Yes, although to use the correct terminology I would say it is because the value is potentially destroyed when the function exits.

With the errors fixed, the code works perfectly (I hope it’s just a generic example; I don’t think this is a good idea in real code):

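#include <algorithm>
#include <cctype>
#include <iostream>
#include <string>
using namespace std;

// Takes the string by value, lowercases the copy, and returns it by value.
string StrLower(string str) {
    // unsigned char avoids undefined behavior in tolower for negative char values.
    transform(str.begin(), str.end(), str.begin(),
              [](unsigned char c) -> char { return static_cast<char>(tolower(c)); });
    return str;
}

int main() {
    cout << StrLower("TESTE");
}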

See it working on ideone and on repl.it. I also put it on GitHub for future reference.

3


This claim is old and exists precisely because compilers in the past did not know how to optimize such code well, resulting in an executable with questionable performance (the variable is copied, and copies can be expensive). Nowadays, returning something local can even be better than using output parameters. Of course, never rely on popular sayings about optimization; always measure your program’s performance before drawing any conclusion.

We have two names for the types of possible optimizations in this case:

  • RVO: Return Value Optimization, and
  • NRVO: Named Return Value Optimization, which is basically a variation of RVO for cases where the value has a name (i.e. is a variable).

These two optimization techniques fall under a broader technique called copy elision (the omission of copies). In recent revisions of the C++ standard, copy elision is part of the standardization. Previously, the technique was mentioned as permitted, but the standard did not go into much detail about which cases were or were not allowed to omit copies.

With all this said, we can now observe the effects of RVO and NRVO:

When RVO is successfully applied, the copy (which would otherwise have been made) of an object that has just been created and returned by the function is omitted, so that the object’s storage area is the same as that of the object receiving the return value. To be clear, the following code:

#include <string>

std::string foo() { return "teste"; }

auto s = foo();

is, in effect, transformed into the following:

#include <string>

std::string s;

void foo() { s = "teste"; }

int main() { foo(); }  // the call writes directly into s

Notice how the optimization uses the storage of the outer variable s to assign the string literal "teste", instead of creating a new std::string and copying that object to s. Compiling with GCC 7.3 at optimization level 3, we get the following body for the function std::string foo():

foo[abi:cxx11]():
  lea rdx, [rdi+16]                    # Computes the location where `s` is
  mov rax, rdi
  mov DWORD PTR [rdi+16], 1953719668   # Writes "teste" into the buffer of `s`.
  mov BYTE PTR [rdi+20], 101
  mov QWORD PTR [rdi+8], 5             # Writes the size of the string.
  mov QWORD PTR [rdi], rdx
  mov BYTE PTR [rdi+21], 0             # Writes the string's null character.
  ret

Instead of creating a new std::string object, the function foo simply assumes that the object’s storage location already exists (i.e., whoever called the function has already allocated space for the object) and makes use of it.

The NRVO variation does exactly the same thing, except that it is extended to variables. If we had the following code:

#include <string>

std::string foo()
{
    std::string s_local = "teste";
    s_local[0] = 'T';
    return s_local;
}

auto s = foo();

We would have exactly the same optimized output, with the single addition of a mov BYTE PTR [rdi+16], 84 at the end, which changes the first character of the string to a capital T. That is, s_local and s will have the same storage location after optimization.
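One way to observe these effects without reading assembly is to count constructor calls. The sketch below uses a hypothetical helper type, Tracker, which is not part of the original answer and only wraps a std::string while counting copy and move constructions. MakeTemporary is an RVO candidate and MakeNamed is an NRVO candidate.

```cpp
#include <string>
#include <utility>

// Hypothetical helper type that counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    std::string data;

    explicit Tracker(std::string d) : data(std::move(d)) {}
    Tracker(const Tracker& other) : data(other.data) { ++copies; }
    Tracker(Tracker&& other) noexcept : data(std::move(other.data)) { ++moves; }
};

int Tracker::copies = 0;
int Tracker::moves = 0;

// RVO candidate: returns a temporary.
Tracker MakeTemporary() {
    return Tracker{"teste"};
}

// NRVO candidate: always returns the same named local variable.
Tracker MakeNamed() {
    Tracker local{"teste"};
    local.data[0] = 'T';
    return local;
}
```

After `Tracker a = MakeTemporary();` and `Tracker b = MakeNamed();`, Tracker::copies stays at 0 in both cases, since a returned local is at worst moved rather than copied. With GCC or Clang one would also expect Tracker::moves to stay at 0, although for the named case the standard only permits the elision and does not require it.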

There are some cases where NRVO cannot be applied easily. If we always return the same local variable, applying NRVO is trivial. If, on the other hand, the function can return different variables, we are in a difficult case for NRVO, and the optimization will probably not be performed. For example:

std::string foo(bool b)
{
    std::string s1 = "abc";
    std::string s2 = "def";
    return b ? s1 : s2;
}

Here, the compiler may even be able to apply NRVO (by writing the string "abc" or "def", depending on the value of b), but once the code gets more complex, the chances of NRVO being applied successfully decrease. In contrast, if every return statement returns the same variable, the function can be as complex as you like and applying NRVO remains trivial.
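The difference between return forms can be sketched with another counting type (Counted is a hypothetical helper, introduced here only for illustration). As far as the language rules go, a conditional expression over two named variables is an lvalue rather than the name of a local variable, so the returned object is copy-constructed unless the optimizer can prove the copy unobservable, as it may for a plain std::string. When each return statement names a local directly, the compiler may elide the copy or fall back to a move.

```cpp
#include <string>
#include <utility>

// Hypothetical helper type that counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    std::string data;

    explicit Counted(std::string d) : data(std::move(d)) {}
    Counted(const Counted& other) : data(other.data) { ++copies; }
    Counted(Counted&& other) noexcept : data(std::move(other.data)) { ++moves; }
};

int Counted::copies = 0;
int Counted::moves = 0;

// The conditional expression is an lvalue, not the name of a local
// variable, so the returned object is copy-constructed.
Counted PickWithTernary(bool b) {
    Counted s1{"abc"};
    Counted s2{"def"};
    return b ? s1 : s2;
}

// Each return statement names a local variable directly, so the compiler
// may elide the copy or, failing that, move instead of copying.
Counted PickWithBranches(bool b) {
    Counted s1{"abc"};
    Counted s2{"def"};
    if (b) return s1;
    return s2;
}
```

Calling PickWithTernary increments Counted::copies once, while PickWithBranches leaves it untouched. Whether the second form is fully elided or only moved depends on the compiler, so measuring remains the safest guide.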

Finally, here is the output that some compilers produce for your (slightly modified) function.

#include <string>
#include <algorithm>

std::string foo()
{
    std::string s = "teste";
    std::transform(begin(s), end(s), begin(s),
                   [](char c) { return c - 32; });
    return s;
}

With GCC 7.3 and optimization level 3:

foo[abi:cxx11]():
  lea rdx, [rdi+16] # Computes the start of the string, which already exists outside the function
  mov DWORD PTR [rdi+16], 1953719668 # Writes "teste"
  mov BYTE PTR [rdi+20], 101
  mov rax, rdi
  mov QWORD PTR [rdi+8], 5
  mov BYTE PTR [rdi+21], 0
  mov QWORD PTR [rdi], rdx
  sub BYTE PTR [rdi+16], 32 # Sequence of subtractions (to convert to uppercase)
  sub BYTE PTR [rdi+17], 32 # that was unrolled from `std::transform`
  sub BYTE PTR [rdi+18], 32
  sub BYTE PTR [rdi+19], 32
  sub BYTE PTR [rdi+20], 32
  ret

With Clang 6.0.0, level 3 optimization and also compiling with libstdc++:

foo[abi:cxx11](): # @foo[abi:cxx11]()
  lea rax, [rdi + 16]
  mov qword ptr [rdi], rax
  mov qword ptr [rdi + 8], 5
  mov dword ptr [rdi + 16], 1414743380 # Clang managed to remove the `std::transform`
  mov word ptr [rdi + 20], 69          # and wrote the string already in uppercase
  mov rax, rdi
  ret

You can experiment with compiler output on Compiler Explorer (Godbolt).
