It’s like the same memory address has two values... How is that possible?
The short answer is: it is not possible. It turns out that when you printed the value of the variable a, your program did not go back to memory to read the value stored there; it used the value directly.
This result is due to an optimization performed by the compiler. When you use const, there are optimization opportunities the compiler can exploit, because it knows that you will not change the value of this variable. But then you went in there and tricked the compiler.
Let’s Investigate
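I tried compiling with Clang and its sanitizer to detect undefined behaviour, but it reported no problems. Even so, that does not mean the behaviour is defined: the C++ standard says that modifying an object declared const is undefined behaviour, and the sanitizer simply did not catch this case.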
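For contrast, here is a minimal sketch of the case where casting away const is well defined, namely when the object itself was never declared const. The function name write_through_const_view is my own, purely for illustration; it is not part of the question's code.

```cpp
#include <cassert>

// Hypothetical helper (illustration only): the object b is NOT declared
// const, so writing to it through a pointer obtained with const_cast is
// well defined. Only the view onto it is const.
int write_through_const_view() {
    int b = 5;                         // ordinary, modifiable object
    const int *view = &b;              // read-only view of b
    int *p = const_cast<int *>(view);  // drop the const from the view
    *p = 9;                            // fine: b itself is not const
    assert(*view == 9);                // the view sees the new value
    return b;                          // 9
}
```

The difference from the question's code is only in the declaration: there a itself is const, so the same write is undefined behaviour and the compiler is free to assume it never happens.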
I tested compiling with g++ and with clang++, both with the options -g -O0, and the result was the same. That is, this optimization is so "basic" (why not do it?) that the compiler performs it even when you ask for as little optimization as possible with the option -O0.
Let’s look at the compiler-generated assembly to confirm. For this I used the code below, which avoids including the iostream library (it would add many more lines of assembly). Note that we are now interested in the return value of main.
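int main() {
    const int a = 5;
    int *p1 = (int *)&a;  // cast away const to get a writable pointer
    (*p1) = 9;            // write to the memory where a is stored
    return a;             // the exit status shows which value the compiler used
}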
With the const, this program has a return value of 5 (not what we wrote to memory), and without the const, 9 (as expected). Now let’s see the assembly (you can pass the option -S when compiling to make the compiler stop at the assembly step).
With the const, g++ generated the following assembly on my machine (compiled with g++ -O0 -S main.cpp):
.file "main.cpp"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movq %fs:40, %rax
movq %rax, -8(%rbp)
xorl %eax, %eax
movl $5, -20(%rbp)
leaq -20(%rbp), %rax
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
movl $9, (%rax)
movl $5, %eax
movq -8(%rbp), %rdx
xorq %fs:40, %rdx
je .L3
call __stack_chk_fail@PLT
.L3:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Arch Linux 9.3.0-1) 9.3.0"
.section .note.GNU-stack,"",@progbits
Now let’s remove the const and see the generated assembly:
.file "main.cpp"
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $32, %rsp
movq %fs:40, %rax
movq %rax, -8(%rbp)
xorl %eax, %eax
movl $5, -20(%rbp)
leaq -20(%rbp), %rax
movq %rax, -16(%rbp)
movq -16(%rbp), %rax
movl $9, (%rax)
movl -20(%rbp), %eax
movq -8(%rbp), %rdx
xorq %fs:40, %rdx
je .L3
call __stack_chk_fail@PLT
.L3:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Arch Linux 9.3.0-1) 9.3.0"
.section .note.GNU-stack,"",@progbits
Well, I do not know much assembly, but if we diff the two listings, the only difference is in one line (22). With the const, line 22 is
movl $5, %eax
and without the const, line 22 becomes
movl -20(%rbp), %eax
Did you see the difference? When the variable a is not const, its value is read from memory whenever we need it, like any other variable. The operand -20(%rbp) is where a has been stored: the -20 is an offset relative to rbp, the frame pointer, which tells us the variable lives on the stack. When a is const, the generated assembly instead contains the literal value that a was given when it was declared. This is one of the ways g++ treats constants: the compiler simply substitutes the literal value wherever the constant is used and never consults memory, which benefits efficiency.
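The reason the compiler is allowed to substitute the literal can be sketched in C++ itself: a const int initialized with a constant is usable as a compile-time constant, so its value is fixed before the program ever runs. This is only an illustration of the language rule, not of what g++ does internally, and the function name constant_is_known_at_compile_time is hypothetical.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (illustration only): a const int with a constant
// initializer can be used wherever the language requires a compile-time
// constant, so the compiler must already know its value.
int constant_is_known_at_compile_time() {
    const int a = 5;
    static_assert(a == 5, "a is usable in a constant expression");
    std::array<int, a> slots{};  // a used as a template argument
    assert(slots.size() == 5u);  // the array really has 5 elements
    return a;                    // 5
}
```

If a were a plain int, the static_assert and the template argument would not compile, because its value would only be known by reading memory at run time.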
For more information you can read this FAQ; the documentation is also worth checking, and that question on Stack Overflow has several interesting answers.
Interesting fact: I recently found that question on Stack Overflow about std::launder (which I did not know about). It gives you back a pointer to the object that lives at the address you pass in. I decided to test it with the code here and found that, in my test, *std::launder(&a) or simply *(&a) reads the value actually stored in the variable a, because you first take the address of a and then read the value stored at that address.
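As far as I understand it, the intended use of std::launder is a different, well-defined situation: reading an object that was created in storage previously occupied by another object. Below is a minimal sketch of that, assuming a C++17 compiler; the names Holder and read_after_reuse are my own and appear nowhere in the question.

```cpp
#include <cassert>
#include <new>

// Hypothetical type (illustration only): the const member is what makes
// the old pointer unusable after the storage is reused under C++17 rules.
struct Holder {
    const int n;
};

// Hypothetical helper: create a Holder, reuse its storage for a second
// Holder, then read the second one through the laundered old pointer.
int read_after_reuse() {
    alignas(Holder) unsigned char storage[sizeof(Holder)];
    Holder *first = new (storage) Holder{5};  // first object in the storage
    assert(first->n == 5);
    new (storage) Holder{9};                  // reuse the storage for a new object
    return std::launder(first)->n;            // reads the new object: 9
}
```

Note that this does not make the question's code legal. There the object a itself is const, and writing to it remains undefined behaviour no matter how the value is read back afterwards.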
You are modifying a constant by writing to the memory address reserved for it. I believe you are running into undefined behavior, since you "tricked" the compiler.
– Max Fratane
What you write is not always what the compiler emits, line by line; it performs several optimizations so that your code theoretically runs in the best possible way. In some compilers you can even disable these optimizations.
– Junior Nascimento