I set up Visual Studio to compile VC++ with /Ox and compiled this code (along with other code that I have omitted for simplicity).
union { unsigned long long u64 ; unsigned short u16[4] ; } x ;
union { unsigned u32 ; unsigned short u16[2] ; } i ;
i.u16[0] -= x.u16[3] ;
According to the disassembly, i was held entirely in ecx and x was in memory. I expected the compiler to generate assembly like this:
sub cx , word ptr [x+6]
But what it actually generated was this, with two extra preceding instructions that turn out to be unnecessary:
mov rax , qword ptr [x]
shr rax , 30h
sub cx , ax
In other words, it used up an extra register and still loaded more than it needed, so it had to shift the data afterwards. It is as if I had written i.u16[0] -= (unsigned short)(x.u64 >> 48)! In addition, the data in rax is never used again (it is overwritten), which makes both the shr and the mov wasteful.
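To make the comparison concrete, here is a minimal sketch of the two forms as standalone functions. The names sub_direct and sub_shifted are hypothetical and not part of my real code, and the sketch uses memcpy instead of the unions so it does not depend on union type punning. It assumes a little-endian target such as x64, where u16[3] sits at byte offset 6 and holds bits 48 to 63 of u64.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: what I expected, a 16-bit read at byte offset 6
// (the position of x.u16[3] on a little-endian machine).
std::uint16_t sub_direct(std::uint16_t lhs, std::uint64_t x) {
    std::uint16_t hi;
    std::memcpy(&hi, reinterpret_cast<const unsigned char*>(&x) + 6, sizeof hi);
    return static_cast<std::uint16_t>(lhs - hi);
}

// Hypothetical helper: what the generated code corresponds to,
// a full 64-bit load followed by a shift right by 48 (30h).
std::uint16_t sub_shifted(std::uint16_t lhs, std::uint64_t x) {
    return static_cast<std::uint16_t>(lhs - static_cast<std::uint16_t>(x >> 48));
}
```

Both functions should return the same value on a little-endian machine; they differ only in how much of x is loaded and whether a shift is needed.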
Why didn't it optimize further? Is there some additional setting needed to improve this code, when even a beginner can see where the operations could be reduced?
I am voting to close, since you did not specify the compiler version, and the question contradicts what the compiler currently does: https://godbolt.org/g/K388UN
– Mário Feroldi
Fine by me, close it; nobody is going to answer anyway, right? But two things are certain: (1) when you need the compiler version, you ask for it to be added instead of using its absence as one of two excuses to close someone else's question, and (2) the question does not contradict anything, because this is what really happens in VS with the settings I used. It does not matter that you saw something different with another program and other parameters; that does not make my question contradictory. Anyway, you can close it. But you can't keep hounding me over a disagreement in another thread, right? Frankly, man...
– RHER WOLF