Details
In Assembly, C, C++, C# with unsafe, and other languages, it is possible to reinterpret the binary data at an address as a type different from the original. For example, in C you can convert an int* to a float*: if the pointer refers to the integer value 0x3F800000, then the same bits read as floating point are 1.0f.
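For illustration, here is a minimal sketch of that bit pattern, assuming a 32-bit IEEE 754 float. BitsToFloat and CheckBitsToFloat are hypothetical names, not from the question, and the sketch copies the bytes with std::memcpy instead of casting the pointer so that the example itself stays well defined.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper, not from the question: copies the 32 bits of an
// integer into a float instead of reading them through a float*.
inline float BitsToFloat(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Usage: under the IEEE 754 assumption, 0x3F800000 encodes 1.0f.
inline void CheckBitsToFloat() {
    assert(BitsToFloat(0x3F800000u) == 1.0f);
}
```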
This enables algorithms that need fine-grained control over the bits, and the result of the reinterpretation may seem obvious, yet it is still considered U.B. (undefined behavior): there is no guarantee about what the compiler/interpreter will do with it.
If I'm not mistaken, converting pointers is almost always considered U.B., and I want to know why. Why does this give U.B.? For example, given the well-known format of a float, why doesn't reading an integer as a float give the expected result? What could turn out different?
Here Visual Studio optimizes so well that the disassembly even shows constants after conversions of known values. To my knowledge, the most the compiler can do is:
1. use any one of several possible encodings of the same value (like a float NaN, which has several bit patterns, so the compiler picks any of them), and
2. not preserve the order of reads and writes when optimizing code in more complicated situations (for example, arrays traversed by index instead of simple variables).
Beyond that, I don't know, which is why I don't understand.
It even seems to me that a compiler could avoid the second U.B. in an obvious way: by preserving the order of read and write instructions whenever it cannot guarantee that they access bytes at different addresses. In other words, if the programmer expects a given order, it should only change when there is absolute certainty that the result will be the same. Still, I think this problem has already happened to me when programming in VC++. So what is going on?
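The reordering concern is usually discussed under the name of type-based (strict) aliasing. As a hedged sketch with hypothetical names (WriteBoth, WellDefinedCall): because int and float are unrelated types, a C++ compiler is allowed to assume that an int* and a float* refer to different objects, so it does not have to reload *i after the store to *f.

```cpp
#include <cassert>

// Hypothetical function: writes through both pointers, then reads *i back.
// The compiler may assume i and f point to different objects (the types are
// unrelated), so it is allowed to return the constant 1 without reloading *i.
inline int WriteBoth(int* i, float* f) {
    *i = 1;
    *f = 0.0f;
    return *i;
}

// Well-defined use: the two pointers really refer to distinct objects.
inline int WellDefinedCall() {
    int a = 0;
    float b = 1.0f;
    int result = WriteBoth(&a, &b);
    assert(b == 0.0f);
    return result;
}
```

Passing a float* obtained by casting the same int* would be the undefined case: depending on the compiler and optimization level, the function might return 1 or 0, which is the kind of surprise described above.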
Questions
So the first question is: why is reinterpreting values in memory so broadly considered U.B.? The second question is: when we do need to do this, how do we ensure that it is not U.B. and that the result is reliably the same? Preferably, of course, without disabling optimizations.
To be clear, suppose I want two C++ functions that convert pointers (one for "read and write" and the other for "read only") in a generic way using templates, like this:
template< typename DstDataType , typename SrcDataType >
inline DstDataType* RemeanPtrAs( SrcDataType *srcPtr ){
    return (DstDataType*)srcPtr ;
}
template< typename DstDataType , typename SrcDataType >
inline const DstDataType* RemeanPtrAs( const SrcDataType *srcPtr ){
    return (const DstDataType*)srcPtr ;
}
Why is this U.B., and how do I write these functions with non-U.B. code so that they do exactly what is expected of them?
Edit: Is this code U.B.? If not, does it still allow the inline call to be optimized? Is this the solution? Does replacing typecasts with memcpy and memmove always avoid U.B.?
#include <string.h>
template< typename DstDataType , typename SrcDataType >
inline DstDataType* RemeanPtrAs( SrcDataType *srcPtr ){
    DstDataType* dstPtr ;
    memmove( &dstPtr , &srcPtr , sizeof(void*) ) ;
    return dstPtr ;
}
template< typename DstDataType , typename SrcDataType >
inline const DstDataType* RemeanPtrAs( const SrcDataType *srcPtr ){
    const DstDataType* dstPtr ;
    memmove( &dstPtr , &srcPtr , sizeof(const void*) ) ;
    return dstPtr ;
}
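For comparison, a hedged sketch of the approach usually recommended instead: copy the bytes of the object itself, not the bytes of the pointer. Copying the pointer with memmove produces the same address as the cast would, so reading or writing through it runs into the same aliasing rules. LoadAs, StoreAs and CheckLoadStore are hypothetical names, and the check assumes a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical "read only" counterpart: builds a Dst from the bytes at src.
template <typename Dst, typename Src>
inline Dst LoadAs(const Src* src) {
    static_assert(sizeof(Dst) == sizeof(Src), "sizes must match");
    static_assert(std::is_trivially_copyable<Dst>::value &&
                  std::is_trivially_copyable<Src>::value,
                  "both types must be trivially copyable");
    Dst dst;
    std::memcpy(&dst, src, sizeof(Dst));
    return dst;
}

// Hypothetical "write" counterpart: overwrites *dst with the bytes of value.
template <typename Dst, typename Src>
inline void StoreAs(Dst* dst, const Src& value) {
    static_assert(sizeof(Dst) == sizeof(Src), "sizes must match");
    static_assert(std::is_trivially_copyable<Dst>::value &&
                  std::is_trivially_copyable<Src>::value,
                  "both types must be trivially copyable");
    std::memcpy(dst, &value, sizeof(Dst));
}

// Usage, assuming IEEE 754: 1.0f is 0x3F800000 and 0x40000000 is 2.0f.
inline void CheckLoadStore() {
    float f = 1.0f;
    assert(LoadAs<std::uint32_t>(&f) == 0x3F800000u);
    StoreAs(&f, std::uint32_t{0x40000000u});
    assert(f == 2.0f);
}
```

Optimizing compilers commonly turn a fixed-size memcpy like this into a plain register move, so optimizations do not need to be disabled. In C++20, std::bit_cast from the header <bit> covers the value-returning case.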
aviana, note that the first function I wrote converts SrcDataType* (a "read and write" pointer to the "source data type") to DstDataType* (a "read and write" pointer to the "destination data type"); that is, it does not add a write restriction, it only applies the new data interpretation. – RHER WOLF
Likewise, note that the second one converts const SrcDataType* (a "read only" pointer to the "source data type") to const DstDataType* (a "read only" pointer to the "destination data type"); that is, it does not remove the write restriction, it only applies the new data interpretation. – RHER WOLF
And throughout, I treated the reinterpretation of the data as a matter of the data type's format, with no change to whether const is present. I mean, I worked hard to make it clear that I want to keep const as it is and change only the pointer type. – RHER WOLF
First of all, sorry for the mistake; I was tired and had not looked at it from this angle when I read it. The core of my answer doesn't change much. Casting types using C notation, as in
int var = (int) foo;
is not the recommended way in C++. The right way is through static_cast<T> and dynamic_cast<T>. Example:
template <typename IN, typename OUT>
OUT to_out(IN var) { return static_cast<OUT>(var); }
– aviana
There's no need for anything more complicated than that. A type T in a template keeps its const/volatile qualifiers automatically, so there is no need to define a different function for each possible combination. As I said before, unless there is a special reason or some behavior that must run before the cast, this kind of function is redundant. You can just use static_cast, dynamic_cast or reinterpret_cast directly. If the conversion is not possible, static_cast does not let your program compile and dynamic_cast, when casting pointers, returns nullptr. – aviana
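A small sketch of the behavior described in this comment, using hypothetical types (Animal, Dog, Cat) that are not part of the thread: dynamic_cast on a pointer yields nullptr when the object is not of the requested type, while an invalid static_cast is rejected at compile time.

```cpp
#include <cassert>

// Hypothetical polymorphic hierarchy used only for this illustration.
struct Animal { virtual ~Animal() = default; };
struct Dog : Animal {};
struct Cat : Animal {};

// dynamic_cast checks the dynamic type at run time and returns nullptr
// when the pointed-to object is not a Dog.
inline bool IsDog(Animal* animal) {
    return dynamic_cast<Dog*>(animal) != nullptr;
}

inline void CheckCasts() {
    Dog dog;
    Cat cat;
    assert(IsDog(&dog));
    assert(!IsDog(&cat));

    // static_cast performs conversions the type system knows about.
    double d = static_cast<double>(3);
    assert(d == 3.0);

    // An unrelated pointer conversion would not compile:
    // int n = 0;
    // float* p = static_cast<float*>(&n);  // error: invalid static_cast
}
```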
Does that mean none of those C++ _cast operators cause U.B.? – RHER WOLF
const_cast and reinterpret_cast are the most dangerous and you should use them carefully, but static_cast and dynamic_cast give you clear signals when something is wrong. – aviana
Hm, I know. I would like to know the "U.B. rules" about this, especially for reinterpret_cast, which is apparently what is used in this context. – RHER WOLF
https://en.cppreference.com/w/cpp/language/reinterpret_cast
– aviana
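The page linked above lists the type-aliasing exceptions. One of them is that any object may be inspected through a pointer to char, unsigned char, or std::byte. A minimal sketch of that permitted use follows; CountNonZeroBytes and its check are hypothetical names, and the float case assumes IEEE 754.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: walks the object representation byte by byte.
// Reading any object through unsigned char* is allowed by the aliasing rules.
template <typename T>
inline std::size_t CountNonZeroBytes(const T& object) {
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&object);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (bytes[i] != 0) {
            ++count;
        }
    }
    return count;
}

inline void CheckCountNonZeroBytes() {
    assert(CountNonZeroBytes(0u) == 0u);
    // Assuming IEEE 754, 1.0f is 0x3F800000: bytes 3F and 80 are non-zero,
    // whatever the byte order.
    assert(CountNonZeroBytes(1.0f) == 2u);
}
```

The reverse direction, reading an int object through a float*, is not among those exceptions, which is why the functions in the question are flagged as U.B.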