Union and type conversion


Searching about Unions I found the following information:

Unions are usually used for type conversions. For example, we can use a Union to write the binary representation of an integer in a disk file.

First we create a union between an integer and a 2-byte array:

union pw {
int i;
char ch[2];
};

Then, a function that prints the values:

putw(union pw word, FILE *fp)
{
    // Writes the first half
    putc(word->ch[0], fp);
    // Writes the second half
    putc(word->ch[1], fp);
}

I did not quite understand what the author meant by this example. Did he mean that a union lets us use, for example, an integer as a char? Are there other examples of type conversion using unions? Another question: judging by the function's parameters, it receives the union by value (by copy), since there is no pointer specifier (*) in the declaration, yet the function body uses the pointer operator (->).

  • If I were you I would not consult the source from which you obtained this information. Those who teach should at least know that an int usually occupies 4 bytes.

1 answer

What is a union?

A union works as a block of memory that can hold variables of different types, so that when a new value is assigned to any of its fields, the existing information is overwritten with the new value. That is, the fields share the same memory space, or part of it (when a field is smaller than the union's total size).

Thus, all the fields of a union start at the same memory address. That is why you can pick out pieces of the whole and write them to the file, as your example does.
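The two claims above (same start address, and a new assignment overwriting what was there) can be checked with a small sketch. The union Sample and both function names are hypothetical, introduced only for this illustration:

```c
#include <stdint.h>

/* Hypothetical union, used only for this illustration. */
union Sample {
    uint32_t whole;  /* 4 bytes */
    uint8_t  first;  /* 1 byte, overlapping the first byte of `whole` */
};

/* Returns 1 when both members start at the same address, 0 otherwise. */
int members_share_address(void)
{
    union Sample s;
    return (void *)&s.whole == (void *)&s.first;
}

/* Writes `first`, then assigns `whole`. The earlier value of `first` is
 * lost because its byte is overwritten. Returns `first` as read afterwards. */
uint8_t first_after_overwrite(uint8_t old_first, uint32_t new_whole)
{
    union Sample s;
    s.first = old_first;
    s.whole = new_whole;   /* overwrites every byte, including `first` */
    return s.first;
}
```

Calling first_after_overwrite(7, 0) should give 0 rather than 7, since assigning whole replaced the byte that first occupies.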

Unions vs Structs

I know this isn't part of your question, but for comparison: this behavior differs from that of structs, where each field has its own distinct memory address and its own memory space. In a struct, each field holds its own value, and assigning to one field does not change the value of another. In a union, all fields have the same start address.
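As a rough sketch of that difference, the code below measures where the second field sits in a struct and in a union with the same members. The type and function names are hypothetical, and the exact struct offset and size depend on the compiler's padding:

```c
#include <stddef.h>
#include <stdint.h>

/* Same two members, once as a struct and once as a union (hypothetical types). */
struct PairStruct { uint32_t a; uint16_t b; };
union  PairUnion  { uint32_t a; uint16_t b; };

/* Byte distance from the start of the struct to its member `b`. */
size_t struct_offset_of_b(void)
{
    struct PairStruct s;
    return (size_t)((char *)&s.b - (char *)&s);
}

/* Byte distance from the start of the union to its member `b`. */
size_t union_offset_of_b(void)
{
    union PairUnion u;
    return (size_t)((char *)&u.b - (char *)&u);
}

/* In a struct, assigning `a` leaves `b` untouched. */
uint16_t struct_b_after_writing_a(void)
{
    struct PairStruct s;
    s.b = 42;
    s.a = 0xFFFFFFFFu;
    return s.b;
}
```

In the struct, b starts after a, so its offset is at least sizeof(uint32_t). In the union the offset is 0, which is why the union can be smaller than the struct.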

About your example

Leaving the peculiarities aside for didactic reasons, an integer (int) in C/C++ usually occupies 4 bytes (32 bits) of memory (on a 32-bit processor, for instance). So in your example it is not actually the 'first half' that is written first, but the first byte of the int, because the type char occupies 1 byte. The example therefore writes only the first two bytes of the int.
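One possible way to write the whole integer, whatever its size, is to size the array with sizeof(int) and loop over it. This is only a sketch: the name put_word is hypothetical (chosen to avoid clashing with the legacy putw that some C libraries declare in stdio.h), and the union is passed by value, so members are accessed with the . operator:

```c
#include <stdio.h>

/* Same idea as the union in the question, but the array covers the whole int. */
union pw {
    int i;
    unsigned char ch[sizeof(int)];
};

/* Writes every byte of word.i to fp, in memory order.
 * Returns the number of bytes written, or -1 if putc fails. */
int put_word(union pw word, FILE *fp)
{
    size_t k;

    for (k = 0; k < sizeof word.ch; k++) {
        if (putc(word.ch[k], fp) == EOF)
            return -1;
    }
    return (int)sizeof word.ch;
}
```

The bytes land in the file in the machine's own byte order, so a file written this way is only guaranteed to read back correctly on a machine with the same int size and endianness.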

Simple example of Union

Here you can see how unions can be used to 'pick out pieces' of a whole or of another type of information:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// The size of a union in memory is the size of its largest field
union MinhaUnion {
    uint32_t x; // occupies 4 bytes of memory
    uint16_t y; // occupies 2 bytes of memory
    uint8_t z; // occupies 1 byte of memory
}; // total size the union occupies in memory is 4 bytes.

int main() {
    union MinhaUnion u;

    // stores the integer in the union's 4 bytes
    u.x = 123456789;

    // reads the whole integer, then the integer value of the first two bytes, then the integer value of the first byte only
    printf("X = %u, Y = %u, Z = %u\n", (unsigned)u.x, (unsigned)u.y, (unsigned)u.z);

    return 0;
}

About the operators -> and .

The operator -> is used with pointers, that is, variables that hold memory addresses. When a variable is passed by pointer (declaring the parameter with *, as in int *i for example), its members are accessed inside the function with the -> operator. If the variable is passed by value, the . operator must be used to access the members of the struct or union (or the methods of an object, if you are using C++). Modifying the example above a little, we now have:

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

// The size of a union in memory is the size of its largest field
union MinhaUnion {
    uint32_t x; // occupies 4 bytes of memory
    uint16_t y; // occupies 2 bytes of memory
    uint8_t z; // occupies 1 byte of memory
}; // total size the union occupies in memory is 4 bytes.

int imprimeUnion(union MinhaUnion un) {
    FILE *fp = NULL;

    fp = fopen("output.txt", "w");

    if (fp != NULL) {
        fprintf(fp, "X = %d, Y = %d, Z = %d\n", un->x, un->y, un->z);
        fclose(fp);
        return 1;
    }

    return 0;
}

int main() {
    union MinhaUnion u;

    // stores the integer in the union's 4 bytes
    u.x = 123456789;

    // passes the union by value and prints its fields to the file
    imprimeUnion(u);

    return 0;
}

See what the compiler says about the code above:

union_vs_struct.cpp: In function ‘int imprimeUnion(MinhaUnion)’:
union_vs_struct.cpp:18:51: error: base operand of ‘->’ has non-pointer type ‘MinhaUnion’
     fprintf(fp, "X = %d, Y = %d, Z = %d\n", un->x, un->y, un->z);
                                               ^
union_vs_struct.cpp:18:58: error: base operand of ‘->’ has non-pointer type ‘MinhaUnion’
     fprintf(fp, "X = %d, Y = %d, Z = %d\n", un->x, un->y, un->z);
                                                      ^
union_vs_struct.cpp:18:65: error: base operand of ‘->’ has non-pointer type ‘MinhaUnion’
     fprintf(fp, "X = %d, Y = %d, Z = %d\n", un->x, un->y, un->z);

The compiler is saying that the operand of -> (un, the variable that was passed by value) is not a pointer. If you replace -> with ., or change the parameter declaration so that the variable is passed by pointer (and pass its address at the call site), the code compiles. Which fix to choose depends on what you intend to do.
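The two fixes could look roughly like this. The function names print_by_value and print_by_pointer are hypothetical, and the output stream is taken as a parameter here only to keep the sketch short:

```c
#include <stdio.h>
#include <stdint.h>

union MinhaUnion {
    uint32_t x;
    uint16_t y;
    uint8_t z;
};

/* Fix 1: keep the parameter by value and use the . operator. */
int print_by_value(union MinhaUnion un, FILE *fp)
{
    return fprintf(fp, "X = %u\n", (unsigned)un.x);
}

/* Fix 2: declare the parameter as a pointer and keep the -> operator. */
int print_by_pointer(const union MinhaUnion *un, FILE *fp)
{
    return fprintf(fp, "X = %u\n", (unsigned)un->x);
}
```

With a variable declared as union MinhaUnion u, the calls would be print_by_value(u, stdout) and print_by_pointer(&u, stdout). The pointer version avoids copying the union and would let the function modify the caller's variable if the const were dropped.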

About that 'conversion'

Finally, what happens is not really a "conversion" between types. As stated above, because all members share the same starting memory address, you can pick out 'pieces' of the union's total memory space and manipulate them through another field (variable). So if you have the following two-field union:

union UNION {
    int num;
    char vetor[4];
};

you can treat the second byte of the number as a char when referencing vetor[1] since vetor and num reference the same initial memory address.
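A short sketch of that idea follows. The names IntBytes, byte_of_int and is_little_endian are hypothetical. It assumes a C compiler (reading a union member other than the one last written is allowed in C, but is not guaranteed by the C++ standard), and which byte of the number ends up in vetor[1] depends on the machine's byte order:

```c
#include <stddef.h>

/* Two-field union: an int and an array covering each of its bytes. */
union IntBytes {
    int num;
    unsigned char vetor[sizeof(int)];
};

/* Stores `value` in num and returns the byte at position `index`
 * (in memory order) through the vetor member. */
unsigned char byte_of_int(int value, size_t index)
{
    union IntBytes u;
    u.num = value;
    return u.vetor[index];
}

/* Returns 1 when the least significant byte is stored first. */
int is_little_endian(void)
{
    return byte_of_int(1, 0) == 1;
}
```

For the value 0x0A0B0C0D with a 4-byte int, byte_of_int(value, 1) would be 0x0C on a little-endian machine and 0x0B on a big-endian one. No bits are converted; the same memory is simply read through a different member.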
