3
I was wondering how to answer this issue and I came to the conclusion that I would need a data structure with:
- the size of a set
- a previously reported size vector
It would be something like the structure:
struct conjunto {
int tamanho;
int *conteudo;
};
In my first naive idea, I would allocate one memory region to the structure and then another to its content:
struct conjunto* aloca_conjunto(int n) {
struct conjunto *set = malloc(sizeof(struct conjunto));
set.conteudo = n? malloc(sizeof(int)*n): NULL;
set.tamanho = n;
return set;
}
However, this double allocation bothers me for two reasons:
- memory fragmentation
- possible loss of reference location
So my main question is
- What would be the alternative to implement in a single allocation?
I know that even if, hypothetically, I could make this unique allocation, access the element of
n
-nth position of the set would involve two dereferences:set->conteudo[n]
, but at least the pointerset->conteudo
would be close to the desired memory region, maximizing the chance of a hit cache due to space locality.
A point of extra curiosity, to deepen the question:
- And if I were to place several of these elements in a contiguous memory region (in place of a vector with the pointers), how would I rescue the
n
-nth element of the typestruct conjunto
of that memory region?
If you know how many elements the content should have at the moment it allocates the set manages to do everything at once by playing a little with memory. For a set with 5 integers would do so:
conjunto = malloc(sizeof(int) * (5 + 1)); conjunto->conteudo = (&conjunto->tamanho) + 1;
which assumes that the layout of the memory for the structure places the integer before the pointer, and without any padding– Isac
@Isac I thought of something like this, but I get the impression that
conjunto->tamanho
andconjunto->tamanho[0]
share bytes, then a change inconjunto->tamanho[0]
wouldconjunto->tamanho
point to another point. So I evolved my thinking toconjunto->conteudo = conjunto +1
, which would point to the position just after[tamanho][*conteudo][bytes do vetor de conteudo]
. And, yes, in this case it is possible to state that I will create the object with a known size. Anyway, relocations are also possible.– Jefferson Quesado
@Isac , I tried to explain the point that I think most important (that refers to your comment) of what is an additional curiosity
– Jefferson Quesado
@Isac now that you mentioned "creative solution," I thought I’d create an index vector to do a less indirect access, and that index vector would precede the memory region with the structures, but I’m short of a C compiler to validate that. My original thought was exactly what you commented
– Jefferson Quesado
Yeah, I rethought my last statement a little bit, because although it’s true and you can go through an array of these elements, the creation of the array has to be done in an artistic way, because the structure for normal types results in 8 bytes, but each element would have more contiguous size in memory, so the allocation could not be direct, and the same to put the elements there. In the end, I think it doesn’t compensate for the crazy things that need to be done, to avoid some allocations :D
– Isac
@Isac , already have an answer? I will explore this question during the holiday
– Jefferson Quesado
I thought my answer would not be particularly interesting and I chose not to put it. But I could build with the
realloc
from start to finish and always relocating to double, but I still think that what is gained in fewer allocations is lost in access time, because in a vector of this genus will not be able to directly access an element without going through the previous ones, since each one has different size. For this reason the location where the nth element in the memory is still is not deterministic and requires O(N) to access.– Isac