How to allocate in contiguous memory a structure that contains 1 vector with user defined size?

Asked

Viewed 111 times

3

I was wondering how to answer this issue and I came to the conclusion that I would need a data structure with:

  • the size of a set
  • a previously reported size vector

It would be something like the structure:

struct conjunto {
  int tamanho;
  int *conteudo;
};

In my first naive idea, I would allocate one memory region to the structure and then another to its content:

struct conjunto* aloca_conjunto(int n) {
  struct conjunto *set = malloc(sizeof(struct conjunto));
  set.conteudo = n? malloc(sizeof(int)*n): NULL;
  set.tamanho = n;

  return set;
}

However, this double allocation bothers me for two reasons:

  1. memory fragmentation
  2. possible loss of reference location

So my main question is

  • What would be the alternative to implement in a single allocation?

I know that even if, hypothetically, I could make this unique allocation, access the element of n-nth position of the set would involve two dereferences: set->conteudo[n], but at least the pointer set->conteudo would be close to the desired memory region, maximizing the chance of a hit cache due to space locality.

A point of extra curiosity, to deepen the question:

  • And if I were to place several of these elements in a contiguous memory region (in place of a vector with the pointers), how would I rescue the n-nth element of the type struct conjunto of that memory region?
  • 1

    If you know how many elements the content should have at the moment it allocates the set manages to do everything at once by playing a little with memory. For a set with 5 integers would do so: conjunto = malloc(sizeof(int) * (5 + 1)); conjunto->conteudo = (&conjunto->tamanho) + 1; which assumes that the layout of the memory for the structure places the integer before the pointer, and without any padding

  • @Isac I thought of something like this, but I get the impression that conjunto->tamanho and conjunto->tamanho[0] share bytes, then a change in conjunto->tamanho[0] would conjunto->tamanho point to another point. So I evolved my thinking to conjunto->conteudo = conjunto +1, which would point to the position just after [tamanho][*conteudo][bytes do vetor de conteudo]. And, yes, in this case it is possible to state that I will create the object with a known size. Anyway, relocations are also possible.

  • @Isac , I tried to explain the point that I think most important (that refers to your comment) of what is an additional curiosity

  • @Isac now that you mentioned "creative solution," I thought I’d create an index vector to do a less indirect access, and that index vector would precede the memory region with the structures, but I’m short of a C compiler to validate that. My original thought was exactly what you commented

  • 1

    Yeah, I rethought my last statement a little bit, because although it’s true and you can go through an array of these elements, the creation of the array has to be done in an artistic way, because the structure for normal types results in 8 bytes, but each element would have more contiguous size in memory, so the allocation could not be direct, and the same to put the elements there. In the end, I think it doesn’t compensate for the crazy things that need to be done, to avoid some allocations :D

  • @Isac , already have an answer? I will explore this question during the holiday

  • I thought my answer would not be particularly interesting and I chose not to put it. But I could build with the realloc from start to finish and always relocating to double, but I still think that what is gained in fewer allocations is lost in access time, because in a vector of this genus will not be able to directly access an element without going through the previous ones, since each one has different size. For this reason the location where the nth element in the memory is still is not deterministic and requires O(N) to access.

Show 2 more comments

1 answer

1


It’s pretty simple to do if you’re using a compiler compliant with at least the C99 standard, which in practice is almost any situation. Use a VLA (Variable Array Length). You just have to take care to allocate memory for the structure and the data together. I believe this meets the requirements of the question.

#include <stdio.h>
#include <stdlib.h>

typedef struct {
    size_t tamanho;
    int conteudo[];
} Conjunto;

int main(void) {
    Conjunto *dados = malloc(sizeof(Conjunto) + 10 * sizeof(int));
    dados->tamanho = 10;
    for (int i = 0; i < dados->tamanho; i++) dados->conteudo[i] = i;
    for (int i = 0; i < dados->tamanho; i++) printf("%d, ", dados->conteudo[i]);
}

Behold working in the ideone. And in the repl it.. Also put on the Github for future reference.

If you need to grow up just make one realloc() updating the size with the part that calculates the space for the elements of the array.

If this is not possible by using a non-compliant compiler then for this case the solution to use only one array can be a good one leaving the position 0 reserved for the size. It is not perfect, but should work well in almost any real case, just need to be some care.

The idea of auxiliary indices seems very bad and even worse than fragmentation.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.