List<> Best practice, start with fixed capacity or start without limit?

Asked

Viewed 221 times

4

I have a scenario where I will receive a list, or an array, or any other type of data from the database where I can know the size of my list before creating it, what is the advantage between the newList and the newList2 in my code below ? Is there any performance benefit in this case? The guys (int and object) are just to illustrate the example.

public Task TestMethod(List<int> amounts)
{
    List<object> newList = new List<object>(amounts.Count);
    List<object> newList2 = new List<object>();
}
  • If performance is a concern then create a List indicating your ability.

  • Is the difference significant? That’s my question, if it really makes up for.

  • If you know the ability why not use it? You won’t have any more work if you do, so it pays off, whether it’s a significant difference or not.

3 answers

10


I did a test with both cases and the performance was irrelevant, with small values, but the memory lease is something that should be taken into account.

When you create the list with Capacity 0 (zero), o . NET will dynamically increase and allocate space as new items are added to the list, always doubling the previous capacity, i.e., in the power of 2. In practical terms, if your list has 2 elements and its capacity is 2, By adding one more element it doubles the capacity, allocating space for 4 elements. So far so good, but since it is exponential, when you have 64 elements and add one more, you will be allocated a capacity of 128, and so on. If you have 8192 elements and add another, the capacity goes to 16384 and so on.

With this we can conclude that, from the point of view of space and memory allocation, it is more advantageous to define the capacity of the list if we know the size and this is large. How big is it? You would have to analyze the type of object that goes in the list to calculate the allocated memory, but something with more than 1000 items would be interesting to define the capacity.

Here a code that demonstrates this:

System.Collections.Generic.List<int> lista = new System.Collections.Generic.List<int>();
Random rand = new Random();
int i =0;
while (i < 1000)
{
    lista.Add(rand.Next(0, 20000));
    i++;
    Console.Write("\nTamanho: " + lista.Count.ToString());
    Console.Write("  Capidade: " + lista.Capacity.ToString());
}

Can be executed here: https://dotnetfiddle.net/5Kw99i

Output example:

Size: 31 Grass: 32
Size: 32 Grass: 32
Size: 33 Grass: 64
Size: 34 Body size: 64

Note that the capacity doubles, and then the space allocated in memory also.

EDIT: I ran the same code on dotnetfiddle with 50,000, 500,000 and 1,000,000 items, without initializing the capacity and initiating. Below the results demonstrate the difference in memory consumption. Again, the performance from the time/cpu point of view was irrelevant:

+-----------+------------------+------------------+
| Itens     | Sem inicializar  | Inicializando    |
+-----------+------------------+------------------+
| 50.000    | Memory: 512.23kb | Memory: 195.34kb |
+-----------+------------------+------------------+
| 500.000   | Memory: 4.01Mb   | Memory: 1.91Mb   |
+-----------+------------------+------------------+
| 1.000.000 | Memory: 8.01Mb   | Memory: 3.81Mb   |
+-----------+------------------+------------------+
  • Ricardo, perfect example, was really where I wanted to go, the application in which I work any minimal detail of performance is very important. We benchmark for everything and I came across a change where they had changed using the limit.

6

The advantage of using the list with a defined capability is that there will be no future relocation of the list items. When you use a list without setting the size, it starts with zero capacity and grows as you need. The capacity will always be greater or equal the number of elements in the list.

In short, you earn in performance using a set capacity list because internally C# will not need to reallocate existing elements to add a new one.

1

If nothing is specified it starts with 0 capacity and jumps to 4 if you add an item. Then it doubles of capacity whenever it occupies all the free spaces. At least it is the current implementation, nothing guarantees that this will always be so.

If you already know how many elements the list will have, or at least something approximate, there will be gain in defining the initial ability to avoid memory relocation. If you have the information there is zero advantage in not putting up the initial capacity, even if the gain is not so great, and it is not so small. Without indicating the capacity is about 40-50% slower, this in an operating system that does optimizations in relocations, if you do not have this, can become a tragedy.

In addition to having to relocate memory every time it increases size puts pressure on Garbage Collector, what will trigger it more often and with larger pauses. In general the tests that people do ignore this. We should always avoid allocation, even small ones. And many allocations are not easily visible, which is the case of a growing list.

It is preferable to allocate more elements than it will use than to stay to do memory relocation. Of course it always has a point that may not compensate.

Browser other questions tagged

You are not signed in. Login or sign up in order to post.