Why does .NET's Garbage Collector have several generations?

I understand why there is a Garbage Collector, but I wanted to know the advantages of giving it several generations. It seems much more complicated, so it must be very advantageous.

2 answers

The GC is much older than people realize. It is a mechanism that has been studied in depth by many people, who kept identifying its problems and finding solutions.

Problems of the GC

Pause

One of the problems this mechanism has always had is the pause while collecting. It can be long and has no upper time bound, which makes many people avoid a GC. This is above all a problem of tracing collectors, although other techniques suffer from it too, along with other problems that tracing does not have. I will not go into detail about the problems of other techniques, since .NET uses a tracing garbage collector.

Some techniques have been devised to reduce pauses. One of them is incremental collection, where a maximum time for each collection can be set. It helps, but it has a problem: it is common to have a lot of rework for leaving the job half done. There are ways to minimize this, but it is not the most suitable solution for many cases. A better solution was needed to reduce the pauses.

Memory fragmentation

There is also a problem in memory management that holds for any application, no matter how it manages memory. As the application allocates and releases memory, it creates holes in the heap. This has two consequences: memory is wasted because many of these holes cannot be reused, and when they are reused, objects that are used together end up allocated far apart from each other, creating a locality-of-reference problem that can greatly degrade the performance of accessing those objects.

The solution to both problems was the generational garbage collector.

Other problems

There are other problems that the generational mechanism does not solve. One of them is deterministic collection. There is a solution for that: the using statement. It is possible to build a deterministic collector, but it is too complicated and has disadvantages.
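A minimal sketch of deterministic cleanup with using (the file name is just illustrative):

    using System.IO;

    class Demo
    {
        static void Main()
        {
            // "log.txt" is just an illustrative file name.
            using (var writer = new StreamWriter("log.txt"))
            {
                writer.WriteLine("released deterministically");
            } // Dispose() runs right here, closing the file immediately,
              // no matter when the GC eventually collects the object itself.
        }
    }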

Generations

Gen0

In a generational GC there is a first generation where small objects are allocated when they are created.

This generation is usually small. .NET starts it at 256 KB (if not changed). This size adapts over time according to the needs of the application. It may also vary depending on whether the architecture is 32- or 64-bit, which type of GC is enabled, and even the number of cores on the machine.

There are actually several arenas of this size in Generation 0, one for each core of the machine. This way allocations never happen concurrently and need no locks, as is common for heap allocations. It is just a matter of incrementing the pointer that marks the end of the used portion of that preallocated memory area.

This is a huge performance gain. The cost of allocating on the GC heap in .NET is about the same as allocating on the stack, which is very fast. This is fantastic.
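Just to illustrate the idea, here is a toy sketch of bump-pointer allocation (this is not the CLR's actual allocator, only an illustration of why it is so cheap):

    // Illustrative only: a toy bump-pointer allocator over a preallocated buffer.
    class BumpArena
    {
        private readonly byte[] _arena = new byte[256 * 1024]; // preallocated area
        private int _offset;                                   // the "allocation pointer"

        // Returns the start offset of the new block, or -1 when the arena is full
        // (the point at which a real GC would trigger a Gen0 collection).
        public int Allocate(int size)
        {
            if (_offset + size > _arena.Length)
                return -1;           // arena full
            int start = _offset;
            _offset += size;         // allocating is just bumping the pointer
            return start;
        }
    }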

When a new allocation would overflow an arena, the application is frozen and a collection starts. This collection is very short because there is little to collect, so the pause should be no longer than a few microseconds.

Actually, it is not really a collection that happens. What actually happens is that the objects still referenced by the application are copied to the next generation, Gen1. In general very little is copied, because most objects die young. This is called compaction.

This is great because everything ends up allocated contiguously and in sequence, avoiding memory fragmentation. Isn't that great? It solves both problems.

It is very common for semi-automatic techniques such as reference counting to perform worse than this form of GC while only Gen0 is being collected. It can even be faster than manual memory management when you account for every operation required and all the losses caused by fragmentation.

Note that a collection may never even be needed, which is a huge gain. Even when collecting, there is no cost to release memory; there is only the cost of moving the objects still alive. The release happens for an entire arena at once.
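You can watch that promotion with GC.GetGeneration; the exact output depends on the runtime and GC mode, so take it only as a rough demonstration:

    using System;

    class Demo
    {
        static void Main()
        {
            var obj = new object();
            Console.WriteLine(GC.GetGeneration(obj)); // typically 0: freshly allocated

            GC.Collect(0);                            // force a Gen0 collection
            Console.WriteLine(GC.GetGeneration(obj)); // typically 1: it survived and was copied

            GC.Collect();                             // full collection
            Console.WriteLine(GC.GetGeneration(obj)); // typically 2: survived again
        }
    }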

Gen1

This is a slightly larger area (it starts at 2 MB, but that depends on circumstances) and it only receives objects through Gen0 collections. Everything is allocated in sequence, without holes. Only objects with a certain lifespan get here, so it takes a while to fill.

When it fills up, a collection is triggered that copies the surviving objects to Gen2. The pause for this collection is usually no longer than 1 millisecond.

This collection occurs rarely, since few objects survive that long.

In some cases an application with a GC performs better, even when collecting Gen1, than an application that does everything with reference counting, or even with manual management that is not extremely optimized.
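You can get a feel for how rare each level of collection is with GC.CollectionCount (the exact numbers vary per application and GC mode; this is only a way to inspect them):

    using System;

    class Demo
    {
        static void Main()
        {
            // Allocate a lot of short-lived garbage.
            for (int i = 0; i < 1_000_000; i++)
            {
                _ = new byte[128];
            }

            // In a typical run Gen0 has collected many times,
            // Gen1 a few times and Gen2 rarely, if at all.
            Console.WriteLine($"Gen0: {GC.CollectionCount(0)}");
            Console.WriteLine($"Gen1: {GC.CollectionCount(1)}");
            Console.WriteLine($"Gen2: {GC.CollectionCount(2)}");
        }
    }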

Gen2

This is the last generation; it receives objects that tend to survive for a long time, or even for the whole lifetime of the application. Ideally this generation has few collections, or none at all. It has no size limit, and its pause can take not just many milliseconds but even several seconds (rarely; I explain below).

In practice it is split into segments (16 or 256 MB) so that a single collection can have a certain limit. I don't know whether it still works like this at the moment, but it is possible that only the segment that filled up is collected.

The younger generations usually stay within one of these segments.

The collection can be done concurrently, which lets the application keep working without a pause. There actually is a pause to check which objects are still referenced, but it is very short. This is possible because the application cannot allocate in that area directly.

(illustration: concurrent GC)

This collection moves objects within the Gen2 area itself, and in some cases this can leave minimal fragmentation. Nothing critical.

This is where performance can become a problem with a GC, but at least the pause can be minimized well if the machine has more than one core. Even with a single core it is possible to improve the experience.

It very rarely needs to be done. As long as there is spare memory there is no need to collect (depending on configuration).
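Some of these behaviors can be inspected and tuned from code through GCSettings (just a sketch; the concurrent and server modes themselves are normally chosen in project or runtime configuration):

    using System;
    using System.Runtime;

    class Demo
    {
        static void Main()
        {
            // Reports whether the server GC (instead of the workstation GC) is in use.
            Console.WriteLine($"Server GC: {GCSettings.IsServerGC}");

            // Asks the GC to favor short pauses for the work that follows.
            GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;
            Console.WriteLine($"Latency mode: {GCSettings.LatencyMode}");
        }
    }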

If you have enough memory, you can even get more performance with a GC than without one.

LOH

Objects over 85,000 bytes go to another area called the Large Object Heap. This area has a normal collection that frees the memory of large objects no longer referenced; it happens together with the Gen2 collection. In general it causes fragmentation (there is a configuration that avoids that, shown in the sketch below, but it has drawbacks).

It is almost impossible for a normal object to end up in this area. Mostly it is arrays (even ones held inside some type) that go there; it only depends on the size.

This is useful because moving large objects can be very costly, something that contributed a lot, for a long time, to Java's reputation for being slow.

It is organized in segments.
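A small sketch tying these points together: a large array is reported as generation 2, and the LOH compaction mentioned above is opt-in:

    using System;
    using System.Runtime;

    class Demo
    {
        static void Main()
        {
            var big = new byte[100_000];              // > 85,000 bytes: allocated on the LOH
            Console.WriteLine(GC.GetGeneration(big)); // LOH objects are reported as generation 2

            // Opt in to compacting the LOH on the next blocking full collection.
            GCSettings.LargeObjectHeapCompactionMode =
                GCLargeObjectHeapCompactionMode.CompactOnce;
            GC.Collect();
        }
    }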

POH

In .NET 5 there is also the Pinned Object Heap, which stores all objects that are pinned. Since these objects cannot be moved, segregating them decreases the fragmentation of the rest of the heap.
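Allocating directly on the POH is explicit. A minimal example (the API exists since .NET 5):

    using System;

    class Demo
    {
        static void Main()
        {
            // Allocated directly on the Pinned Object Heap: the GC never moves it,
            // so its address can be handed to native code without pinning it by hand.
            byte[] buffer = GC.AllocateArray<byte>(4096, pinned: true);
            Console.WriteLine(buffer.Length);
        }
    }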

Conclusion

Because of all this, we say that objects in .NET should die young or live forever, so they do not need to be collected.

Generations solve the problems of pauses and memory fragmentation, and make memory management perform better.

You cannot make a 100% real-time application with a GC (although GCs like that do exist: paper), but you can make games and smooth GUIs that do not freeze.

If you still need to reduce pauses further, you have to start using specific allocation techniques, such as object pools or heavy use of structs in place of classes, a technique made easier in C# 7.
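For instance, ArrayPool<T> (in System.Buffers) is a ready-made object pool for buffers; a minimal sketch:

    using System.Buffers;

    class Demo
    {
        static void Main()
        {
            // Rent a buffer from the shared pool instead of allocating a new array.
            byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);
            try
            {
                // ... use the buffer (it may be larger than requested) ...
            }
            finally
            {
                // Return it so the next Rent can reuse it, producing no new garbage.
                ArrayPool<byte>.Shared.Return(buffer);
            }
        }
    }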

That’s roughly it. There are a number of details that I don’t think serve the purpose of this question.

If you want more details, there is a very good article.

  • I was reading this post and I ended up creating a question related to memory consumption. https://answall.com/questions/200836/consumo-excessivo-mem%C3%B3ria-ou-memory-Leak-na-aplica%C3%A7%C3%A3o
