How does the G1 (Garbage First Collector) work?

Viewed 1,301 times

39

JEP 248 discusses making G1 (Garbage First Collector) the default garbage collector in Java 9. Since then I have heard G1 mentioned a lot, but very little material goes into deeper detail or compares it with, for example, CMS (Concurrent Mark and Sweep), which today is the usual GC choice for production applications with minimum performance requirements.

Since the automatic memory management provided by the JVM is one of the crucial points of the platform, I would like a deeper view of why this new garbage collector is being adopted (pros and cons), along with a comparison against the existing implementations.

  • 2

    I will follow this question +1

  • 1

    Is it even possible to answer this in the case of a closed source product? The trick of G1 seems to be the subdivision of memory into blocks and the prioritization of doing GC on near-empty blocks, including moving objects out of a block in order to free it. This avoids a typical problem of systems with GC: using a truckload of memory with objects occupying it very sparsely. For details of how this is actually implemented, one would have to look at the code, and G1 will certainly perform worse on specific loads (although it should be better in the average case, otherwise they would not use it).

  • I still think there is reasonable room for improvement in the answers. I have a reasonable knowledge of G1, which has been implemented and in use since 2009. I will see whether I can complement the answer before marking it as accepted. Thank you for the bounty, @Ciganomorrisonmendez.

  • Regis, go for it. There is one hour left before the bounty ends.

  • 1

    I don’t have an answer, but that reference (in English) seems to describe the process fairly well. If I had seen the question earlier, I could have risked a translation, but at first sight I believe that Daniel's answer touches on all the important points, although without going into too much detail (as I understand it, Java's GC has been generational for a long time, so that is not the news; what improved, it seems to me, was the implementation itself).

2 answers

22


Classic garbage collectors work more or less as follows:

  1. They pause the application execution;
  2. They scan all application memory to identify which objects can no longer be reached, and free them from memory;
  3. They resume the execution of the application.

This pause is a problem for large applications that need to be highly responsive, such as Facebook, because the more memory the application uses, the longer the pause and the less responsive the application.
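
The three steps above can be sketched as a toy stop-the-world mark-and-sweep pass. This is only an illustration under simplifying assumptions: `Obj`, `heap` and `roots` are hypothetical names invented here, not JVM internals, and the "pause" is simply the fact that nothing else runs while `collect` executes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of a stop-the-world mark-and-sweep collector.
// Obj, heap and roots are hypothetical names, not JVM internals.
class MarkSweepSketch {

    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Returns the names of the objects that were freed, in heap order.
    static List<String> collect(List<Obj> heap, List<Obj> roots) {
        // Step 1: the application is assumed to be paused for this whole method.

        // Step 2a (mark): everything reachable from the roots is live.
        Set<Obj> live = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (live.add(o)) {          // first visit only
                pending.addAll(o.refs);
            }
        }

        // Step 2b (sweep): everything that was not marked is garbage.
        List<String> freed = new ArrayList<>();
        heap.removeIf(o -> {
            if (live.contains(o)) {
                return false;
            }
            freed.add(o.name);
            return true;
        });

        // Step 3: the application would resume here.
        return freed;
    }

    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);   // a -> b, and a is a root
        c.refs.add(d);   // c and d only reference each other
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return collect(heap, Arrays.asList(a));
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo());
    }
}
```

Running it prints `freed: [c, d]`: the cycle between `c` and `d` is collected because neither is reachable from the root. The whole heap is traversed inside a single pause, which is why the pause grows with the amount of memory in use.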

The collector most often chosen today when low pauses matter, Concurrent Mark and Sweep (CMS), performs most of the marking and the memory release concurrently with the execution of the application (hence the name) to try to reduce pause time. It still pauses the application briefly to mark and again to re-mark, so it reduces the problem but does not solve it.
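
To check which collector a given JVM is actually running, and how much collection time it reports, the standard `java.lang.management` API can be queried. The sketch below assumes a HotSpot JVM; the class name `GcInfoSketch` and the allocation loop are made up for illustration. The collector names printed depend on the JVM version and on flags such as `-XX:+UseConcMarkSweepGC` or `-XX:+UseG1GC`, and the reported time is the accumulated collection time as defined by the JVM, which is not necessarily the same as time spent paused.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors of the running JVM with their counters.
class GcInfoSketch {

    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": "
                    + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms total");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived objects so that at least some collections happen.
        long checksum = 0;
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];
            checksum += junk.length;
        }
        System.out.println("allocated bytes: " + checksum);
        describeCollectors().forEach(System.out::println);
    }
}
```

Running the same program once per collector flag gives a rough first comparison, although a realistic workload is needed before drawing conclusions.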

Garbage First Collector (G1) tackles this problem using several techniques:

  • It performs most of the memory scan without pausing application execution.
  • It divides memory into blocks to allow partial collections.
  • It allows setting a pause-time goal for garbage collection.
  • It estimates how many memory blocks it can collect within that goal, using data from previous collections.
  • It prioritizes the collection of blocks with the most garbage.
  • It collects by evacuation, i.e. it picks a block, moves what is not garbage to another block, and releases the entire first block.
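
The block selection described in the list above can be sketched as follows. This is a simplified model and not HotSpot's actual algorithm: `Region`, `costPerLiveByteMs` and `pauseGoalMs` are hypothetical names, the cost of evacuating a block is assumed to be proportional to its live bytes, and the per-byte cost stands in for the estimate that G1 derives from previous collections.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model of garbage-first block selection under a pause-time goal.
// All names here are hypothetical, not HotSpot internals.
class GarbageFirstSketch {

    static final class Region {
        final int id;
        final long liveBytes;
        final long garbageBytes;
        Region(int id, long liveBytes, long garbageBytes) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.garbageBytes = garbageBytes;
        }
    }

    // Picks blocks with the most garbage first and stops as soon as the
    // predicted evacuation time would exceed the pause goal.
    static List<Region> chooseCollectionSet(List<Region> regions,
                                            double costPerLiveByteMs,
                                            double pauseGoalMs) {
        List<Region> byGarbage = new ArrayList<>(regions);
        byGarbage.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());

        List<Region> chosen = new ArrayList<>();
        double predictedMs = 0;
        for (Region r : byGarbage) {
            // Evacuation copies live data out of the block, so the cost is
            // assumed to grow with the amount of live data.
            double cost = r.liveBytes * costPerLiveByteMs;
            if (predictedMs + cost > pauseGoalMs) {
                break;
            }
            chosen.add(r);
            predictedMs += cost;
        }
        return chosen;
    }

    static List<Integer> demo() {
        List<Region> regions = Arrays.asList(
                new Region(0, 900, 100),
                new Region(1, 100, 900),
                new Region(2, 300, 700),
                new Region(3, 600, 400));
        List<Integer> ids = new ArrayList<>();
        // Assumed cost of 0.01 ms per live byte and a 5 ms pause goal.
        for (Region r : chooseCollectionSet(regions, 0.01, 5.0)) {
            ids.add(r.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println("blocks chosen for evacuation: " + demo());
    }
}
```

With these numbers the sketch prints `blocks chosen for evacuation: [1, 2]`. Blocks 1 and 2 hold the most garbage and little live data, so they are cheap to evacuate and free the most memory; block 3 would push the predicted pause past the goal and is left for a later collection. On a real HotSpot JVM the pause goal is set with `-XX:MaxGCPauseMillis`, and it is treated as a target rather than a hard guarantee.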

  • It would be interesting if you cited the sources of your answer, so that readers can find a more detailed version of it.

  • Unfortunately I have no other source than the question links and my knowledge about the functioning of memory collectors in general.

  • Just one detail: CMS does not pause once to scan and collect. It pauses once (mark) to identify the "living" objects and then pauses again (re-mark) to pick up what was missed. The collection itself runs concurrently with the execution of the application, without further pauses (hence the "Concurrent" in the name; if it also paused during collection, it would be simply "Mark and Sweep"). More details here.

  • @mgibsonbr I changed the answer to include this information.

2

The best explanation of G1 in relation to CMS: Getting Started with the G1 Garbage Collector

The text is a bit long (I don’t intend to translate all of it), but in short it covers:

  • How the heap is organized in CMS and in G1 (G1 Operational Overview)

  • Heap areas: permanent generation, old generation and young generation (G1 Operational Overview)

  • How collection takes place in CMS (Reviewing Generational GC and CMS)

  • How collection of the young generation takes place in G1 (The G1 Garbage Collector Step by Step)

  • How collection of the old generation takes place in G1 (G1 Old Generation Collection Step by Step)
