Nine common JVM garbage collectors

2024.03.11

The JVM is not only a frequent topic in interviews at major tech companies, but also something Java programmers must master to move into senior positions. The garbage collector is a core part of the JVM, and understanding how it works helps us tune, optimize, and troubleshoot it. So today we will talk about 9 common garbage collectors in the JVM.

Background

Because there are many Java virtual machine implementations, this article refers specifically to the HotSpot virtual machine unless stated otherwise. Before introducing the collectors, here is a brief background on the HotSpot virtual machine.

HotSpot VM was originally designed by the small company "Longview Technologies" and was not developed for the Java language at the beginning.

In 1997, Sun acquired the company and thus obtained the HotSpot virtual machine. After some optimization by Sun, the HotSpot virtual machine became the default virtual machine for Sun/OracleJDK and OpenJDK.

In 2010, Oracle acquired Sun Microsystems, and the HotSpot virtual machine naturally became an Oracle product.

Sun/OracleJDK and OpenJDK both come out of Oracle's stewardship. Sun/OracleJDK is the commercial distribution and OpenJDK is the free, open-source one. The two share the same virtual machine core, but their feature sets differ slightly.

To check whether you are running Sun/OracleJDK or OpenJDK, use the java -version command.

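Sun/OracleJDK: the VM line of the output begins with "Java HotSpot(TM)".

OpenJDK: the VM line of the output begins with "OpenJDK".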
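The same distinction can also be read from inside a running program. The following is a minimal sketch that prints three standard system properties; the class name JvmInfo is only an illustrative choice, and the exact strings printed depend on the vendor and build you have installed.

```java
// Prints the properties that help tell an Oracle JDK build from an OpenJDK build.
final class JvmInfo {
    /** Returns "vm name | vendor | version", built from standard system properties. */
    static String describe() {
        return System.getProperty("java.vm.name")
                + " | " + System.getProperty("java.vendor")
                + " | " + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On HotSpot-based builds the first field is typically where "Java HotSpot(TM)" or "OpenJDK" shows up.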

1.Serial 

The Serial collector, as the name implies, is a single-threaded collector. While it performs garbage collection, all other worker threads must be suspended until the collection completes (Stop The World).

Prior to JDK 1.3.1, it was the only choice for the HotSpot virtual machine young generation collector.

In the combined mode of Serial (young generation) and Serial Old (old generation), the general workflow of the collector is as follows:

The Serial collector collects on a single thread and pauses all other worker threads, which sounds like poor performance. Even so, it is still the default young generation collector for the HotSpot virtual machine running in client mode. Compared with the other collectors, the single-threaded Serial collector has the smallest memory footprint and none of the overhead of thread interaction, which makes it simple and efficient.

When starting a Java process, you can select this collector combination with the -XX:+UseSerialGC parameter, which enables Serial for the young generation and Serial Old for the old generation.
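To confirm which collectors a running JVM actually selected, one option is to ask the standard management API. This is a small sketch, assuming only java.lang.management; the class name ActiveCollectors is illustrative. The names returned are chosen by the JVM implementation (on HotSpot, the Serial pair is usually reported as "Copy" and "MarkSweepCompact"), so treat them as informational rather than as a stable contract.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors that are active in the current JVM.
final class ActiveCollectors {
    /** Returns the names reported by each registered garbage collector bean. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Try running with different -XX:+Use...GC flags and compare the output.
        System.out.println(names());
    }
}
```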

2.ParNew

The ParNew collector is the multi-threaded, parallel version of the Serial collector. Apart from using multiple threads for garbage collection, it behaves the same as the Serial collector. It is mainly used when the HotSpot virtual machine runs in server mode.

In the combined mode of ParNew (young generation) and Serial Old (old generation), the general workflow of the collector is as follows:

When starting a Java process, you can select this collector combination with the -XX:+UseParNewGC parameter. Note that this combination was removed in JDK 9.

3.Parallel Scavenge 

The Parallel Scavenge collector is also a young generation collector. Like ParNew, it collects in parallel with multiple threads. The difference is that Parallel Scavenge lets you set a maximum GC pause time through the -XX:MaxGCPauseMillis parameter, which makes throughput a controllable goal. In that respect it is better than the ParNew collector.
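Throughput here means the share of total running time spent in application code rather than in garbage collection. As a rough sketch of how that ratio could be estimated for a live process, the following uses the standard management beans; the class and method names are illustrative, and the figure is only an approximation because for collectors that do part of their work concurrently the reported collection time is not pure pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Estimates throughput = application time / (application time + GC time).
final class GcThroughput {
    /** Fraction of the elapsed time that was not spent in garbage collection. */
    static double throughput(long uptimeMillis, long gcMillis) {
        if (uptimeMillis <= 0) {
            return 1.0; // nothing has run yet, so nothing was lost to GC
        }
        return (double) (uptimeMillis - gcMillis) / uptimeMillis;
    }

    /** Sums the accumulated collection time over all registered collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = bean.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("throughput ~ %.4f%n", throughput(uptime, totalGcMillis()));
    }
}
```

For example, 1 second of collection in 100 seconds of uptime gives a throughput of 0.99.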

In the combined mode of Parallel Scavenge (young generation) and Serial Old (old generation), the general workflow of the collector is as follows:

When starting a Java process, you can select this collector combination with the -XX:+UseParallelGC -XX:-UseParallelOldGC parameters.

However, this combination looks awkward. The young generation is collected in parallel by multiple threads, while the old generation is collected by a single thread, so old generation collection becomes a "drag". This is why Parallel Old, a parallel collector for the old generation, was created.

In the combined mode of Parallel Scavenge (young generation) and Parallel Old (old generation), the general workflow of the collector is as follows:


When starting a Java process, you can select this collector combination with the -XX:+UseParallelGC -XX:+UseParallelOldGC parameters.

4.Serial Old 

The Serial Old collector is the old generation version of Serial. It is also a single-threaded collector and uses the "mark-compact" algorithm. Like the Serial collector, it is used in HotSpot client mode.
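The compaction half of mark-compact can be pictured with a toy heap. The sketch below is a deliberately simplified model, not HotSpot's implementation: the heap is an array of labels, the mark phase is assumed to have already filled in the marked array, and real collectors must additionally update every reference to a moved object.

```java
import java.util.Arrays;

// Toy model of the "compact" step: live objects slide toward the start of the heap.
final class MarkCompactSketch {
    /**
     * Slides marked entries to the front of the heap, preserving their order,
     * and clears the slots behind them. Returns the number of live objects.
     */
    static int compact(String[] heap, boolean[] marked) {
        int free = 0; // index of the next slot to fill
        for (int i = 0; i < heap.length; i++) {
            if (marked[i]) {
                heap[free] = heap[i];
                free++;
            }
        }
        Arrays.fill(heap, free, heap.length, null); // everything past 'free' is now empty
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] marked = {true, false, true, false, true};
        int live = compact(heap, marked);
        System.out.println(live + " " + Arrays.toString(heap)); // 3 [A, C, E, null, null]
    }
}
```

After compaction the free space is one contiguous block, which is what distinguishes this approach from mark-sweep.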

In the combined mode of Serial (young generation) and Serial Old (old generation), the general workflow of the collector is as follows:

When starting a Java process, you can select this collector combination with the -XX:+UseSerialGC parameter.

5.Parallel Old

The Parallel Old collector has been supported since JDK 6. It is the old generation version of the Parallel Scavenge collector. It collects in parallel with multiple threads and uses the "mark-compact" algorithm. With the arrival of the Parallel Old collector, the goal of "throughput first" was truly achieved.

In the combined mode of Parallel Scavenge (young generation) and Parallel Old (old generation), the general workflow of the collector is as follows:

When starting a Java process, you can select this collector combination with the -XX:+UseParallelGC -XX:+UseParallelOldGC parameters.

6.CMS

The CMS collector was officially born after the release of JDK 5. It is no exaggeration to say that CMS was an epoch-making collector. For a long time it was a must-know topic about garbage collectors in interviews at major Internet companies.

CMS is the abbreviation of Concurrent Mark Sweep, and it is used for garbage collection in the old generation. The CMS collection process consists of 5 steps:

  • Initial Mark (Stop The World)
  • Concurrent Marking
  • Remark (Stop The World)
  • Concurrent Sweep
  • Resetting

The general workflow of the CMS collector is as follows:

Although the CMS collector lets the collection threads and the application threads work concurrently, it has a serious problem: it cannot reclaim "floating garbage" (objects that become garbage while a concurrent cycle is already running) until the next cycle, and a Concurrent Mode Failure may occur, which leads to a Full GC. For these reasons Oracle has officially declared CMS "deprecated" and does not recommend using it. This also means that the historical mission of the CMS collector has ended.

When starting a Java process, you can set the -XX:+UseConcMarkSweepGC parameter to explicitly enable the CMS collector.
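Floating garbage is easier to see on a toy object graph. The sketch below is an illustrative model under simplifying assumptions, not CMS itself: marking and sweeping run sequentially, and the application's concurrent activity is simulated by dropping a reference between the two phases. All class and method names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy mark-sweep showing how an object that dies after marking survives one cycle.
final class FloatingGarbageSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    /** Marks everything reachable from the root (iterative depth-first walk). */
    static void mark(Obj root) {
        Deque<Obj> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Obj o = stack.pop();
            if (o.marked) continue;
            o.marked = true;
            for (Obj ref : o.refs) stack.push(ref);
        }
    }

    /** Removes unmarked objects, clears the marks, and returns the reclaimed names. */
    static List<String> sweep(List<Obj> heap) {
        List<String> reclaimed = new ArrayList<>();
        for (Iterator<Obj> it = heap.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                reclaimed.add(o.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    /** Runs two cycles and returns what each one reclaimed. */
    static List<List<String>> demo() {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        root.refs.add(a);
        a.refs.add(b);
        List<Obj> heap = new ArrayList<>(Arrays.asList(root, a, b));

        mark(root);                        // cycle 1: root, a and b are all marked
        a.refs.clear();                    // the application drops b before the sweep
        List<String> first = sweep(heap);  // b is still marked, so it survives

        mark(root);                        // cycle 2: b is no longer reachable
        List<String> second = sweep(heap); // b is finally reclaimed
        return Arrays.asList(first, second);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[], [b]]
    }
}
```

Object b is already garbage when the first sweep runs, yet it is only reclaimed in the second cycle. That delay is the "floating" part.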

7.G1 

G1 is the abbreviation of Garbage First. It is a server-oriented garbage collector for multi-processor machines with large memory, and its goal is low-latency garbage collection.

G1 has been fully supported since Oracle JDK 7 Update 4, and since JDK 9 it has been the default garbage collector.

G1 is a milestone in the history of garbage collectors, ushering in the era of Region-based collection. Unlike earlier collectors, G1 still keeps the concepts of young and old generations, but the memory of each generation is no longer contiguous: each generation consists of n non-contiguous Regions of the same size. The heap memory layout of G1 is as follows:

G1 provides two GC modes: Young GC and Mixed GC.
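The name Garbage First reflects the idea of collecting the Regions that hold the most garbage before the others, within a pause-time budget. The sketch below is a heavily simplified illustration of that idea, not G1's actual selection logic: the Region class, the per-Region cost estimate, and the greedy loop are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Greedy "most garbage first" selection of Regions under a pause budget.
final class GarbageFirstSketch {
    static final class Region {
        final int id;
        final long garbageBytes;  // reclaimable space in this Region
        final double costMillis;  // hypothetical estimate of the time to evacuate it
        Region(int id, long garbageBytes, double costMillis) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.costMillis = costMillis;
        }
    }

    /** Picks Regions in order of decreasing garbage until the next one would exceed the budget. */
    static List<Integer> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> byGarbage = new ArrayList<>(regions);
        byGarbage.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());

        List<Integer> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : byGarbage) {
            if (spent + r.costMillis > pauseBudgetMillis) break;
            spent += r.costMillis;
            chosen.add(r.id);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region(0, 10, 4.0),
                new Region(1, 90, 5.0),
                new Region(2, 50, 3.0),
                new Region(3, 70, 4.0));
        System.out.println(chooseCollectionSet(regions, 10.0)); // [1, 3]
    }
}
```

With a 10 ms budget, Regions 1 and 3 fit (9 ms in total) and Region 2 would push the pause past the limit, so selection stops there.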

The G1 collection process consists of 4 steps:

(1) Initial Marking: Marks objects that are directly reachable from the GC Root.

(2) Concurrent Marking: Search for active objects on the entire heap and mark all reachable objects. This phase may be interrupted by young generation garbage collection.

(3) Remark: Complete the marking of active objects in the heap. An algorithm called Snapshot-at-the-Beginning (SATB) is used, which is much faster than the algorithm used in the CMS collector.

(4) Cleanup: This step does 3 things:

  • Accounts for live objects and completely free Regions. (Stop The World)
  • Cleans up the Remembered Sets. (Stop The World)
  • Resets empty Regions and returns them to the free list. (Concurrent execution)

The general workflow of the G1 collector is as follows:

When starting a Java process, you can set the -XX:+UseG1GC parameter to explicitly enable the G1 collector.

8.Shenandoah

Shenandoah is also a HotSpot virtual machine collector. It first appeared in OpenJDK 12. It was originally developed by Red Hat and contributed to OpenJDK in 2014. Perhaps because it was not developed by Oracle itself, Shenandoah currently exists only in OpenJDK and not in the commercial OracleJDK. Shenandoah mainly relies on two techniques, the connection matrix and the forwarding pointer. The connection matrix replaces the card table used in G1.
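The forwarding pointer idea can be sketched in a few lines. This is a conceptual model only, with made-up class and method names: each object carries an extra reference that initially points to itself, and once the object is copied, the old copy's pointer is redirected to the new one so that stale references still reach the current data.

```java
// Conceptual model of a forwarding pointer: every access is resolved through it.
final class ForwardingPointerSketch {
    static final class Obj {
        int value;
        Obj forwardee = this; // initially an object forwards to itself
        Obj(int value) { this.value = value; }
    }

    /** Follows the forwarding pointer to the current copy of the object. */
    static Obj resolve(Obj ref) {
        return ref.forwardee;
    }

    /** Copies the object (standing in for a move to another Region) and redirects the old copy. */
    static Obj evacuate(Obj old) {
        Obj copy = new Obj(old.value);
        old.forwardee = copy;
        return copy;
    }

    public static void main(String[] args) {
        Obj stale = new Obj(42);      // a reference the application still holds
        Obj moved = evacuate(stale);  // the collector copies the object
        resolve(stale).value = 43;    // a write through the stale reference...
        System.out.println(moved.value); // ...lands in the new copy: 43
    }
}
```

Because reads and writes go through resolve, the application can keep running while objects are being moved, and the stale references can be corrected later.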

The Shenandoah workflow is divided into 9 steps:

  • Initial Marking: Like G1, it marks objects that are directly reachable from the GC Roots. (Stop The World)
  • Concurrent Marking: Like G1, it searches for live objects on the entire heap and marks all reachable objects.
  • Final Marking: Same as G1.
  • Concurrent Cleanup: Cleans up Regions that contain no surviving objects.
  • Concurrent Evacuation: Copies surviving objects to empty Regions.
  • Initial Update Reference: Starts correcting the references to objects that were copied during the concurrent evacuation phase.
  • Concurrent Update Reference: Performs the actual reference updates.
  • Final Update Reference: Corrects the references held in GC Roots.
  • Concurrent Cleanup: Reclaims the Regions that are now empty.

The general workflow of the Shenandoah collector is as follows (picture from OpenJDK official):

When starting a Java process, you can set the -XX:+UseShenandoahGC parameter to explicitly enable the Shenandoah collector.

Note that if you are using Sun/OracleJDK, you will not be able to use this collector.

9. ZGC 

ZGC was developed by Oracle and introduced in JDK 11. It is a collector built on colored pointers and read barriers. Like G1, ZGC divides the heap into multiple Regions. The difference is that a ZGC Region, officially called a Page, can be created and destroyed dynamically, and its capacity can also be adjusted dynamically.

There are three types of Regions in ZGC:

  • Small Region: The capacity is fixed at 2MB, used to store objects < 256KB;
  • Medium Region: The capacity is fixed at 32MB, used to store objects >= 256KB and < 4MB;
  • Large Region: The capacity is not fixed but must be a multiple of 2MB. It stores objects >= 4MB, and each large Region holds only one large object. Since moving a large object is too expensive, the object will not be relocated.
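The size thresholds above can be written down as a small classification function. This is only a sketch of the rule as stated in this article; the class and enum names are illustrative, and actual thresholds may differ between JDK versions and heap configurations.

```java
// Maps an object size to the kind of ZGC Region described above.
final class ZgcPageKind {
    enum Kind { SMALL, MEDIUM, LARGE }

    static final long KB = 1024L;
    static final long MB = 1024L * KB;

    /** Objects below 256KB go to small Regions, below 4MB to medium ones, the rest to large ones. */
    static Kind forObjectSize(long bytes) {
        if (bytes < 256 * KB) return Kind.SMALL;
        if (bytes < 4 * MB) return Kind.MEDIUM;
        return Kind.LARGE;
    }

    public static void main(String[] args) {
        System.out.println(forObjectSize(100 * KB)); // SMALL
        System.out.println(forObjectSize(256 * KB)); // MEDIUM
        System.out.println(forObjectSize(4 * MB));   // LARGE
    }
}
```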

The ZGC workflow is divided into 4 steps:

  • Concurrent Mark: Like G1, it traverses the object graph from the GC Roots and marks all reachable objects.
  • Concurrent Prepare for Relocate
  • Concurrent Relocate
  • Concurrent Remap

The general workflow of the ZGC collector is as follows:

The ZGC garbage collection process is almost entirely concurrent, and the actual Stop The World (STW) pause time is extremely short, less than 10ms. This is thanks to its use of colored pointers and read barriers.

When starting a Java process, you can set the -XX:+UseZGC parameter to explicitly enable the ZGC collector.
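A colored pointer stores a few bits of GC state inside the pointer itself, and the barrier checks those bits whenever a reference is loaded. The sketch below models this with plain long values. The bit positions are loosely based on the layout of early ZGC releases and should be treated as an assumption, and the "healing" step is reduced to recoloring, whereas a real barrier would mark or relocate the object at that point.

```java
// Toy model of colored pointers: metadata bits live above the address bits.
final class ColoredPointerSketch {
    static final int ADDRESS_BITS = 42;                         // assumed address width
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    static final long MARKED0 = 1L << 42;
    static final long MARKED1 = 1L << 43;
    static final long REMAPPED = 1L << 44;
    static final long FINALIZABLE = 1L << 45;
    static final long COLOR_MASK = MARKED0 | MARKED1 | REMAPPED | FINALIZABLE;

    /** Returns the pointer with its color bits replaced by the given color. */
    static long withColor(long pointer, long color) {
        return (pointer & ADDRESS_MASK) | color;
    }

    /** Strips the color bits and returns the plain address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** True if the pointer currently carries exactly the given color. */
    static boolean hasColor(long pointer, long color) {
        return (pointer & COLOR_MASK) == color;
    }

    /** Barrier sketch: pointers with the expected color pass through, others are recolored. */
    static long loadBarrier(long pointer, long goodColor) {
        if (hasColor(pointer, goodColor)) {
            return pointer;                     // fast path: nothing to do
        }
        return withColor(pointer, goodColor);   // slow path, greatly simplified
    }

    public static void main(String[] args) {
        long p = withColor(0x1234L, MARKED0);
        long healed = loadBarrier(p, REMAPPED);
        System.out.println(Long.toHexString(address(healed)) + " " + hasColor(healed, REMAPPED)); // 1234 true
    }
}
```

The address survives recoloring unchanged, which is what lets the collector track per-reference state without touching the object itself.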

At this point, all 9 garbage collectors have been introduced. If you are very interested in garbage collectors, I recommend the third edition of "In-depth Understanding of Java Virtual Machines" by Dr. Zhou Zhiming. Besides garbage collectors, the book also covers other JVM topics in detail, and it is a must-have for many Java programmers in China who are learning the JVM.

Due to limited space, this article only briefly introduces the 9 garbage collectors commonly used in the HotSpot virtual machine and does not go into theoretical analysis. In following articles, I will analyze four of them, CMS, G1, ZGC, and Shenandoah, one by one in detail. Link: JVM column. Finally, here is a chart comparing the 9 collectors: