Introduction
In this article we will go over one of the most important and popular topics among Java interviewers: memory management. Many Java developers tend to ignore this topic or don't think it is necessary to understand it. Even though the JVM takes care of memory, developers should be aware of how memory management is done.
You are in the right place if you are looking to learn about memory management. Let's get started.
What is Java Heap Space
Heap space in Java stores the actual objects in memory. The heap is periodically garbage collected by the JVM. Every JVM process has exactly one heap. In a generational collector, the heap is divided into the following parts.
- Eden Space
- First Survivor Space
- Second Survivor Space
- Old Generation
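The heap parts listed above can be inspected at runtime through the standard java.lang.management API. The following is a minimal sketch; the class name HeapPools is made up for illustration, and the pool names printed depend on the JVM and on the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// HeapPools is a hypothetical name used only for this sketch.
class HeapPools {

    // Returns the names of the memory pools that belong to the Java heap.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The names vary by JVM and collector, for example
        // "G1 Eden Space", "G1 Survivor Space" and "G1 Old Gen" with G1 on HotSpot.
        for (String name : heapPoolNames()) {
            System.out.println(name);
        }
    }
}
```

On HotSpot the two survivor spaces usually show up as a single survivor pool, so the output may list three heap pools rather than four.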
What is Eden Space in Java Heap
Eden space is part of heap memory. Whenever a new object is created, its memory is allocated in Eden space. When the garbage collector runs on Eden space, objects that are no longer reachable are reclaimed.
What is Survivor Space?
Survivor space is part of heap memory. There are two survivor spaces: the first survivor space and the second survivor space. Live objects that were not garbage collected from Eden space are moved to the first survivor space.
When the garbage collector next runs, unreachable objects in the first survivor space are reclaimed, and the objects that are still alive are moved to the second survivor space.
What is Old Generation
Old generation is part of heap memory. It holds long-lived objects that have survived several collections of the young generation. In JVMs that have a permanent generation, its garbage collection is tied to that of the old generation: whenever either one gets full, both the permanent generation and the old generation are collected.
What is PermGen space
- The permanent generation is a space that holds the metadata describing user classes.
- The JVM keeps track of loaded class metadata in the PermGen space.
- The JVM also stores static content there, such as static methods, primitive variables and references to static objects.
- PermGen space exists only in the HotSpot JVM; the IBM J9 JVM implementation has no PermGen space.
- The String pool was part of this space until Java 6; in Java 7 it was moved to the heap.
- PermGen is responsible for one of the best-known runtime errors, java.lang.OutOfMemoryError: PermGen space. This error is thrown when the size of the class metadata grows beyond -XX:MaxPermSize (the option that sets the maximum size of PermGen).
- The size of PermGen can be controlled by two options: -XX:PermSize, which is the initial/minimum size of the PermGen space, and -XX:MaxPermSize, which is the maximum size.
- Whenever the old generation or the permanent generation gets full, both are garbage collected.
- Since Java 8, PermGen has been replaced by Metaspace.
What is java.lang.OutOfMemoryError: PermGen space
The permanent generation is used to store class metadata. When the PermGen space is full and not enough space can be freed for new class metadata even after the garbage collector has run, java.lang.OutOfMemoryError: PermGen space is thrown.
The usual way to resolve this error is to increase the maximum PermGen size with the JVM option -XX:MaxPermSize, for example “-XX:MaxPermSize=256m”. Increasing the maximum heap size with “-Xmx512M” helps with java.lang.OutOfMemoryError: Java heap space, but it does not enlarge PermGen.
What is Metaspace in Java 8
In Java 8 there is no PermGen space; it has been replaced by Metaspace. PermGen had some issues: it was difficult to size because the right size depended on many factors, such as the total number of classes, the size of methods and the size of the constant pool. This made the PermGen space really difficult to tune.
Since Java 8, the HotSpot JVM uses native memory to represent class metadata, similar to the IBM JVM and Oracle JRockit. The metadata now lives in a native memory area known as Metaspace.
What is the difference between PermGen and Metaspace
The main difference between PermGen and Metaspace is where they live. Prior to Java 8, PermGen was a fixed-size region that the JVM reserved and managed alongside the Java heap. Since Java 8, class metadata lives in Metaspace, which is outside the heap. Metaspace is part of native memory, so it is limited by the host operating system.
The problem with PermGen was that it had a fixed maximum size. One advantage of Metaspace is that by default it grows automatically, limited only by the native memory available from the operating system. An upper bound can still be set with -XX:MaxMetaspaceSize.
What is native memory
Native memory is memory that is allocated within the process address space but outside the Java heap. Heap memory, in contrast, is the part of the JVM process that the JVM manages to hold Java objects. Whatever memory is not heap memory is native memory. The sum of the native and heap memory used by the JVM is the total memory used by our application.
Native Memory Tracking is a feature of the HotSpot JVM that gives a detailed breakdown of the memory areas used by the JVM.
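A rough view of heap versus non-heap usage is also available from inside the application through the java.lang.management API. The sketch below uses a made-up class name, MemoryOverview. Note that the non-heap figure reported here covers JVM-managed areas such as Metaspace and the code cache, not every native allocation made by the process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// MemoryOverview is a hypothetical name used only for this sketch.
class MemoryOverview {

    // Bytes currently used inside the Java heap.
    static long heapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Bytes currently used by the JVM-managed non-heap areas.
    static long nonHeapUsedBytes() {
        return ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        System.out.println("Heap used (bytes):     " + heapUsedBytes());
        System.out.println("Non-heap used (bytes): " + nonHeapUsedBytes());

        // List the individual non-heap pools; on HotSpot 8+ one of them is Metaspace.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (pool.getType() == MemoryType.NON_HEAP && usage != null) {
                // getMax() returns -1 when no maximum is defined for the pool.
                System.out.println(pool.getName() + ": used=" + usage.getUsed()
                        + ", max=" + usage.getMax());
            }
        }
    }
}
```

For the full breakdown, Native Memory Tracking is enabled on HotSpot with the -XX:NativeMemoryTracking=summary option and queried with jcmd <pid> VM.native_memory summary.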
What are different reference types in Java
There are 4 reference types in Java
- Weak Reference
- Soft Reference
- Strong Reference
- Phantom Reference
What is Strong reference in Java
A strong reference is an ordinary reference. An object on the heap is not garbage collected as long as a strong reference points to it, or as long as it is strongly reachable through a chain of strong references.
What is Weak reference in Java
A weak reference does not prevent its referent from being collected: once an object is only weakly reachable, the garbage collector clears the reference and reclaims the object. An object that is reachable only through weak references is most likely not to survive the next garbage collection.
WeakHashMap is an example of a data structure that uses weak references.
import java.lang.ref.WeakReference;
class WeakReferenceExample {
    public static void main(String[] args) {
        // new String(...) creates a heap object that is not held by the string pool
        String str = new String("springmicroservices.com"); // strong reference
        WeakReference<String> myString = new WeakReference<>(str);
        str = null; // drop the only strong reference
        // Request a GC; there is no guarantee that it will run
        Runtime.getRuntime().gc();
        // Read the referent once so a GC between two get() calls cannot change the result
        String value = myString.get();
        if (value != null) {
            System.out.println(value);
        } else {
            System.out.println("String object has been cleared by the Garbage Collector.");
        }
    }
}
// Typical output (not guaranteed, because gc() is only a request):
// String object has been cleared by the Garbage Collector.
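WeakHashMap, mentioned above, applies the same idea to map keys. The following is a small sketch with made-up names (WeakHashMapExample, newCache): an entry stays in the map while its key is strongly referenced elsewhere and becomes eligible for removal once the key is no longer referenced.

```java
import java.util.Map;
import java.util.WeakHashMap;

// WeakHashMapExample and newCache are hypothetical names used only for this sketch.
class WeakHashMapExample {

    // Builds a map whose single entry is held only through a weak reference to the key.
    static Map<Object, String> newCache(Object key) {
        Map<Object, String> cache = new WeakHashMap<>();
        cache.put(key, "metadata for key");
        return cache;
    }

    public static void main(String[] args) {
        Object key = new Object();
        Map<Object, String> cache = newCache(key);
        // The key is still strongly referenced, so the entry is present.
        System.out.println("Entries while key is referenced: " + cache.size());

        key = null;  // drop the only strong reference to the key
        System.gc(); // only a request; the JVM may ignore it

        // Often 0 at this point, but that is not guaranteed.
        System.out.println("Entries after GC request: " + cache.size());
    }
}
```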
What is Soft reference in Java
A soft reference is a reference whose referent is garbage collected only when the application is running low on memory. In other words, softly reachable objects are not garbage collected as long as there is no critical need for free space. All soft references to softly reachable objects are guaranteed to be cleared before the JVM throws an OutOfMemoryError.
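Soft references are a natural fit for memory-sensitive caches. The following is a minimal sketch of such a cache; SoftCache is a made-up class for illustration, not a JDK type, and it is not thread-safe.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// SoftCache is a hypothetical class used only for this sketch.
class SoftCache<K, V> {

    private final Map<K, SoftReference<V>> entries = new HashMap<>();

    // Stores the value behind a soft reference so the GC may reclaim it under memory pressure.
    void put(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    // Returns the cached value, or null if it was never cached or has been reclaimed.
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key); // the referent was reclaimed; drop the stale entry
        }
        return value;
    }
}
```

A caller has to treat a null result as a cache miss and recompute or reload the value.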
What is the difference between Weak reference and soft reference
The difference between WeakReference and SoftReference is how eagerly the referent is collected. If only weak references point to an object, the garbage collector can collect it at the next collection, i.e. a weak reference is eagerly collected. Objects held through a SoftReference, on the other hand, are collected only when the JVM absolutely needs memory.
What are Phantom References
Phantom references are most often used for scheduling post-mortem cleanup actions in a more flexible way than is possible with the Java finalization mechanism. Before Java 9, unlike soft and weak references, phantom references were not automatically cleared by the garbage collector as they were enqueued; an object reachable via phantom references remained so until all such references were cleared or themselves became unreachable.
Before Java 9
Weak and soft references are cleared and enqueued before the object is finalized, whereas phantom references are enqueued only after the object has been finalized. Before Java 9, a phantom reference was not cleared when it was enqueued, so the memory of the referent could not be reclaimed until the reference was cleared or became unreachable. If for any reason you did not poll the queue and clear the references, those objects stayed in memory and you could incur an OutOfMemoryError.
Java 9 and After
Phantom references are automatically cleared (set to null) when they are enqueued in Java 9 and after.
Another difference between phantom references and the other reference types is that the get() method of a phantom reference always returns null, even before a GC has occurred. The other reference types return their referents from get() until they are cleared.
A phantom reference can be used to get notified when an object is no longer reachable so that resource cleanup can be done. Remember that the object.finalize() method is not guaranteed to be called at the end of the life of an object, so if you need to close files or free resources, you can rely on phantom references instead. A typical pattern is to derive your own reference type from PhantomReference and add the information needed for the final freeing.
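The pattern of deriving a reference type from PhantomReference can be sketched as follows. The names ResourceCleanup, resourceName and "temp-file-1" are made up for illustration, and whether the reference is enqueued within the one-second wait depends on whether the JVM honours the GC request.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// ResourceCleanup is a hypothetical subclass that carries the data needed for cleanup.
class ResourceCleanup extends PhantomReference<Object> {

    private final String resourceName; // assumed identifier of the resource to release

    ResourceCleanup(Object referent, ReferenceQueue<Object> queue, String resourceName) {
        super(referent, queue);
        this.resourceName = resourceName;
    }

    String resourceName() {
        return resourceName;
    }
}

class PhantomReferenceExample {
    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object owner = new Object();
        ResourceCleanup ref = new ResourceCleanup(owner, queue, "temp-file-1");

        // get() on a phantom reference always returns null.
        System.out.println("ref.get() = " + ref.get());

        owner = null; // drop the only strong reference to the referent
        System.gc();  // only a request; the JVM may ignore it

        // Wait up to one second for the reference to be enqueued.
        Reference<?> enqueued = queue.remove(1000);
        if (enqueued == ref) {
            System.out.println("Releasing " + ref.resourceName());
        } else {
            System.out.println("Not enqueued yet; the GC has not processed the object.");
        }
    }
}
```

Since Java 9 the JDK also offers java.lang.ref.Cleaner, which packages this pattern behind a simpler API.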
What is a garbage collector
As Java developers, we don’t have to be overly concerned about memory management. The JVM decides which objects are no longer used and when to reclaim their memory. The garbage collector is a program running in the background that looks for objects which are no longer referenced so that their memory can be reclaimed. Java has several garbage collection algorithms, and they run in their own threads. One interesting fact is that your application is put on hold, or paused, during at least part of the garbage collection process. This behavior is known as “stop the world”. Some GC algorithms pause the application for long periods of time while others pause it for shorter durations.
Different types of garbage collectors in Java
There are three types of garbage collectors in Java, and the developer can choose which one should be used. By default, the JVM chooses a garbage collector based on the underlying hardware. The three types are described below.
Serial GC : The serial garbage collector is best suited to single-processor machines. It uses a single thread for garbage collection. All application threads are frozen while the serial GC runs, so it does not work well in multi-threaded server environments.
Parallel GC : The parallel GC uses multiple threads for garbage collection. It is mostly targeted at applications with medium-sized to large data sets that run on multiprocessor or multithreaded hardware.
Mostly Concurrent GC : The name is self-explanatory: it attempts to work concurrently with the application. It is called “mostly” concurrent because there is still a period of time during which the application threads are paused. There are two kinds of mostly concurrent garbage collectors:
- Garbage First (G1): The G1 collector is a server-style garbage collector, targeted at multi-processor machines with large memories. It meets garbage collection (GC) pause time goals with high probability while achieving high throughput. Unlike other collectors, G1 partitions the heap into a set of equal-sized heap regions, each a contiguous range of virtual memory. When performing garbage collections, G1 runs a concurrent global marking phase (phase 1, known as marking) to determine the liveness of objects throughout the heap. After the mark phase is completed, G1 knows which regions are mostly empty. It collects these regions first (phase 2), copying the few live objects out of them, which usually yields a significant amount of free space. This is why this method of garbage collection is called Garbage-First.
- Concurrent Mark Sweep (CMS): This collector was deprecated in JDK 9 and removed in JDK 14. It uses multiple garbage collector threads for garbage collection. It is designed for applications that prefer shorter garbage collection pauses and that can afford to share processor resources with the garbage collector while the application is running.
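To see which collector a running JVM actually picked, the java.lang.management API can be queried. The sketch below uses a made-up class name, ActiveCollectors. On HotSpot a collector can be selected explicitly with options such as -XX:+UseSerialGC, -XX:+UseParallelGC or -XX:+UseG1GC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// ActiveCollectors is a hypothetical name used only for this sketch.
class ActiveCollectors {

    // Returns the beans describing the collectors active in this JVM.
    static List<GarbageCollectorMXBean> collectors() {
        return ManagementFactory.getGarbageCollectorMXBeans();
    }

    public static void main(String[] args) {
        // Typically one bean covers the young generation and another the old generation;
        // the names depend on the collector in use.
        for (GarbageCollectorMXBean gc : collectors()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time (ms)=" + gc.getCollectionTime());
        }
    }
}
```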
Explain the garbage collection process
Heap space consists of four different spaces: Eden space, first survivor space, second survivor space and old generation. Below is the step-by-step garbage collection process.
Step 1 – Whenever a new object is created, its memory is allocated in Eden space. Eden space fills up fast because it has limited memory. At this stage the first and second survivor spaces are empty.
Step 2 – When the garbage collector runs for the first time, the unused objects in Eden space are garbage collected. The objects that survive are moved to the first survivor space.
Step 3 – When the garbage collector runs for the second time, the objects surviving in Eden space and the first survivor space are placed in the second survivor space.
One of the reasons to have two survivor spaces apart from Eden space is to avoid memory fragmentation. When objects are garbage collected from Eden space and the first survivor space, those spaces are left with holes where dead objects have been reclaimed. Instead of compacting each space in place, the JVM copies the live objects from one space to the other, which leaves them packed together and the source space completely empty.
Step 4 – Objects that are still alive after several such garbage collection cycles are moved to the old generation.
Step 5 – Once the old generation becomes full, the garbage collection process runs on the old generation. This time the garbage collector takes longer to run than it does on the young generation.
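The steps above can be sketched as a toy model. This is not how the JVM is implemented: ToyGenerationalHeap, Obj and TENURING_THRESHOLD are made-up names, objects are plain list entries, and liveness is supplied by the caller. The sketch only shows how live objects move from Eden through the two survivor spaces and finally into the old generation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// A toy model of a generational heap; all names here are hypothetical.
class ToyGenerationalHeap {

    // A toy object: a name plus the number of minor collections it has survived.
    static final class Obj {
        final String name;
        int age;
        Obj(String name) { this.name = name; }
    }

    // Assumed value for the sketch: objects older than this are promoted.
    static final int TENURING_THRESHOLD = 2;

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>(); // the occupied survivor space
    List<Obj> toSurvivor = new ArrayList<>();   // the empty survivor space
    final List<Obj> oldGen = new ArrayList<>();

    // Step 1: new objects are allocated in Eden.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Steps 2 to 4: one minor collection over Eden and the occupied survivor space.
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(fromSurvivor);
        for (Obj o : candidates) {
            if (!isLive.test(o)) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age > TENURING_THRESHOLD) {
                oldGen.add(o);     // survived long enough: promote
            } else {
                toSurvivor.add(o); // copy to the empty survivor space
            }
        }
        eden.clear();
        fromSurvivor.clear();
        // Swap roles so the next collection copies in the other direction.
        List<Obj> tmp = fromSurvivor;
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }
}
```

In this model an object that stays alive is promoted on its third minor collection. On HotSpot the real limit is governed by the -XX:MaxTenuringThreshold option.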
What is minorGC and majorGC
When the garbage collector runs on Eden space or on the first or second survivor space (the young generation), it is called a minor GC. When the garbage collector runs on the old generation, it is called a major GC. When it runs on both the young and the old generation, it is called a full GC.
These terms, however, have no formal definition in the JVM specification or in garbage collection research papers. Major GCs are often triggered by minor GCs, so separating the two is impossible in many cases.
What is mark and sweep algorithm
Mark phase : The mark phase is the first step, in which the garbage collector identifies which objects are in use and which are not. During the mark phase, the application threads need to be stopped temporarily for marking to happen. This stopping of the application threads is also called the “stop the world” phase. The duration of the pause depends on the number of live objects in the heap.
Sweep phase: This is the second step, in which the unreachable objects are swept away and their heap memory is freed.
Compact phase: This is an additional step in the mark-sweep-compact algorithm, which moves all marked (live) objects to the beginning of the memory region. The disadvantage of this approach is an increased GC pause duration, as all live objects have to be copied to a new place and all references to them updated.
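The mark and sweep phases can be illustrated on a toy object graph. This is a sketch under simplifying assumptions, not the JVM's implementation: ToyMarkSweep and Node are made-up names, the "heap" is a plain list, and the compact phase is left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// A toy mark-and-sweep over an object graph; all names here are hypothetical.
class ToyMarkSweep {

    // A toy heap object that can reference other objects.
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Mark phase: collect every node reachable from the roots.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (marked.add(node)) { // true only the first time a node is seen
                pending.addAll(node.references);
            }
        }
        return marked;
    }

    // Sweep phase: remove every unmarked node and return how many were reclaimed.
    static int sweep(List<Node> heap, Set<Node> marked) {
        int before = heap.size();
        heap.removeIf(node -> !marked.contains(node));
        return before - heap.size();
    }
}
```

Because marking starts from the roots, two objects that reference only each other are never marked and are reclaimed in the sweep, which is why reference cycles do not cause leaks under this scheme.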
What is System.gc() or Runtime.getRuntime().gc()
System.gc() is used to request a garbage collection explicitly; it is equivalent to Runtime.getRuntime().gc(). But there is no guarantee that the garbage collector will actually run. It is up to the JVM whether the garbage collector is run or not.
What is a memory leak
A memory leak happens when objects in the heap are no longer used by the application but the garbage collector cannot reclaim them because they are still referenced. When such objects accumulate in large numbers, they keep holding memory unnecessarily, memory requests made by the application can no longer be fulfilled, and an OutOfMemoryError is thrown.
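A common way to produce such a leak is a long-lived collection that only ever grows. The sketch below uses made-up names (LeakyRegistry, handleRequest, release) to show the leak and one possible fix.

```java
import java.util.ArrayList;
import java.util.List;

// LeakyRegistry is a hypothetical class used only for this sketch.
class LeakyRegistry {

    // A static collection stays reachable for as long as the class is loaded.
    private static final List<byte[]> HANDLED = new ArrayList<>();

    static void handleRequest() {
        byte[] payload = new byte[1024];
        // ... the payload is used here and is not needed afterwards ...
        HANDLED.add(payload); // the leak: the payload stays reachable forever
    }

    static int retainedCount() {
        return HANDLED.size();
    }

    // One possible fix: drop the references once they are no longer needed.
    static void release() {
        HANDLED.clear();
    }
}
```

The garbage collector cannot help here, because every payload is still strongly reachable through the static list.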
What is classloader leak
A classloader leak happens when an application is unloaded but the class definitions for the classes it loaded continue to live on in Metaspace (or PermGen), increasing the memory footprint of the JVM over time. Eventually an OutOfMemoryError is thrown.
All classes hold a reference to their classloader, and all objects hold references to their classes. As a result, if an application gets unloaded but one of its objects is still being held (e.g., by a cache or a thread-local variable), the underlying classloader cannot be removed by the garbage collector. A classloader is removed by the garbage collector only if nothing else refers to it.
One example of a classloader leak is a thread that continues to run after the application is undeployed. The thread usually holds a reference to the classloader of the web application that started it, called the context classloader. This in turn means that all classes of the undeployed application continue to be held in memory.
This sort of memory leak can happen in application servers and OSGi containers, e.g. when an application is redeployed without restarting the application server. It is an example of a memory leak that automatic garbage collection cannot address.
How do I optimize Java memory and resolve performance issues in Java
- When a variable is no longer needed but stays in scope for a long time, set its reference to null. This makes the object eligible for garbage collection, provided nothing else refers to it.
- Avoid writing finalize() methods: they are not guaranteed to run, and they slow down the program.
- Avoid strong references where weak or soft references are appropriate. For example, a cache built on strong references keeps data in memory even when it is no longer needed.
- The JVM can be instructed to dump the heap on an OutOfMemoryError by adding the -XX:+HeapDumpOnOutOfMemoryError argument. The heap dump can then be analyzed for memory leaks and consumption using tools such as Eclipse MAT, jvisualvm or the YourKit profiler.
What can be done to resolve a memory leak
Even though Java has automatic garbage collection, that does not mean all memory issues are resolved or that memory management is handled by Java completely. Java programs can have logical errors that hold on to object references or resources when they are no longer needed, causing the program’s memory footprint to grow over the course of the application run.
Remember the points below to help prevent memory leaks in Java.
- Do not create unnecessary objects.
- Avoid String concatenation in loops.
- Use StringBuilder instead.
- Do not store a massive amount of data in the session.
- Time out the session when it is no longer used.
- Do not call the System.gc() method explicitly.
- Avoid the use of static objects, because by default they live for the entire life of the application. If a static reference is needed, it is better to set it to null explicitly once the object is no longer required.
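The advice on String concatenation and StringBuilder from the list above can be illustrated as follows. JoinExample and its methods are made-up names for this sketch. Strictly speaking, concatenation in a loop produces short-lived garbage rather than a leak, but it puts unnecessary pressure on the garbage collector.

```java
// JoinExample is a hypothetical class used only for this sketch.
class JoinExample {

    // Each += creates a new String object, leaving the previous one as garbage.
    static String joinWithConcat(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part;
        }
        return result;
    }

    // A single StringBuilder grows its internal buffer and creates one final String.
    static String joinWithBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }
}
```

Both methods return the same text; the difference is the number of intermediate objects created along the way.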