Improving virtual machines using string deduplication and internal object pools
University of New Brunswick
The efficiency of memory management is one of the key metrics in virtual machine research. Where objects are deallocated automatically, garbage collection has become an important field of research. It aims at speeding up and optimizing the execution of applications written in languages such as Java, C#, Python, and others. Even though garbage collection techniques have become more sophisticated, automatic memory management is still far from optimal. Garbage collection techniques such as mark-sweep, mark-compact, copying collection, and generational garbage collection form the basis of most virtual machines. These algorithms rely on a stop-the-world phase that is used to identify live objects and reclaim the memory of unreachable ones. The research presented in this dissertation aims at improving automatic memory management by optimizing the memory layout as well as the allocation and deallocation of frequently created and freed objects. The first optimization uses the stop-the-world phase of the garbage collector to detect duplicate strings and deduplicate them before they are copied to a new region. The goal of this algorithm is to avoid storing the same data multiple times in memory and to reduce the amount of memory copied, in order to decrease the heap size and therefore the number of garbage collections required to execute the client application. The second optimization aims at speeding up the allocation of frequently created and discarded objects by keeping a pool of empty objects. Instead of requesting new memory, the virtual machine requests an empty object of the class from the pool and initializes the required values. Object pools are a widely used software engineering pattern that lets developers reuse object instances without repeated allocation and instantiation.
While the benefits of object pools remain in a garbage-collected environment, using them adds a memory management component to the development process. The dissertation investigates the feasibility of introducing automatically created and maintained object pools for predefined classes. Automatic object pools are implemented and discussed using the GenCon and Balanced GC policies of the IBM Java VM.
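For readers unfamiliar with the pattern, the following is a minimal sketch of a conventional, application-level object pool in Java. It is not the VM-internal mechanism developed in the dissertation; the class name `ObjectPool`, its methods, and the `capacity` parameter are hypothetical and chosen only for illustration. The sketch is single-threaded and omits any synchronization.

```java
import java.util.ArrayDeque;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical application-level pool; illustrative only, not thread-safe. */
final class ObjectPool<T> {
    private final ArrayDeque<T> free = new ArrayDeque<>(); // idle instances ready for reuse
    private final Supplier<T> factory;                     // allocates a new instance when the pool is empty
    private final Consumer<T> reset;                       // clears an instance's state before it is pooled
    private final int capacity;                            // upper bound on idle instances kept alive

    ObjectPool(Supplier<T> factory, Consumer<T> reset, int capacity) {
        this.factory = factory;
        this.reset = reset;
        this.capacity = capacity;
    }

    /** Returns a pooled instance if one is idle, otherwise allocates a new one. */
    T acquire() {
        T obj = free.pollFirst();
        return obj != null ? obj : factory.get();
    }

    /** Hands an instance back; it is dropped (left to the GC) if the pool is full. */
    void release(T obj) {
        if (free.size() < capacity) {
            reset.accept(obj);
            free.addFirst(obj);
        }
    }

    /** Number of idle instances currently held by the pool. */
    int idleCount() {
        return free.size();
    }
}
```

A caller would create a pool with, for example, `new ObjectPool<>(StringBuilder::new, sb -> sb.setLength(0), 16)`, and pair every `acquire()` with a later `release()`. That explicit `release()` call is the memory management component the abstract refers to, and it is the step that automatically maintained pools would remove from application code.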