Optimized arraycopy implementation for improved runtime performance of OpenJ9 in AArch64
University of New Brunswick
The Eclipse OMR and OpenJ9 pair is used to build robust language runtimes that support various hardware and operating system platforms. One of the architectures supported by OpenJ9, AArch64, is widely used in electronic devices because of its reasonable price and resource efficiency. We propose adding an optimization to the Just-In-Time (JIT) compiler of OpenJ9 that copies arrays efficiently, including calls to the System.arraycopy() method, through the arraycopyEvaluator. The arraycopy nodes generated from bytecode in the Intermediate Language (IL) at run time copy a given number of values from a specified source to the referenced destination. The optimizing JIT compiler function, arraycopyEvaluator, separates inlinable code for better performance. Using Vector Floating Point registers allows up to 128 bits of any data type to be copied in a single load/store instruction. The evaluator handles both the case where primitive values are copied and the case where Garbage Collection checks are required to access reference fields. We compare the optimized loops with traditional copying (using the Java Class Library function System.arraycopy and a loop) without the arraycopy optimization in the JIT. We evaluate the results on the DaCapo Benchmark Suite and the BumbleBench microbenchmarking test framework, and we examine the trace files and use the Perf tool to identify the reasons for unexpected benchmark results. We achieve up to a tenfold increase in performance. We also evaluate the performance improvement on AArch64, with arraycopyEvaluator added, against x86-64. Compared to x86-64, AArch64 shows an improvement of up to 45.7%.
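As a small illustration of the two Java-level copying forms compared in the abstract, the following sketch copies the same range of an int array once with a plain loop and once with System.arraycopy. The class name ArrayCopySketch and its helper methods are hypothetical and are not taken from OpenJ9 or from the benchmarks used here. The sketch only shows that both forms produce the same result; it does not measure performance or show how the JIT compiles either form.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of OpenJ9 or any benchmark suite.
public class ArrayCopySketch {

    // Traditional form: copy `length` elements one at a time in a loop.
    static int[] copyWithLoop(int[] src, int srcPos, int length) {
        int[] dst = new int[length];
        for (int i = 0; i < length; i++) {
            dst[i] = src[srcPos + i];
        }
        return dst;
    }

    // Library form: a single call to the Java Class Library method.
    static int[] copyWithArraycopy(int[] src, int srcPos, int length) {
        int[] dst = new int[length];
        System.arraycopy(src, srcPos, dst, 0, length);
        return dst;
    }

    public static void main(String[] args) {
        int[] src = new int[4096];
        for (int i = 0; i < src.length; i++) {
            src[i] = i;
        }
        int[] viaLoop = copyWithLoop(src, 8, 1024);
        int[] viaArraycopy = copyWithArraycopy(src, 8, 1024);
        // Both forms must yield identical destination contents.
        System.out.println(Arrays.equals(viaLoop, viaArraycopy));
    }
}
```

Any timing comparison between the two forms would need a proper harness with warm-up iterations, since the JIT compiles hot methods only after they have run for a while.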