Just-in-time compilation of SQL queries with OMR JitBuilder
Date
2021
Publisher
University of New Brunswick
Abstract
Most database systems translate a given query into an algebraic expression and then evaluate that expression to produce the query result. Evaluating SQL expressions and materializing tuples (rows) can consume a significant portion of a query's overall execution time, dominating all other tasks of the query processor and increasing the CPU cost of query processing. Moreover, modern database systems are designed to execute arbitrary queries, which requires more branching in the evaluation code and therefore more machine instructions. To evaluate an expression, both the data and the machine instructions must be loaded into cache memory. For queries with complex operations, the generated code can be relatively large. Because the data and the generated code both reside in the cache, they often overwhelm it, leading to frequent cache misses and branch mispredictions. The aim of this work is to generate efficient machine code for the scan, filter, join, aggregation and group-by operations of a given SQL expression through just-in-time (JIT) compilation with the Eclipse OMR JitBuilder compiler framework. For this purpose, an incremental adoption approach is proposed, in which parts of a single SQL expression are JIT-compiled into specialized code that embeds compile-time constants, while the remaining parts are still interpreted. The implementation is based on a lightweight integration of JitBuilder into PostgreSQL 12.5, where JIT-compiled code and interpreted evaluation coexist for different opcodes in the same bytecode interpreter. Experimental evaluation with the enhanced PostgreSQL 12.5 demonstrates that the proposed approach improves query performance over purely interpreted execution for a number of queries from the TPC-H benchmark.
The incremental adoption model is also extended to iterate over the table tuples (rows) and evaluate the expression in compiled code, eliminating interpreted execution entirely; this extension showed a noticeable speedup for a simple SELECT query with a filter operation.
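The coexistence of specialized and interpreted steps in a single evaluation loop can be illustrated with a small, self-contained sketch. It is only an analogy under stated assumptions: the opcode names, the `Step` class and the `specialize` function are hypothetical, Python closures stand in for JIT-compiled machine code, and nothing here reflects the actual PostgreSQL or JitBuilder interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical opcodes for illustration; not PostgreSQL's real opcode set.
OP_LOAD_COL, OP_LOAD_CONST, OP_LT, OP_AND = range(4)


@dataclass
class Step:
    opcode: int
    arg: object = None
    # Filled in when the step has been specialized ("compiled").
    compiled: Optional[Callable[[tuple, list], None]] = None


def interpret_step(step, row, stack):
    """Generic path: branch on the opcode every time a row is evaluated."""
    if step.opcode == OP_LOAD_COL:
        stack.append(row[step.arg])
    elif step.opcode == OP_LOAD_CONST:
        stack.append(step.arg)
    elif step.opcode == OP_LT:
        rhs = stack.pop()
        lhs = stack.pop()
        stack.append(lhs < rhs)
    elif step.opcode == OP_AND:
        rhs = stack.pop()
        lhs = stack.pop()
        stack.append(lhs and rhs)
    else:
        raise ValueError(f"unknown opcode {step.opcode}")


def specialize(step):
    """Stand-in for JIT compilation: bake the step's constant into a closure.

    Only the load opcodes are handled; every other opcode stays interpreted,
    mimicking an incremental, opcode-by-opcode adoption.
    """
    if step.opcode == OP_LOAD_COL:
        idx = step.arg
        step.compiled = lambda row, stack: stack.append(row[idx])
    elif step.opcode == OP_LOAD_CONST:
        const = step.arg
        step.compiled = lambda row, stack: stack.append(const)


def evaluate(steps, row):
    """One loop in which specialized and interpreted steps coexist."""
    stack = []
    for step in steps:
        if step.compiled is not None:
            step.compiled(row, stack)
        else:
            interpret_step(step, row, stack)
    return stack.pop()


def build_filter():
    # Filter: quantity < 24 AND price < 100.0, for rows of (quantity, price).
    return [
        Step(OP_LOAD_COL, 0), Step(OP_LOAD_CONST, 24), Step(OP_LT),
        Step(OP_LOAD_COL, 1), Step(OP_LOAD_CONST, 100.0), Step(OP_LT),
        Step(OP_AND),
    ]
```

Calling `specialize` on every step of `build_filter()` replaces only the load steps, so `evaluate` then mixes both paths while returning the same result as the fully interpreted program. In a real system the payoff would come from emitting machine code rather than closures; the sketch shows only the control structure.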