Compilation-based spatial query processing
Date
2024-10
Publisher
University of New Brunswick
Abstract
The proliferation of spatial data applications and rising spatial data volumes demand efficient processing capabilities. Although most relational databases support spatial extensions of SQL, they offer limited scalability. Traditional relational databases follow a pull-based model of query processing, which is inefficient for large volumes of data. Specialized systems, such as those extending Hadoop and Spark, improve scalability but often lack comprehensive SQL support or suffer from the overheads of the pull-based model.
This thesis introduces a distributed spatial query processing system based on push-based query compilation. The system generates C++/UPC++ query plans for both single-node and distributed execution on a high-performance framework that uses the Partitioned Global Address Space (PGAS) paradigm. The thesis also proposes two new morsel-driven parallelism algorithms for scalable spatial query execution. Experiments on real-world datasets show significant performance gains over leading systems, including Apache Sedona, Citus (a distributed database based on PostgreSQL), and PostgreSQL in single-node configurations.
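To illustrate the pull-based versus push-based distinction drawn in the abstract, the following is a minimal single-node sketch and not code from the thesis. It evaluates one hypothetical query (count the points inside a bounding box) twice: once with iterator-style operators that pull one tuple at a time through virtual `next()` calls, and once as the kind of fused loop a query compiler might emit. All names (`Point`, `Box`, `Scan`, `Filter`, `count_pull`, `count_push`) are assumptions made for illustration, and the sketch does not use UPC++ or any distributed execution.

```cpp
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical spatial types, for illustration only.
struct Point { double x, y; };
struct Box { double xmin, ymin, xmax, ymax; };

inline bool contains(const Box& b, const Point& p) {
    return p.x >= b.xmin && p.x <= b.xmax && p.y >= b.ymin && p.y <= b.ymax;
}

// Pull-based (iterator) model: every operator exposes next(), and the
// parent pulls one tuple at a time through a virtual call.
struct Operator {
    virtual ~Operator() = default;
    virtual std::optional<Point> next() = 0;
};

struct Scan : Operator {
    const std::vector<Point>& data;
    std::size_t pos = 0;
    explicit Scan(const std::vector<Point>& d) : data(d) {}
    std::optional<Point> next() override {
        if (pos < data.size()) return data[pos++];
        return std::nullopt;  // input exhausted
    }
};

struct Filter : Operator {
    std::unique_ptr<Operator> child;
    Box box;
    Filter(std::unique_ptr<Operator> c, Box b) : child(std::move(c)), box(b) {}
    std::optional<Point> next() override {
        // Keep pulling from the child until a tuple passes the predicate.
        while (auto p = child->next()) {
            if (contains(box, *p)) return p;
        }
        return std::nullopt;
    }
};

std::size_t count_pull(const std::vector<Point>& data, const Box& box) {
    Filter root(std::make_unique<Scan>(data), box);
    std::size_t n = 0;
    while (root.next()) ++n;  // the aggregate pulls from the filter
    return n;
}

// Push-based, compiled style: scan, filter, and aggregate are fused into
// one loop, so each tuple is pushed through the pipeline with no
// per-tuple virtual calls between operators.
std::size_t count_push(const std::vector<Point>& data, const Box& box) {
    std::size_t n = 0;
    for (const Point& p : data) {
        if (contains(box, p)) ++n;
    }
    return n;
}
```

Both functions return the same count for the same input. The difference lies in control flow: the pull-based version pays for a chain of virtual `next()` calls per tuple, while the fused loop keeps each tuple in registers across operators, which is the usual motivation given for push-based query compilation.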