Gutarra Velez, Eduardo2023-03-012023-03-012012Thesis 8996https://unbscholar.lib.unb.ca/handle/1882/13251Bitmap indices are used to accelerate queries against large databases. We often expect bitmap indices over high-cardinality columns to be too large to be practical. Thus, we might expect multi-column bitmap indices to be disastrous because they are similar to indices over high-cardinality columns. Our findings, however, contradict this intuition. We find that multi-column bitmap indices can be smaller than conventional bitmap indices over individual columns. Yet multi-column bitmap indices can also be alarmingly large in some instances. Hence, our discovery motivates an engineering question: how do we efficiently determine whether a multi-column bitmap index is relatively small without first constructing it? We tried modeling the problem in terms of statistical characteristics such as correlation. However, we found simple heuristics based on indexing samples most effective. Our heuristics could be integrated in database systems to help database administrators pick a better indexing strategy.text/xmlx, 128 pageselectronichttp://purl.org/coar/access_right/c_abf2Multi-column bitmap indexingmaster thesis2020-06-10Kaser, OwenLemire, DanielComputer Science