Distributed Geospatial Data Access on the WWW

This thesis investigates the design and implementation of a Web-based distributed geospatial data warehouse (WDGSDW) system, which allows a user to query geographical information and access geospatial data services across multiple servers over the Internet. A multi-tiered client/server architecture was used to implement WDGSDW. CORBA-based (for Java and C++), Java RMI-based and Java servlet-based implementations of the server-side components of WDGSDW were tested and compared for the contextual data service, which provides the user interface of WDGSDW. The comparison showed that the servlet-based implementation performs much better than the other implementations. The servlet technique was therefore chosen to implement an experimental catalog server and geospatial data servers. An integrated tool to visualize the Canada Land Inventory (CLI) data (in Arc/Info Export .E00 format) and raster image data was also implemented in this research. The search engine, which is the kernel of WDGSDW, supports combined text search and geographical search with an adjustable match factor. The search engine was built using R-Tree and AVL-Tree indexes. The WDGSDW system was tested using test data sets containing 6979 CEONet metadata files, 1690 CLI vector data sets and 45 CCRS raster data sets. For the contextual data server, the CORBA and RMI techniques were 2 to 2.5 times slower than the Java servlet, for which an average throughput of 85 bytes/ms was observed. Keyword searches took up to 4.9 seconds, compared with bounding box search times of less than 2.5 seconds, on a catalogue containing 8188 entries. A combined keyword and bounding box search required on average 1.2 times more time than the individual searches. For a fixed bounding box [200, 350; 20, 84], varying the match factor from 0.90 to 10^-8 changed the number of returned items from 4 to 5673. Search time per item found varied from 0.17 ms to 0.83 ms.
The Fat-client via Thin-server architecture for CLI data service achieved the best performance of 349 bytes/ms, about 23 times as fast as the Thin-client via Fat-server architecture.
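
The combined keyword and bounding box search with an adjustable match factor can be illustrated with a minimal Python sketch. The abstract does not define the match factor, so the sketch assumes it is a threshold on the ratio of overlap area to union area between the query box and an item's box. All names (BBox, Record, match_score, combined_search) and the sample records are hypothetical, and a linear scan stands in for the R-Tree and AVL-Tree indexes used in the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box given as [west, east; south, north] in degrees."""
    west: float
    east: float
    south: float
    north: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as empty.
        return max(0.0, self.east - self.west) * max(0.0, self.north - self.south)

    def intersection(self, other: "BBox") -> "BBox":
        return BBox(max(self.west, other.west), min(self.east, other.east),
                    max(self.south, other.south), min(self.north, other.north))


@dataclass(frozen=True)
class Record:
    title: str
    box: BBox


def match_score(query: BBox, item: BBox) -> float:
    """ASSUMED definition: overlap area divided by union area (0.0 to 1.0)."""
    overlap = query.intersection(item).area()
    union = query.area() + item.area() - overlap
    return overlap / union if union > 0 else 0.0


def combined_search(records, keyword: str, query: BBox, match_factor: float):
    """Return records whose title contains the keyword and whose box
    matches the query box at least as well as match_factor.
    Linear scan for clarity; not the indexed search of the thesis."""
    kw = keyword.lower()
    return [r for r in records
            if kw in r.title.lower()
            and match_score(query, r.box) >= match_factor]


# Hypothetical sample catalogue entries.
records = [
    Record("Canada land cover", BBox(200, 350, 20, 84)),
    Record("Regional land use sheet", BBox(280, 290, 40, 50)),
    Record("Land survey elsewhere", BBox(0, 10, 0, 10)),
]
query = BBox(200, 350, 20, 84)

strict = combined_search(records, "land", query, 0.90)
loose = combined_search(records, "land", query, 1e-8)
```

Under this assumed definition, a high match factor keeps only items whose extent nearly coincides with the query box, while a very small one admits any item with a non-empty overlap. That is consistent in direction with the reported growth in returned items as the match factor decreases.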