Indexing infrastructure for semantics full-text search

Lashkari, Fatemeh

dc.contributor.advisor	Ghorbani, Ali
dc.contributor.advisor	Bagheri, Ebrahim
dc.contributor.author	Lashkari, Fatemeh
dc.date.accessioned	2023-03-01T16:35:22Z
dc.date.available	2023-03-01T16:35:22Z
dc.date.issued	2019
dc.date.updated	2023-03-01T15:02:48Z
dc.description.abstract	The increasing effectiveness and widespread use of automated entity linking platforms have enabled search techniques to adopt semantic-enabled methods such as sense disambiguation, intent determination, and instance identification within the search process. Researchers have already explored integrating semantic information into practical search engines, a paradigm known as semantic full-text search. However, the practical and efficient incorporation of semantic information within search indices is still an open challenge. In this thesis, we propose two indexing approaches for building efficient and effective semantic full-text indices. In the first approach, we remain faithful to the traditional form of building search indices, where the index key is guaranteed to be present in each of the indexed documents. As such, we assume that the documents related to each keyword, semantic entity, or semantic type do in fact explicitly contain this information. For this reason, the first proposed indexing mechanism is referred to as the Explicit Semantic Full-text Index. We propose various representation data structures and effective strategies for integrating them when building the explicit semantic full-text index. Furthermore, we introduce algorithms for performing query processing tasks such as Boolean and ranked union and intersection on the proposed indices. In the second approach, we relax the traditional condition of search indices and allow documents associated with an index key to be semantically similar to the index key as opposed to explicitly including the key. We refer to this indexing strategy as the Implicit Semantic Full-text Index. We propose a mechanism to embed keyword, semantic entity, and semantic type information within a homogeneous representation space so that all three can be indexed in the same indexing data structure.
Based on our experiments, we find that when neural embeddings are used to build inverted indices, thereby relaxing the requirement to explicitly observe the posting list key in the indexed document, (a) retrieval efficiency increases compared to a standard inverted index, with reduced index size and query processing time, and (b) retrieval effectiveness remains competitive with the baseline in terms of retrieving a reasonable number of relevant documents from the indexed corpus.
dc.description.copyright	© Fatemeh Lashkari, 2019
dc.format	text/xml
dc.format.extent	xix, 228 pages
dc.format.medium	electronic
dc.identifier.uri	https://unbscholar.lib.unb.ca/handle/1882/14160
dc.language.iso	en_CA
dc.publisher	University of New Brunswick
dc.rights	http://purl.org/coar/access_right/c_abf2
dc.subject.discipline	Computer Science
dc.title	Indexing infrastructure for semantics full-text search
dc.type	doctoral thesis
thesis.degree.discipline	Computer Science
thesis.degree.fullname	Doctor of Philosophy
thesis.degree.grantor	University of New Brunswick
thesis.degree.level	doctoral
thesis.degree.name	Ph.D.
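
Illustrative note on the abstract: the explicit index it describes keeps keywords, semantic entities, and semantic types as index keys whose posting lists contain only documents that explicitly carry that key. The following is a minimal sketch of that idea, not the thesis's implementation. The toy documents, the (kind, value) key layout, and the helper names build_index, intersect, and union are assumptions made for illustration only.

```python
from collections import defaultdict


def build_index(docs):
    """Map each (kind, value) key to a sorted list of ids of documents
    that explicitly contain it (hypothetical layout, for illustration)."""
    postings = defaultdict(set)
    for doc_id, annotations in docs.items():
        for kind, values in annotations.items():
            for value in values:
                postings[(kind, value)].add(doc_id)
    return {key: sorted(ids) for key, ids in postings.items()}


def intersect(a, b):
    """Boolean AND: linear merge of two sorted posting lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def union(a, b):
    """Boolean OR: all document ids that appear in either posting list."""
    return sorted(set(a) | set(b))


# Toy corpus (assumed): each document lists its keywords, linked entities,
# and entity types, as an entity linker might produce them.
docs = {
    1: {"keyword": ["jaguar", "speed"], "entity": ["Jaguar_Cars"], "type": ["Company"]},
    2: {"keyword": ["jaguar", "habitat"], "entity": ["Jaguar"], "type": ["Animal"]},
    3: {"keyword": ["speed", "habitat"], "entity": ["Cheetah"], "type": ["Animal"]},
}
index = build_index(docs)

# Documents that mention the keyword "jaguar" AND contain an entity of type Animal.
animal_jaguar = intersect(index[("keyword", "jaguar")], index[("type", "Animal")])

# Documents that contain either the entity Jaguar OR the entity Cheetah.
either = union(index[("entity", "Jaguar")], index[("entity", "Cheetah")])
```

Keeping all three kinds of key in one dictionary lets a single query mix keyword, entity, and type conditions. Ranked variants of these operations, and the implicit (embedding-based) index, are outside the scope of this sketch.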

Files

Original bundle

Name: item.pdf
Size: 2.01 MB
Format: Adobe Portable Document Format

Collections

Open Theses & Dissertations
