Faculty of Computer Science (Fredericton)


Efficient classification of complex ontologies
by Weihong Song, Description logics (DLs) are knowledge representation languages that provide the theoretical underpinning for modern ontology languages such as OWL and serve as the basis for the development of ontology reasoners. The task of ontology classification is to compute the subsumption relationships between all pairs of atomic concepts in an ontology, which is the foundation for other ontology reasoning problems. There are two types of mainstream reasoners to perform ontology classification: (1) tableau-based reasoners usually support very expressive DLs, but they may not be efficient for large and highly cyclic ontologies; (2) consequence-based reasoners are typically significantly faster than tableau-based reasoners, but they support less expressive DLs. It is difficult to extend consequence-based reasoners directly to support more expressive DLs. In the present thesis, we propose a weakening and strengthening based approach for ontology classification, which aims to extend the capability of an efficient reasoner for a less expressive base language Lb (Lb-reasoner) to support a more expressive language. The approach approximates the target ontology by a weakened version and a strengthened version in Lb. Their subsumptions are a subset and a superset of the subsumptions of the target ontology. There are two cases: (1) the subsumptions of the strengthened ontology are the same as those of the target ontology; (2) there may be more subsumptions in the strengthened ontology, which is therefore unsound. In case (1), which we call soundness-preserved strengthening, we classify only the strengthened ontology with the Lb-reasoner to get the final classification results. In case (2), which we call soundness-relaxed strengthening, a hybrid approach is employed: we first classify both the weakened and strengthened ontologies with the Lb-reasoner, and then use a full-fledged (hyper)tableau-based assistant reasoner to check whether the subsumptions implied by the strengthened ontology are also implied by the target ontology. We first study the general principles to apply weakening and strengthening to extend an Lb-reasoner for a DL language that has one more constructor than Lb, i.e., single extension. Then we study the combination of several single extensions for multiple extended constructors for the reasoner, i.e., multiple extension. Based on the general principles, we investigate two single extensions from the ALCH description language to ALCH(D)¯ and ALCHI with soundness-preserved strengthening, as well as a single extension from ALCH to ALCHO with soundness-relaxed strengthening. Then, we show how to combine them into multiple extensions from ALCH to ALCHI(D)¯, ALCHOI, ALCHO(D)¯, and ALCHOI(D)¯. The soundness and completeness of all the single and multiple extensions are proved. We also develop a prototype ALCHOI(D)¯-reasoner, WSClassifier, based on the proposed approach. We experiment with and evaluate WSClassifier on large and highly cyclic real-world ontologies such as the FMA and Galen ontologies, whose required languages are beyond the capability of current efficient consequence-based reasoners. The experiments show that on most of these ontologies, WSClassifier outperforms or significantly outperforms conventional (hyper)tableau-based reasoners.
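As a rough illustration of the soundness-relaxed strengthening workflow described above, the following sketch uses hypothetical `classify` and `entails` helpers to stand in for the Lb-reasoner and the (hyper)tableau assistant reasoner; it is not WSClassifier's actual code.

```python
def hybrid_classify(weakened, strengthened, target, lb_reasoner, tableau_reasoner):
    """Sketch of soundness-relaxed strengthening (hypothetical helper APIs).

    classify(ontology) returns a set of (sub, sup) atomic subsumption pairs;
    entails(ontology, pair) checks one subsumption with a full tableau reasoner.
    """
    sound = lb_reasoner.classify(weakened)           # subset of the target's subsumptions
    candidates = lb_reasoner.classify(strengthened)  # superset of the target's subsumptions
    result = set(sound)
    # Only candidates not already confirmed by the weakened ontology
    # need an expensive (hyper)tableau check against the target ontology.
    for pair in candidates - sound:
        if tableau_reasoner.entails(target, pair):
            result.add(pair)
    return result
```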
Efficient sub-genome neuroevolution via probabilistic selection of genes
by Aaron Lawrence Broad, Chimera, a novel sub-genome neuroevolution method, solves double pole balancing without velocity (DPNV), a modern version of the classic physical control machine-learning benchmark, in significantly fewer network evaluations than prior sub-genome neuroevolution methods. Neural networks are bio-mimicking computers, distributed, fault-tolerant, and applied to cashing cheques, natural language processing, and many other deep learning applications [11, 16]. Deep learning repeatedly modifies a single neural network to produce known desired outputs for training example inputs. Alternatively, neuroevolution (NE) differentially reproduces candidate networks based on their relative ability to produce training example output, or some other fitness measure [14]. Cooperative NE methods, herein referred to as sub-genome methods, select sub-genomes, or partial networks, rather than whole genomes, or complete networks. Selecting sub-genomes is theorised to select for many independent specialisations, whereas genome selection converges on a single generalisation [5–7, 13]. Sub-genome neuroevolution has historically been made more efficient by selecting smaller, more numerous genes (or specialisations). Symbiotic Adaptive Neuroevolution (SANE) randomly composes genomes from a population of chromosomes and assigns each a fitness value. SANE then selects chromosomes by the average fitness of every genome they were a part of [13]. Enforced Sub-populations (ESP) improves upon SANE by maintaining separate sub-populations of chromosomes for each of the x chromosomes required to form a genome [7]. Cooperative Synapse Neuroevolution (CoSyNE) further improves upon ESP by maintaining sub-populations of genes, one for each gene in a complete genome, rather than sub-populations of multi-gene chromosomes. CoSyNE first selects genomes, but then shuffles the gene sub-populations of selected genomes [5, 6]. This thesis proposes a novel sub-genome neuroevolution method, Chimera, and compares this new method with prior methods for solving DPNV. Chimera simply selects genes from gene sub-populations, with a probability proportional to the fitness of each gene's genome. Chimera combines CoSyNE's gene matrix population data structure [5] with ESP's intra-sub-population reproduction [7] and a probabilistic [14] variant of the fitness-proportional selection originally abandoned by SANE [13]. Chimera uses significantly fewer network evaluations to solve DPNV than any of the prior sub-genome methods examined: SANE, ESP, or CoSyNE. While Chimera uses fewer evaluations, it can use more evaluation steps in tasks where fitness is proportional to total evaluation steps., Electronic Only.
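A minimal sketch of the fitness-proportional gene selection described above, assuming a gene-matrix population in which column j forms genome j and row i is the sub-population for gene position i; the names and update details are illustrative rather than Chimera's actual implementation.

```python
import random

def chimera_generation(gene_matrix, evaluate):
    """One illustrative generation of sub-genome selection.

    gene_matrix[i][j] is candidate gene j for position i; column j is genome j.
    evaluate(genome) returns a fitness; each gene inherits its genome's fitness.
    """
    positions = len(gene_matrix)
    pop_size = len(gene_matrix[0])
    genomes = [[gene_matrix[i][j] for i in range(positions)] for j in range(pop_size)]
    fitness = [evaluate(g) for g in genomes]
    # Draw the next sub-population for each gene position with probability
    # proportional to the fitness of the genome each gene belonged to.
    return [random.choices(row, weights=fitness, k=pop_size) for row in gene_matrix]
```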
Efficient text search with spatial constraints
by Dan Han, This thesis presents a search engine called TexSpaSearch that can search text documents having associated positions in space. We defined three query types Q1(t), Q2(t, r) and Q3(p, r) that can search documents with queries containing text t, position p and radius r components. We indexed a sample herbarium database of 40,791 records using a novel R*-tree and suffix tree data structure to achieve efficient text search with spatial data constraints. Significant preprocessing was performed to transform the herbarium database to a form usable by TexSpaSearch. We built unique data structures used to index text with spatial attributes that simultaneously support Q1, Q2 and Q3 queries. This thesis presents a novel approach for simplifying polygon boundaries for packing into R*-tree leaf nodes. Search results are ranked by a novel modified Lucene algorithm that supports free form text indexing in a general way. A novel ranking of Q2 search results combines the text and spatial scores. The architecture of a prototype Java-based web application that invokes TexSpaSearch is described. A theoretical analysis shows that TexSpaSearch requires O(A^2 |b|) average time for Q1 search, where A is the number of single words in the query string t, and |b| is the average length of a subphrase in t. Q2 and Q3 require O(A^2 |b| + Z log_M V_n + y) and O(log_M V_n + y) time, respectively, where Z is the number of point records in the list P of text search results, V_n is the number of data objects indexed in the R*-tree for n records, M is the maximum number of entries of an internal node in the R*-tree, and y is the number of leaf nodes found in range in a Q3 query. Testing was performed with 20 test Q1 queries to compare TexSpaSearch to a Google Search Appliance (GSA) for text search. Results indicate that the GSA is about 45.5 times faster than TexSpaSearch. The same 20 test queries were combined with a Q2 query radius r = 5, 50 and 500 m. Results indicate Q2 queries are 22.8 times slower than Q1 queries. For testing Q3 queries, 15 points were chosen in 15 different N.B. counties. The average Tc, Ts and Te values in the Q3 tests are 191.5 ms, 3603.2 ms and 4183.9 ms, respectively, and the average value of Npt + Npl is 1313.4.
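The three query types can be pictured as the following interface sketch; the class and method names are hypothetical, and the underlying suffix-tree and R*-tree machinery is omitted.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    record_id: str
    text_score: float = 0.0
    spatial_score: float = 0.0

class TexSpaSearchIndex:
    """Hypothetical facade over the suffix-tree text index and the R*-tree spatial index."""

    def q1(self, t: str) -> list[Hit]:
        """Q1(t): pure text search over the indexed records, ranked by text score."""
        raise NotImplementedError

    def q2(self, t: str, r: float) -> list[Hit]:
        """Q2(t, r): text search whose hits are then constrained by a spatial
        radius r; the final rank combines text and spatial scores."""
        raise NotImplementedError

    def q3(self, p: tuple[float, float], r: float) -> list[Hit]:
        """Q3(p, r): pure spatial range search for records within radius r of position p."""
        raise NotImplementedError
```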
Enhancing the MMD algorithm in multi-core environments
by Michael Schlösser, The work done in this thesis enhances the MMD algorithm in multi-core environments. The MMD algorithm, a transformation-based algorithm for reversible logic synthesis, is based on the works introduced by Maslov, Miller and Dueck and their original, sequential implementation. It synthesises a formal function specification, provided as a truth table, into a reversible network and is able to perform several optimization steps after the synthesis. This work concentrates on one of these optimization steps, template matching. This approach reduces the size of the reversible circuit by replacing a sequence of gates that matches a template with an equivalent realisation that implements the same function using fewer gates. Smaller circuits have several benefits since they need less area and are less costly. The template matching approach introduced in the original works is computationally expensive since it tries to match a library of templates against the given circuit. For each template at each position in the circuit, a number of different combinations have to be calculated during runtime, resulting in high execution times, especially for large circuits. In order to make the template matching approach more efficient and usable, it has been reimplemented to take advantage of modern multi-core architectures such as the Cell Broadband Engine or a Graphics Processing Unit. For this work, two algorithmically different approaches that try to exploit each multi-core architecture's strengths have been analyzed and improved. For the analysis, these approaches have been cross-implemented on the two target hardware architectures and compared to the original parallel versions. Important metrics for this analysis are the execution time of the algorithm and the result of the minimization with the template matching approach. It could be shown that the algorithmically different approaches produce the same minimization results, independent of the hardware architecture used. However, both cross-implementations also show a significantly higher execution time, which makes them practically irrelevant. The results of the first analysis and comparison led to the decision to enhance only the original parallel approaches. Using the same metrics for successful enhancements as mentioned above, it could be shown that improving the algorithmic concepts and exploiting the capabilities of the hardware lead to better execution times and minimization results compared to the original implementations., Electronic Only. (UNB thesis number) Thesis 8689. (OCoLC) 960950379, M.C.S., University of New Brunswick, Faculty of Computer Science, 2011.
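The sequential template-matching pass that the parallel Cell Broadband Engine and GPU versions accelerate can be pictured roughly as below; `matches_at` and `apply_template` are hypothetical helpers, and the real algorithm also enumerates gate combinations and template variants at each position.

```python
def template_matching(circuit, templates, matches_at, apply_template):
    """Illustrative sequential pass: repeatedly replace gate sequences that
    match a template with an equivalent, smaller realisation."""
    changed = True
    while changed:
        changed = False
        for template in templates:
            position = 0
            while position < len(circuit):
                if matches_at(circuit, position, template):
                    circuit = apply_template(circuit, position, template)
                    changed = True
                else:
                    position += 1
    return circuit
```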
Enhancing the usage of the Shared Class Cache
by Devarghya Bhattacharya, With the increasing popularity of the Java language and sandboxed environments, research needs to be conducted into improving the performance of these environments by decreasing the execution time as well as the memory footprint of an application. This thesis examines various critical data structures used by IBM's Java Virtual Machine (JVM) during the start-up phase for potential improvements. These data structures start small and expand as required in order to save space; however, growing them slows down the start-up of the JVM. This thesis describes how the data structures were optimized using the Shared Class Cache (SCC) in order to improve the execution time as well as the memory footprint of the application running on IBM's JVM. The impact of this approach on performance and memory has been evaluated using different benchmarks. On average, a performance increase of 6% and a memory reduction of about 1% has been achieved with this approach. The alterations made are completely automated, and the user requires no prior knowledge about the Java application or the VM to improve the performance of the deployed application. The only task the user has is to activate the SCC., M.C.S. University of New Brunswick, Faculty of Computer Science, 2017.
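The abstract does not spell out the mechanism, but the general idea of avoiding repeated growth during start-up by pre-sizing a structure from a cached hint can be illustrated as follows; the names are purely hypothetical and this is not IBM's JVM code.

```python
class GrowableTable:
    """Illustrative growable structure: starts small and doubles when full,
    which is cheap in memory but costly when growth happens during start-up."""
    def __init__(self, initial_capacity=8):
        self.capacity = initial_capacity
        self.size = 0

    def add(self, item):
        if self.size == self.capacity:
            self.capacity *= 2  # growth work that slows start-up
        self.size += 1

def presized_table(shared_cache, key, default=8):
    """Hypothetical use of a shared cache: reuse the final size observed on a
    previous run so the table never has to grow during this start-up."""
    return GrowableTable(shared_cache.get(key, default))
```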
EnviroPlanner: design and implementation of a distributed environmental querying system in rule responder
by Sujan Chandra Saha, Environmental information querying can be cumbersome and time-consuming, as users sometimes need to go through multiple Web pages to find answers to specific questions. As a prototype modeling a multi-agent Virtual Environmental Organization (VEO), EnviroPlanner was developed to allow users to retrieve and deduce environmental information via problem-oriented question answering. In this thesis, we focus on the design and implementation of the EnviroPlanner VEO model, which is supported by rule and ontology knowledge. This formalized knowledge allows EnviroPlanner's semi-automated agents to assist human experts in environmental question answering. We realized EnviroPlanner for distributed querying in the Rule Responder framework, which consists of three kinds of agents: External Agent (EA), Organizational Agent (OA), and Personal Agents (PAs). The EA is the single point of entry that allows users to pose queries to the system, employing a Web interface coupled to an HTTP port to which requests are sent. EnviroPlanner consists of an extension and an instantiation of the Rule Responder framework, similar in communication architecture to the SymposiumPlanner-2011/2012 instantiations but in a very different knowledge domain. The SymposiumPlanner systems since 2011 have used two Sub-Organizational Agents (Sub-OAs) for question answering. We extended the framework to provide architectural flexibility to developers so they can add as many Sub-OAs as needed, e.g. for answering user queries about different locations or regions. The architecture designed for environmental query answering has been implemented, and the EnviroPlanner prototype has been evaluated with respect to efficiency and overall performance. In addition, the EnviroPlanner prototype has been deployed on the official Rule Responder website.
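The architectural flexibility described above, registering an arbitrary number of Sub-OAs under the OA, can be sketched as follows; the classes are hypothetical and the actual system is built on the Rule Responder framework rather than this code.

```python
class SubOrganizationalAgent:
    """Hypothetical Sub-OA answering queries for one region or topic."""
    def __init__(self, name, knowledge_base):
        self.name = name
        self.kb = knowledge_base  # e.g. {query: [answers]}

    def answer(self, query):
        return self.kb.get(query, [])

class OrganizationalAgent:
    """Hypothetical OA that delegates an incoming query to all registered Sub-OAs."""
    def __init__(self):
        self.sub_oas = []

    def register(self, sub_oa):
        # Developers may add as many Sub-OAs as needed (regions, topics, ...).
        self.sub_oas.append(sub_oa)

    def dispatch(self, query):
        # Merge the answers returned by every Sub-OA, keyed by agent name.
        return {s.name: s.answer(query) for s in self.sub_oas}
```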
Estimating the safety function response time for wireless control systems
by Victoria Pimentel Guerra, Safety function response time (SFRT) is a metric for safety-critical automation systems defined in the IEC 61784-3-3 standard for single-input and single-output systems communicating over wired technologies. This thesis proposes a model to estimate the SFRT for multiple-input and multiple-output feedback control systems communicating over the IEEE 802.15.4e wireless medium access control standard designed for process automation. The wireless SFRT model provides equations for the worst-case delay time and watchdog timer of participating network entities, including wireless communication channels. Thirty-nine on-board, wired, and wireless control experiments using real devices were carried out to evaluate control performance and the applicability of the wireless SFRT model. The estimated SFRT for the wired implementation is 38.2 ms. For the wireless experiments, the best SFRT obtained was 655.4 ms with no acceptable packet loss. The wireless implementation failed to provide successful control in 15 of the 21 experiments.
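The thesis' equations are not reproduced in the abstract; purely as an illustration of the kind of bound involved, an end-to-end worst-case estimate over a chain of entities might add the per-entity worst-case delay times and the largest watchdog margin. The helper below is illustrative only and is not the wireless SFRT model itself.

```python
def illustrative_sfrt(worst_case_delays, watchdog_timers):
    """Illustrative end-to-end bound (ms): sum of per-entity worst-case delay
    times plus the largest watchdog margin. Not the thesis' wireless SFRT model."""
    margin = max(wd - d for wd, d in zip(watchdog_timers, worst_case_delays))
    return sum(worst_case_delays) + margin

# Example with made-up numbers for a sensor, a wireless channel, and an actuator:
# illustrative_sfrt([10.0, 500.0, 20.0], [15.0, 600.0, 30.0]) -> 630.0
```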
Exploring an IPv6 protocol for mobile sensor network communication
by Weiqi Zhang, This research explores IPv6 in mobile wireless sensor networks (WSNs). An indoor mobile WSN testbed of length 24 m was built and used for mobile WSN testing. The test network enabled the use of one or two moving nodes and six stationary nodes. TelosB sensor nodes were used for testing. The thesis presents a detailed explanation of sending and receiving User Datagram Protocol (UDP) packets using the IPv6 over Low-Power Wireless Personal Area Networks (6LoWPAN) software stack in the Berkeley Low-power Internet Protocol (BLIP) implementation of TinyOS 2.1.1 and 2.1.2. A Java-based Web application called WSNWeb was built that displays real-time route topology changes and sensor data. The data is recorded in a log file and used to compute packet loss and determine the number of route topology changes. We created 35 test cases, 15 with two moving nodes and 20 with one moving node. On a test track, model train velocities between 0.076 m/s and 0.376 m/s were used, with three different routing table update periods (RTUPs) of 60 s, 6 s, and 0.6 s. The results show that, with one moving node, the 0.6 s RTUP has significantly higher packet loss (up to 1.4%, compared to 0.16%) over a five-hour test than the 60 s and 6 s RTUPs. The tests with two moving nodes show that the 0.6 s RTUP still has a higher packet loss than the 6 s and 60 s RTUPs.
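The packet-loss figures quoted above follow from the log-file counts by simple arithmetic; a small illustrative helper (not the WSNWeb implementation) is shown below.

```python
def packet_loss_percent(sent, received):
    """Percentage of UDP packets that never arrived, from log-file counts."""
    return 100.0 * (sent - received) / sent

# e.g. losing 70 of 5000 packets gives 1.4%, and losing 8 of 5000 gives 0.16%,
# matching the magnitudes reported for the 0.6 s versus 6 s / 60 s RTUPs.
packet_loss_percent(5000, 4930)  # -> 1.4
```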
Extracting feature words from customer reviews
by Ting Zhang, Potential customers often browse online reviews before buying products. Manufacturers also collect customer feedback from the reviews. It is very hard for customers and manufacturers to get useful information quickly from a large number of comments. Thus, automatic information extraction from reviews has become a significant problem. This thesis investigates feature word extraction. Feature words are product components or attributes indicating customer interests. Since there is no systematic study on feature word extraction, we first study three classic methods: (1) the frequency-based extraction method; (2) the Web PMI-based extraction method; (3) the rapid automatic keyword extraction (RAKE) method. To provide an objective evaluation, the performance of each method is validated and compared on the following aspects: precision and recall, time complexity, and robustness. Then a new approach, the rapid feature word extraction (RFWE) method, is proposed to improve performance. RFWE combines the techniques used in the popular methods and performs well in precision, recall, and runtime. RFWE is a good option for users who want to extract feature words from customer reviews.
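As one concrete example, the frequency-based method can be sketched roughly as follows; the threshold and helper names are hypothetical, and real implementations typically use a part-of-speech tagger to propose candidate nouns and noun phrases.

```python
from collections import Counter

def frequency_based_features(reviews, candidate_phrases, min_support=0.01):
    """Keep candidate product features that appear in at least min_support
    (fraction) of the reviews. Illustrative sketch, not the thesis code."""
    counts = Counter()
    for review in reviews:
        text = review.lower()
        counts.update({p for p in candidate_phrases if p in text})
    threshold = min_support * len(reviews)
    return sorted(p for p, c in counts.items() if c >= threshold)
```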
FRIEND: a brain-monitoring agent architecture for adaptive system
by Alexis Morris, Brain-monitoring is rapidly becoming an important field of research, with potentially significant impacts on how people interact with technology. As the inner workings of the brain become better understood, sensing technologies are also advancing, becoming smaller, cheaper, and ubiquitous. It is expected that new forms of computing will be developed that take advantage of brain state information to deduce user mental contexts (emotions, intentions, and moods). This capability would enable systems to perform streamlined user interaction, monitoring, and assistance, as they would access, manage, and respond to real-time brain state dynamics for adaptive applications and services. In this new domain of brain-monitoring, particularly for non-rehabilitative purposes, there are few studies that consider how to leverage distributed agent architectures. Additionally, current approaches to brain-monitoring systems have tended toward non-scalable, single-user, single-application situations. However, for a ubiquitous system, it is unrealistic for each possible application to have the specialized overhead required; hence a distributed, yet still personalized, approach is essential. To realize this, a multi-purpose agent system for brain-monitoring and management of brain context is the goal of this work. It involves the selection of a brain-monitoring paradigm, an agent architecture, an inferencing mechanism, and the combination of the three into a unified framework. This general framework is implemented and tested on an application scenario, leveraging brain context as part of a service-oriented architecture. Finally, an assessment is conducted of the technology, studying the implications of the system. By contributing a unique methodology and approach to making such systems tenable, this work helps pave the way toward futuristic, adaptive, human-aware information systems that are both effective and practical.
Feasibility of deception in code attribution
by Alina Matyukhina, Code authorship attribution is the process used to identify the probable author of a given piece of code, based on unique characteristics that reflect the author's programming style. Inspired by social studies in the attribution of literary works, in the past two decades researchers have examined the effectiveness of code attribution in the computer software domain, including computer security. Authorship attribution techniques have found broad application in code plagiarism detection, biometric research, forensics, and malware analysis. Studies show that analysis of software might effectively unveil the digital identity of a programmer, reflected through variables and structures, programming language, employed development tools and their settings and, more importantly, how and for what these tools are used. Authorship attribution has been a fruitful area of research under the assumption that the author of an unknown program is honest in their writing style and does not try to modify it. In this thesis, we investigate the feasibility of deceiving source code attribution techniques. We begin by exploring how data characteristics and feature selection influence both the accuracy and performance of attribution methods. Within this context, it is necessary to understand whether the results obtained by previous studies depend on the data source, quality, and context or on the type of features used. This gives us the opportunity to dive deeper into the process of code authorship attribution and to understand its potential weaknesses. To evaluate current code attribution systems, we present an adversarial model defined by the adversary's goals, knowledge, and capabilities; for each group, we categorize the possible variations. Modeling the role of attackers figures prominently in enhancing cybersecurity defenses. We believe that a solid understanding of the possible attacks can help in the research and deployment of reliable code authorship attribution systems. We present an author imitation attack that deceives current authorship attribution systems by imitating the coding style of a targeted developer. We investigate the attack's feasibility on open-source software repositories. To subvert the author imitation attack and to help protect developers' privacy, we introduce an author obfuscation method and novel coding style transformations. The idea of author obfuscation is to allow authors to preserve the readability of their source code while removing identifying stylistic features that can be leveraged for code attribution. Code obfuscation, common in software development, typically aims to disguise the appearance of the code, making it difficult to understand and reverse engineer. In contrast, the proposed author obfuscation hides the original author's style while leaving the source code visible, readable, and understandable. In summary, this thesis presents original research work that not only advances knowledge in the code authorship attribution field but also contributes to the overall safety of our digital world by providing author obfuscation methods to protect the privacy of developers.
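To make the notion of identifying stylistic features concrete, a toy example of the kind of layout and lexical measurements such systems draw on might look like the following; these features are illustrative only and are not the feature set used in the thesis.

```python
def toy_style_features(source_code: str) -> dict:
    """A few illustrative layout/lexical features of the kind code attribution
    systems use; real systems use far richer lexical, layout and syntactic features."""
    lines = source_code.splitlines() or [""]
    tokens = source_code.split()
    return {
        "avg_line_length": sum(len(l) for l in lines) / len(lines),
        "tab_indent_ratio": sum(l.startswith("\t") for l in lines) / len(lines),
        "blank_line_ratio": sum(not l.strip() for l in lines) / len(lines),
        "avg_token_length": sum(len(t) for t in tokens) / max(len(tokens), 1),
    }
```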
