Faculty of Computer Science (Fredericton)


Big data analytics toolkit for business data based on social network analysis
by Fan Liang, Social network analysis (SNA) measures relationships and structures with a set of metrics by building graphs that capture influential actors and patterns. In this thesis, we investigate SNA approaches for solving real-world business applications, and propose a general-purpose software system that combines big data analytics and social network analysis techniques. The system's workflow consists of data collection, graph generation, graph reuse, network property calculation, SNA result interpretation, and application integration. The system operations are executable in a Hadoop-based distributed cluster with high throughput on large-scale data. We evaluate our prototype system with a case study on a stock network. The results show that the system is capable of analyzing business data at scale and of using SNA approaches to solve business problems.
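As a minimal sketch of the graph-generation and network-property steps described above (with hypothetical tickers, returns, and a correlation threshold of our own choosing, not the thesis's actual pipeline):

```python
# Sketch: build a stock network from pairwise return correlations and
# compute degree centrality to surface "influential" stocks.
# Tickers, returns, and the threshold below are illustrative assumptions.

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def build_stock_network(returns, threshold=0.6):
    """Add an edge between two stocks when |corr| of their returns >= threshold."""
    graph = {t: set() for t in returns}
    tickers = list(returns)
    for i, a in enumerate(tickers):
        for b in tickers[i + 1:]:
            if abs(correlation(returns[a], returns[b])) >= threshold:
                graph[a].add(b)
                graph[b].add(a)
    return graph

def degree_centrality(graph):
    n = len(graph) - 1
    return {v: len(nbrs) / n for v, nbrs in graph.items()}

returns = {                      # hypothetical daily returns
    "A": [0.01, 0.02, -0.01, 0.03],
    "B": [0.011, 0.019, -0.012, 0.028],   # tracks A closely
    "C": [-0.02, 0.01, 0.00, -0.01],
}
g = build_stock_network(returns)
print(degree_centrality(g))
```

In a real deployment the same per-node computations would be expressed as distributed jobs over the Hadoop cluster rather than in-memory loops.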
Bursty event discovery from online news outlets
by Seyed Pooria Madani Kochak, In this thesis, we have developed a set of methods, along with a framework, for discovering bursty events and their relationships from streams of online news articles. Bursty events can be discovered using the detected bursty terms, which are significantly smaller in number than the original feature set. Moreover, the discovered bursty events are compared in order to discover any potential relational link between any two of them. This work assumes that bursty events and their relationships in time can provide useful information to firms and individuals whose decision-making processes are significantly affected by news events. The system performed at a 64% level of accuracy on a real-world dataset. The results show great promise that our proposed framework and methods can be utilized in real-world applications., (UNB thesis number) Thesis 9566. (OCoLC)963858008. Electronic Only., M.C.S. University of New Brunswick, Faculty of Computer Science, 2015.
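A minimal sketch of the bursty-term idea, assuming a simple ratio test against each term's historical mean frequency (the threshold, ratio, and window contents are illustrative, not the thesis's actual method):

```python
# Sketch: flag "bursty" terms in the newest window of news articles by
# comparing each term's count against its historical per-window mean.
# The burst ratio, minimum count, and tokens are illustrative assumptions.

from collections import Counter

def bursty_terms(history, current, ratio=3.0, min_count=2):
    """history: list of past windows (each a list of tokens);
    current: tokens of the newest window."""
    past = Counter()
    for window in history:
        past.update(window)
    n = max(len(history), 1)
    now = Counter(current)
    bursty = set()
    for term, count in now.items():
        baseline = past[term] / n          # mean frequency per past window
        if count >= min_count and count >= ratio * max(baseline, 1.0):
            bursty.add(term)
    return bursty

history = [["market", "rates", "oil"], ["market", "oil", "bank"]]
current = ["quake", "quake", "quake", "market", "oil"]
print(bursty_terms(history, current))   # {'quake'}
```

The resulting bursty-term set is far smaller than the full vocabulary, which is what makes downstream event clustering and comparison tractable.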
Bursty topic detection using acceleration and user influence
by Rizwan Ali, In this thesis, we present a system which detects bursty topics from real-time social media data. Bursty topics are topics which get a sudden surge in their mentions online in a very short period. They are detected using the acceleration of keywords from the real-time data. The acceleration of a bursty topic is measured using the increase in appearance of keywords of a topic over a small period. Along with acceleration, user influence is used to score keywords in our system to improve detection by increasing the keyword precision of the detected bursty topics. Bursty topics are formed by taking the top-scoring keywords, based on acceleration and influence, and grouping them based on the similarities in their term document vectors. We use a soft frequent pattern mining approach for generating topics. The bursty topics are also linked to bursty topics detected in previous time windows by comparing similarities in their keywords. Bursty topics detected using acceleration are evaluated with and without the user influence score, against a baseline topic model. We use the Latent Dirichlet Allocation topic model as the baseline. It is found that user influence helped the topic detector improve its precision by 11% on average. The results show that user influence can add great value to bursty topic detection methods.
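The acceleration-plus-influence scoring can be sketched roughly as follows; the second-difference definition of acceleration, the weighting scheme, and all counts are illustrative assumptions, not the system's exact formulas:

```python
# Sketch: score keywords by acceleration (growth of mention counts across
# short time windows) weighted by the influence of the mentioning users.
# The influence weights, alpha, and counts below are illustrative.

def acceleration(counts):
    """Second difference of per-window mention counts (needs >= 3 windows)."""
    return (counts[-1] - counts[-2]) - (counts[-2] - counts[-3])

def score_keywords(window_counts, influence, alpha=0.5):
    """window_counts: {keyword: [count_t-2, count_t-1, count_t]};
    influence: {keyword: mean influence of users mentioning it}."""
    scores = {}
    for kw, counts in window_counts.items():
        acc = max(acceleration(counts), 0)          # ignore decelerating terms
        scores[kw] = acc * (1 + alpha * influence.get(kw, 0.0))
    return scores

window_counts = {"earthquake": [2, 5, 30], "weather": [10, 11, 12]}
influence = {"earthquake": 0.8, "weather": 0.1}
scores = score_keywords(window_counts, influence)
print(max(scores, key=scores.get))   # earthquake
```

Top-scoring keywords would then be grouped by term-document-vector similarity to form the bursty topics.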
Business Rules Interoperation
by Fahad A. Albalwi, The present report describes the approach and two technical solutions for interoperation between business rules represented in various formats. Semantic Web techniques have been deployed to make such interoperation work. One of the interoperation methods uses the Java Interoperation Object (JIO), described in the context of Positional-Slotted Knowledge (POSL), which is a human-friendly variant of the Rule Markup Language (RuleML), and Notation 3 (N3) representations. Details of the connections between these representations are demonstrated with the use of query-based interoperation between POSL and N3. Another solution described in the report is the conversion of business rules stored in Microsoft Excel as decision tables into POSL using OpenL Tablets. Although the current business rules interoperation framework involves three formats (Excel, POSL, and N3), it can be extended to other representations through appropriate conversions of data in rule bases and queries., A Report Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Computer Science in the Graduate Academic Unit of Computer Science. Scanned from archival print submission., M.C.S. University of New Brunswick, Faculty of Computer Science, 2013.
Characteristics of mental representations in novices and experts in Java
by Mohammadhossein Parastar, How programmers mentally represent code is of interest to researchers who study program comprehension and design new programming languages. In 1993, Wiedenbeck et al. conducted a study on characteristics of mental representation using a procedural language, and five characteristics were introduced. We designed an experiment using Java to determine whether programmers using an object-oriented language exhibit the same characteristics as programmers using a procedural language. We considered two alternative definitions of expertise: expertise as years of experience and expertise determined by self-assessment. We used the same multivariate and univariate analyses as the previous study and, in addition, used mixed-effects logistic modeling to analyze the results. We found a significant difference between experts and novices defined using self-assessment in linear modeling. Our results did not fully agree with the previous research. Our study supports the existence of recurring basic patterns and well-connected representations. However, we could not find support for hierarchical structure or grounding in the program text, and we found an unexpected result in mapping code to goals. Our study had several limitations, such as the coronavirus pandemic, which limited the number of participants, the artificiality of the tasks, and a lack of professional programmers in the experiment sample., Electronic Only.
Characterizing and improving the general performance of Apache Zookeeper
by Chandan Bagai, Coordination has become one of the most vital requirements in scenarios where a group of systems communicate with each other over a network to accomplish shared goals. In some scenarios, for instance, high-end distributed applications also require sophisticated coordination primitives such as leader election, rather than just agreeing on certain parameters. Implementing these types of primitives is rather difficult due to their complexity, and their vulnerability to errors may lead to application failure. ZooKeeper, an open source distributed coordination service maintained by Apache, takes care of all such issues. Its wait-free nature and strong ordering ensure synchronization and make it ideal for developing high-performance applications. ZooKeeper is inspired by other services such as Chubby [1] and, at the protocol level, by Paxos [2]. The main goal of this report is to enhance the performance of ZooKeeper while ensuring its normal functioning. The performance of ZooKeeper has been increased without any alterations at the protocol level. The implementation involves modifications to the marshalling and demarshalling mechanisms followed by request processors, and the use of some queue implementations. The resulting performance impacts have been evaluated as well., Electronic Only. "A report submitted in partial fulfillment of the requirements for the degree of Masters of Computer Science"., M.C.S. University of New Brunswick, Faculty of Computer Science, 2014.
Characterizing concurrency of java programs
by Chenwei Wang, With the emergence of multi-core processors, concurrent programs are becoming commonplace, as they can provide better responsiveness and higher performance. Java usually uses shared memory as its concurrent programming model. Moreover, Java provides various concurrency-related features at the language level to support concurrent programming, such as thread creation and synchronization. However, because of the managed runtime environment of the Java Virtual Machine (JVM), it is hard to understand the performance of concurrent programs written in Java. Currently available metrics (the number of spawned threads, execution time, and throughput) are not enough to characterize a Java program's concurrency behavior. More metrics are needed, such as how many threads contribute to the workload significantly and concurrently, and how threads use shared memory. We present a set of metrics to characterize concurrency for Java programs. IBM's J9 JVM has been instrumented to trace concurrency behaviors, such as thread starts/ends and shared object access. Trace files are dumped when the VM shuts down. Post-processing programs are used to process these trace files and generate the metrics. We also characterize concurrency for micro benchmarks with different concurrency patterns, as well as commercial and academic benchmarks. The results show that we can characterize concurrency for Java programs using the metrics we presented.
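Two metrics in this spirit can be sketched from a post-mortem trace; the trace format, the 10% significance threshold, and the numbers are illustrative assumptions rather than the thesis's actual instrumentation:

```python
# Sketch: derive two concurrency metrics from a post-mortem thread trace:
# the peak number of simultaneously live threads, and how many threads
# contribute a significant share of the total work.

def peak_concurrency(trace):
    """trace: list of (start, end, work) tuples, one per thread.
    Ties at a boundary count the ending thread as already gone."""
    events = []
    for start, end, _ in trace:
        events.append((start, 1))
        events.append((end, -1))
    live = peak = 0
    for _, delta in sorted(events):
        live += delta
        peak = max(peak, live)
    return peak

def significant_threads(trace, share=0.10):
    """Threads whose work is at least `share` of the total."""
    total = sum(work for _, _, work in trace)
    return sum(1 for _, _, work in trace if work / total >= share)

trace = [(0, 10, 500), (1, 9, 450), (2, 3, 20), (8, 12, 30)]
print(peak_concurrency(trace), significant_threads(trace))   # 3 2
```

Here four threads were spawned, but only two carry a significant share of the work, which is exactly the distinction raw thread counts miss.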
ClsEqMatcher: an ontology matching approach
by Yassaman Zand-Moghaddam
Cold object identification and segregation via application profiling
by Abhijit S. Taware, Managed runtimes like the Java Virtual Machine (JVM) provide automatic memory management, i.e., garbage collection, to remove the burden of explicitly freeing allocated memory from the programmer. Garbage collectors (GC) periodically walk the heap and free unused memory. Some of the surviving objects are infrequently accessed. However, the JVM has to account for these objects during each GC cycle, which is clearly an unnecessary overhead. Such objects can be categorized as cold and moved to a dedicated memory area. This segregation gives an opportunity to perform the GC on either hot or cold regions. Typically only hot regions are GCed. This leads to having fewer objects to process during each GC cycle and hence exhibits better performance. Furthermore, cold objects can be stored in cheaper and larger-capacity memory, like NVRAM, leaving more main memory for the hot objects of the application. Identification of such objects is a particularly difficult task. As a part of this thesis, application profiling is explored to identify cold classes. Various object properties are evaluated and alternate methods are studied to overcome the limitations of application profiling. The experimental results show a 4% average runtime gain for applications with predictable behavior and object access patterns. Memory-intensive applications like in-memory databases showed better object segregation. A very primitive technique of operating-system-supported memory protection was used to further assist cold object identification. A better alternative to memory protection would offer significant gains.
Cold objects in the java virtual machine
by Baoguo Zhou, Objects in the heap are alive when they are reachable from the root set; otherwise they are dead. Live objects that are not accessed for a specified time are called cold objects. If cold objects are moved to cold regions, then these regions need not be included in garbage collection. Consequently, the pause time of garbage collection could be reduced and Java application throughput could be increased. In this thesis, the proportion of cold objects in real Java applications has been investigated. The results show that some Java applications have as much as 22% cold objects. If these cold objects do not need to be marked and swept during garbage collection, significant pause time could be saved. It has been suggested that all active objects can be identified by periodically walking the stack. This method is called the Stack-based solution. The correctness and efficiency of the Stack-based solution have been evaluated and confirmed with an Access Barrier methodology. The experimental results show that the Stack-based solution is an acceptable way to identify cold objects. Furthermore, some improved approaches to minimize the overhead have been implemented for the Stack-based solution. The improved approaches have shown good results. Experimental results show that Java application throughput has been increased and the pause time of garbage collection has been reduced under the improved approaches.
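The cold/hot segregation step can be sketched as a last-access-time classification; the object records and the threshold are illustrative assumptions, not the JVM's internal representation:

```python
# Sketch: classify heap objects as cold when they have not been accessed
# for a given interval, then segregate them so a (simulated) collector
# only walks the hot set.

def segregate(objects, now, cold_after):
    """objects: {obj_id: last_access_time}. Returns (hot, cold) id sets."""
    hot, cold = set(), set()
    for obj_id, last_access in objects.items():
        (cold if now - last_access >= cold_after else hot).add(obj_id)
    return hot, cold

objects = {"a": 95, "b": 10, "c": 40, "d": 99}   # last-access timestamps
hot, cold = segregate(objects, now=100, cold_after=50)
print(sorted(hot), sorted(cold))   # ['a', 'd'] ['b', 'c']
```

In the thesis itself, the last-access information comes from periodic stack walks (cross-checked with an access barrier) rather than explicit timestamps, but the partitioning decision has this shape.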
Collaborative content distribution in mobile networks with caching and D2D-assistance
by Haoru Xing, While mobile networks are evolving to a content-centric paradigm, surging traffic demands keep stressing the ever-increasing network capacities. In-network caching offers an effective means to alleviate the traffic pressure by saving bandwidth and balancing traffic loads. In this thesis, we first introduced a content distribution framework that integrates universal in-network caching and enables collaboration across domains. This framework takes advantage of appealing design principles in device-to-device (D2D) communications, mobile edge computing (MEC), network function virtualization (NFV), and software-defined networking (SDN). Leveraging this framework, we studied the request screening problem, which aims to appropriately select the video requests offloaded to energy-efficient D2D communications. We redirected content requests to maximize the coverage of requests that can be fulfilled through D2D communications. Given the constraints of individual transmission and caching capacities, the number of available D2D channels, and information privacy with social-awareness, we decoupled the screening problem into two subproblems, i.e., the device caching and matching problem and the D2D channel allocation problem. As we proved that both problems are NP-hard, we proposed social-aware heuristic algorithms that iteratively make the best offloading decision at each step. Simulation results show that the proposed algorithms perform close to optimal solutions in small-scale instances and outperform the reference schemes under various situations. Then, we studied the request routing problem, which selects sources and redirects video streams appropriately to optimize in-network flows. We proposed a context-aware approach for request routing through the integrated edge-core.
The results demonstrate that the proposed solution achieves significant performance gain over the reference schemes in exploiting network dynamics and user context to relieve network congestion.
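A greedy request-screening pass in the spirit of the heuristics described above might look as follows; the scarcest-video-first rule, cache contents, capacities, and channel budget are our own illustrative assumptions, not the thesis's algorithm:

```python
# Sketch: greedily offload video requests to D2D-capable devices, subject
# to per-device serving capacity and a total D2D channel budget. Requests
# whose video is cached on the fewest feasible devices are served first,
# so scarce cache copies are not wasted.

def greedy_screen(requests, caches, capacity, channels):
    """requests: {user: video}; caches: {device: set of cached videos};
    capacity: {device: max requests it can serve}; channels: D2D link budget.
    Returns {user: device} for the offloaded requests."""
    offloaded = {}
    pending = dict(requests)
    while pending and channels > 0:
        candidates = {}
        for user, video in pending.items():
            feasible = [d for d, vids in caches.items()
                        if video in vids and capacity[d] > 0]
            if feasible:
                candidates[user] = feasible
        if not candidates:
            break                                    # nothing left to offload
        user = min(candidates, key=lambda u: len(candidates[u]))
        device = max(candidates[user], key=lambda d: capacity[d])
        offloaded[user] = device
        capacity[device] -= 1
        channels -= 1
        del pending[user]
    return offloaded

requests = {"u1": "v1", "u2": "v2", "u3": "v1"}
caches = {"d1": {"v1"}, "d2": {"v1", "v2"}}
capacity = {"d1": 1, "d2": 2}
offloaded = greedy_screen(requests, caches, capacity, channels=3)
print(len(offloaded))   # 3
```

Each loop iteration corresponds to one "best offloading decision" step; a full solution would also fold in social-awareness and the separate channel allocation subproblem.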
