Faculty of Computer Science (Fredericton)
-
-
A Phishing e-mail detection approach using machine learning techniques
-
by Kenneth Fon Mbah, According to APWG reports from 2014 and 2015, the number of unique Phishing e-mail reports received from consumers rose from 68,270 in October 2014 to 106,421 in September 2015. This significant increase is evidence of the prevalence of Phishing attacks and of the damage they have caused to Internet users. Because the literature pays no attention to specifically detecting Phishing e-mails related to advertisements and pornography, attackers increasingly use these lures to track users and adjust their attacks based on user behaviour and on hot topics extracted from community news and journals. We focus on detecting deceptive e-mail, a form of Phishing attack, by proposing a novel framework that accurately identifies not only Phishing e-mails but also advertisement and pornographic e-mails, which are considered attractive ways to launch Phishing. Our approach, known as the Phishing Alerting System (PHAS), can detect and raise alerts for all types of deceptive e-mail to help users make decisions. Using a well-known e-mail dataset and our extracted features, we obtain about 93.11% accuracy with machine learning techniques such as the J48 Decision Tree and KNN. Furthermore, we evaluate a system built on these features and obtain approximately the same accuracy when the same dataset is used as its input.
-
-
A behavioral based detection approach for business email compromises
-
by Nasim Maleki, The most recent infection vector in email attacks is Business Email Compromise (BEC), which is an entry point for attackers to gain access to an enterprise network and obtain valuable company data. According to the Symantec Internet Security Threat Report (ISTR), around 7,710 organizations are hit by a Business Email Compromise attack every month. BEC is a type of phishing attack in which criminals impersonate a person of authority in an organization (e.g., the CEO) through spoofing or account takeover. Since spoofing techniques are detectable using SPF, DMARC, and DKIM, we propose and implement a behavioral-based framework for detecting BEC when accounts or machines are compromised. This framework stops malicious emails on the sender side, because the receiver side typically holds too few of the sender's emails to build a representative user profile. Moreover, a compromised account or machine becomes a devastating weapon targeting many people. Hence it ought to be stopped on the sender side, and the real owner should be notified of the compromise. In our experiments on the Enron dataset, the framework reached an average of 92% accuracy and a 93% F1 score across all users., Electronic Only.
-
-
A blockchain-based privacy-preserving medical insurance storage system
-
by Son Luong, Blockchain technology is an innovative invention that is disrupting many industries, including business and healthcare. In this thesis, we propose a blockchain-based privacy-preserving medical insurance storage system. This system takes advantage of the decentralization and immutability properties of blockchain technology, and makes use of a (2,3)-threshold secret sharing scheme to achieve the privacy-preservation property. The experimental setup of the system consists of a public blockchain, a patient, four hospitals (an owner hospital and three helper hospitals), and an insurance company. For a given patient, any hospital can become the owner hospital while the other three become helpers. The owner hospital holds the patient's spending records and publishes that data to the blockchain. The helper hospitals help the insurance company to query the patient's spending data on the blockchain and perform homomorphic computations on the results. However, the helpers cannot learn anything about the patient's spending data, as long as there is no collusion between helpers. We deploy our system on the Ethereum blockchain and give a final performance evaluation.
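As a rough illustration of the (2,3)-threshold idea mentioned above, the following minimal sketch assumes Shamir's secret sharing over a prime field; the abstract does not say which scheme the thesis uses, and the names `PRIME`, `split`, `reconstruct`, and `add_shares` are hypothetical. It shows that any two of three shares recover a value, and that adding shares point-wise yields shares of the sum, which is one simple form of homomorphic computation.

```python
import secrets

# Prime modulus for the finite field. An illustrative choice, not taken from the thesis.
PRIME = 2**61 - 1


def split(secret, prime=PRIME):
    """Split `secret` into three shares of f(x) = secret + slope * x; any two reconstruct it."""
    slope = secrets.randbelow(prime)  # uniformly random degree-1 coefficient
    return [(x, (secret + slope * x) % prime) for x in (1, 2, 3)]


def reconstruct(share_a, share_b, prime=PRIME):
    """Recover f(0) from two distinct shares by Lagrange interpolation."""
    (xa, ya), (xb, yb) = share_a, share_b
    # ya*xb - yb*xa equals secret * (xb - xa), so divide by (xb - xa) in the field.
    inverse = pow((xb - xa) % prime, -1, prime)  # modular inverse, Python 3.8+
    return (ya * xb - yb * xa) * inverse % prime


def add_shares(shares_p, shares_q, prime=PRIME):
    """Add two sharings point-wise; the result is a sharing of the sum of the secrets."""
    return [(x, (yp + yq) % prime) for (x, yp), (_, yq) in zip(shares_p, shares_q)]
```

For example, splitting two spending amounts, adding the shares held at each position, and reconstructing from any two of the summed shares gives the total without any single share holder seeing either amount. A single share on its own reveals nothing about the secret, which matches the abstract's no-collusion condition.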
-
-
A detection framework for android financial malware
-
by Andi Fitriah Abdul Kadir, As attempts to thwart cybercrime have intensified, so have innovations in how cybercriminals provision their infrastructure to sustain their activities. Consequently, what motivates cybercriminals today has changed, from ego and status to interest and money, and cybersecurity researchers have turned their attention to financial malware, especially on the Android platform. Researchers face some major issues in detecting Android financial malware. Firstly, what constitutes Android financial malware is still ambiguous. There is a disparity in labelling malware types: most current detection systems emphasize recognizing generic malware types (e.g., Trojan, Worm) rather than indicating capabilities (e.g., banking malware, ransomware). Without knowing what constitutes financial malware, detection systems cannot accurately recognize advanced and sophisticated financial malware. Secondly, most current machine-learning-based anomaly detection systems suffer from inaccurate evaluation and comparison due to the lack of adequate datasets, which results in unreliable outputs for real-world deployment. Because dataset creation is time-consuming, most of the available datasets are crafted mainly for static analysis, and those created for dynamic analysis are installed on an emulator or sandbox. Sophisticated malware can bypass these approaches; malware authors have employed obfuscation methods and a wide range of anti-emulator techniques, with which malware programs hide their malicious activities by detecting the emulator. These deficiencies are some of the major reasons why Android financial malware is able to avoid detection.
A comprehensive understanding of the existing Android financial malware attacks supported by a unified terminology and high-quality dataset is required for the deployment of reliable defence mechanisms against these attacks. Therefore, we seek to understand trends and relationships between Android malware families and devise a taxonomy of Android financial malware attacks. In addition, a systematic approach to generate the required datasets is presented to address the need to use physical platforms instead of emulators. In this regard, an automated dynamic analysis system running on smartphones is developed to generate the desired dataset in a testbed environment. In order to correlate the generated dataset and the proposed taxonomy, a hybrid framework for malware detection is presented. We propose a novel combination of both static and dynamic analysis based specifically on features derived from the string literal (statically via reverse engineering) and network flow (dynamically on smartphones). This combination can assist security analysts in recognizing the threats effectively. We employ five common classifiers to construct the best model to identify malware at four levels: detecting malicious Android apps, classifying Android apps with respect to malware category and sub-category, and characterizing Android apps according to malware family. Specifically, a dataset containing over 5,000 samples is used to evaluate the performance of the proposed method. The experimental results show that the proposed method with a Random Forest classifier achieves an accuracy of over 90% with a very low false positive rate of 4% on average.
-
-
A dynamic graph-based malware classifier
-
by Hossein Hadian Jazi, The anti-virus industry receives a sheer volume of new malware samples on a daily basis. The prevalence of new sophisticated instances, for most of which no signature is available, coupled with the significant growth of potentially harmful programs, has made the adoption of an effective automated classifier almost inevitable.
Given the wide variety of obfuscation techniques employed by malware authors, extracting a high-level representation of malware structure is an efficient approach. High-level graph representations such as Function Call Graphs or Control Flow Graphs can represent the main functionality of a given sample in a more abstract way. Graph-based approaches have mostly revolved around static analysis of the binary and share the common drawbacks of static approaches. For example, a graph generated from a packed executable does not reflect the real structure of the code at all.
In addition to the type of analysis, the scalability of these approaches is affected by the graph comparison algorithm employed. Full graph comparison is by itself an NP-hard problem. Approximate graph comparison algorithms such as Graph Edit Distance have been commonly studied in the field of graph classification.
To address these two major weaknesses of current graph-based approaches, we propose a dynamic and scalable graph-based malware classifier. At the time of this proposal, this is the first attempt to generate and classify dynamic graphs. Although dynamic analysis provides more accurate graphs, it also generates larger graphs, which aggravates the comparison problem. To address this problem, we modify an existing algorithm, Simulated Annealing, to reduce computational complexity. To obtain a reasonable estimate of its effectiveness, our proposed system is compared against Classy, the state-of-the-art graph-based system. Our results show that the proposed classifier outperforms Classy, achieving an average classification accuracy of 94% and a 4% false positive rate while leaving only 2% of samples unlabeled.
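To give a feel for how simulated annealing can approximate a graph comparison, here is a minimal generic sketch. It is not the modified algorithm from the thesis, whose details the abstract does not give. The functions `mismatch` and `anneal`, the edge-mismatch cost, and the cooling parameters are all assumptions chosen for illustration; both graphs are assumed to have the same number of nodes `n` (at least 2), labelled `0..n-1`, with directed edges stored as sets of pairs.

```python
import math
import random


def mismatch(edges_a, edges_b, mapping):
    """Count edges that differ between B and A after relabelling A's nodes through `mapping`."""
    mapped = {(mapping[u], mapping[v]) for u, v in edges_a}
    return len(mapped ^ edges_b)  # symmetric difference: unmatched edges on either side


def anneal(edges_a, edges_b, n, steps=5000, t0=2.0, cooling=0.999, seed=0):
    """Search node mappings by simulated annealing; return the lowest mismatch found."""
    rng = random.Random(seed)
    mapping = list(range(n))
    rng.shuffle(mapping)
    cost = mismatch(edges_a, edges_b, mapping)
    best = cost
    temp = t0
    for _ in range(steps):
        # Propose a neighbouring mapping by swapping the images of two nodes.
        i, j = rng.sample(range(n), 2)
        mapping[i], mapping[j] = mapping[j], mapping[i]
        new_cost = mismatch(edges_a, edges_b, mapping)
        # Always accept improvements; accept worse moves with a temperature-dependent probability.
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp):
            cost = new_cost
            best = min(best, cost)
        else:
            mapping[i], mapping[j] = mapping[j], mapping[i]  # undo the swap
        temp *= cooling  # geometric cooling schedule
    return best
```

The returned value is an upper bound on the true minimum mismatch, since the search examines only some of the n! possible mappings. That trade of exactness for tractability is the general reason approximate methods are used for large graphs.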
-
-
A fault tolerant data structure for Peer-to-Peer range query processing
-
by Zahra Mirikharaji, We present a fault tolerant dynamic data structure based on a constant-degree Distributed Hash Table called FissionE that supports orthogonal range search in d-dimensional space. A publication algorithm, which distributes data objects among all nodes in the network, is described, along with a search algorithm that processes range queries and reports all objects in range to the query issuer. The worst case orthogonal range search cost in our data structure with n nodes is O(log n + m) messages plus reporting cost, where m is the minimum number of nodes intersecting the query. We have proved that in our data structure the cost of reporting data in range to the query issuer is $\sum_{i=1}^{m} \lceil K_i/B \rceil \cdot O(\log n) \in O((K/B + m) \log n)$ messages, where K is the number of points in range, $K_i$ is the number of points in range stored in node i, and B is the number of points fitting in one message. Storing d copies of each data object on d different nodes provides redundancy for our scheme. This redundancy permits completely answering a query in the case of simultaneous failure of d − 1 nodes. Results of our experimental simulation with up to 12,288 nodes show the practical application of our data structure.
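The reporting-cost bound above follows from the fact that each ceiling adds less than one message per node. The short sketch below works through the arithmetic; the function names and the sample numbers are hypothetical, and the O(log n) routing factor per message is left out.

```python
import math


def reporting_messages(points_per_node, points_per_message):
    """Sum of ceil(K_i / B) over the nodes holding points in range."""
    return sum(math.ceil(k / points_per_message) for k in points_per_node)


def reporting_upper_bound(points_per_node, points_per_message):
    """K/B + m, which bounds the sum above because ceil(x) < x + 1 for every term."""
    total_points = sum(points_per_node)  # K
    node_count = len(points_per_node)    # m
    return total_points / points_per_message + node_count
```

With three nodes holding 10, 3, and 25 points in range and B = 8 points per message, the nodes send 2 + 1 + 4 = 7 result messages, and the bound evaluates to 38/8 + 3 = 7.75.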
-
-
A framework for developing adaptive service compositions
-
by Mahdi Bashari, This thesis proposes a framework for automatic generation of self-healing service composition which can recover from functional and non-functional failures. To this end, it firstly proposes an automated method for generation of service composition which enables a user to build a service composition by selecting a set of desired features and secondly it proposes a method for adapting the generated service composition to recover autonomously from service failures or non-functional constraint violations.
The proposed service composition method uses software product line engineering concepts to build a repository of features and link them to their corresponding services. Using this repository, it uses AI planning to build a work flow of service interactions based on the requirements. It then uses concepts from partial-order-planning to optimize the generated work flow. Eventually, the generated work flow is converted to structured and executable BPEL code.
The proposed adaptation method extends the composition software product line to become a dynamic software product line. The proposed dynamic software product line is capable of re-selecting features of a running service composition to continue service with limited features to recover from a service failure or a violation of critical non-functional requirements. A method has been proposed which uses linear regression to determine the effect of features on the non-functional properties of service composition. Knowing how each feature affects non-functional requirements, a method has been proposed which reduces the problem of finding an alternate set of features which recovers service composition from service failure or non-functional requirement violation to a pseudo-boolean optimization problem, which can then be solved.
An online tool-suite realizing the proposed framework has been implemented and the usability, effectiveness, and reliability of the proposed framework have been evaluated with extensive experiments.
-
-
A framework for migration of conventional client-server software systems to cloud
-
by Jianbo Zheng, As an emerging model for the delivery of software services, Software as a Service (SaaS) has become a trend in the software industry due to its low investment, flexibility and accessibility. However, migrating conventional client-server software systems and applications to SaaS may involve complicated processes. This thesis proposes a framework named A2SF that helps software developers migrate conventional client-server applications to high-quality SaaS-based applications in cloud environments, with multi-tenancy support, without re-developing or modifying the original applications. The migration framework consists of four components: service proxy, data proxy, tenant management, and cloud resources management. The four framework components, together with an original client-server application, can be seamlessly deployed on the cloud as a SaaS application. A prototype of A2SF has been implemented on the Amazon AWS cloud platform. Based on A2SF, the thesis also describes a general cloud migration process for client-server applications and presents a case study of migrating a real-world client-server application to the Amazon AWS cloud., Electronic Only.
(UNB thesis number) Thesis 9397.
(OCoLC) 961830460., M.C.S. University of New Brunswick, Faculty of Computer Science, 2014.
-
-
A framework to process and exchange logical rules in multiple rule languages
-
by Ismail Akbari, Web rule languages have been developed for the Web-based interchange of rules, in particular business rules, business policies, and any business or application logic that can be presented with rules. The primary goal of this dissertation is to create methods for rule interchange between selected rule markup languages, as well as to develop a rule engine for a subset of W3C’s RIF language. Logical rule interchange is the act of transforming rules presented in one rule language to another. The rule interchange is done from the Notation3, POSL, RuleML, SWRL rule languages to RIF-BLD. In this dissertation, to enable rule interchange in the Semantic Web, a framework has been proposed. The framework contains different rule grammars, parsers, visualizers and translators as well as a rule engine. A grammar, parser and rule visualizer are developed for each of the N3, POSL and RIF-BLD languages. Also, rule translators from N3, POSL, SWRL, and RuleML to RIF-BLD are developed.
As a central component, a rule engine for the RIF-BLD language has been developed. The rule engine comprises both forward and backward reasoning. All the grammars, parsers, visualizers, translators and the rule engine are part of the framework. The translators and the rule engine have been separately evaluated with various use cases and with a case study on the independently provided Port Clearance Rules.
-
-
A genetic-algorithm-based solution for HTTP-based malware signature generation
-
by Amir Pourafshar, The rising prevalence of malware has become the most serious threat to Internet security. To minimize the devastating impact of this threat, many malware detection strategies and systems have been developed in recent years. This thesis presents a novel malware signature generation and evolution system to detect never-before-seen malware. We focus on the automatic generation of evolved signatures for HTTP-based malware traces based on the features and structure of currently known malware. The idea is that we can evolve signatures of known malware to predict the structure of future malware traces, since they usually inherit some of their characteristics and structure from their predecessors.
We implemented a proof-of-concept version of our proposed evolutionary signature generation system. Datasets of malicious and legitimate network traffic have been used to evaluate the proposed system. Experimental results show that the system can detect an acceptable portion of new, unknown malware samples while maintaining a low false alarm rate. Using the base and evolved signatures together increased the average detection rate of unknown malicious traces from 38.4% to 50.8%. This improvement comes while the average false positive rate of the evolved signature sets is 2.7 × 10⁻³., Electronic Only.
(UNB thesis number) Thesis 9385.
(OCoLC) 961805552., M.C.S., University of New Brunswick, Faculty of Computer Science, 2014.
-
-
A meta-learning approach for evaluating the effect of software development policies
-
by James Ashley Stewart, Delivering high-quality software on time and on budget is a challenging endeavor but it can be made more likely by adhering to an approach where guidance is provided through the use of software development policies. Software development policies represent standards and best practices that a company has chosen to follow throughout their software development effort. For our purposes, a software policy is a statement of conduct intended to guide and constrain development activities. Policies can be written to capture company guidelines, industry best practices, empirical research, and even past experience. A simple example of a policy might read “a preliminary design must be completed before implementation begins.” Policies help to ensure the existence of certain environmental conditions that are conducive to a successful outcome. Depending on the situation, however, the policies in use may not have the expected effect. Currently, there does not exist a formal way to evaluate a company’s policy set without resorting to extensive experimentation or case study on each policy. We propose a method that monitors weekly success indicators on project aspects such as quality, time, budget, and morale. The policies in use are then evaluated against these indicators resulting in a summary of those policies thought to impact process performance. Due to the many complexities of this problem (e.g. policy interactions, delayed effects of changes, etc.), our method consists of a combination of several different analysis techniques that are combined to yield a more complete solution. Our set of analysis methods currently includes: a form of linear regression adapted for greater sensitivity; a check that extreme values coincide; a trend analysis that detects whether data generally deviates in the same (or opposite) direction; and a special check adapted specifically for discrete measures. 
The results from each method are then combined using a meta-learner that compares the similarity of the ranked results produced by each individual technique, and provides a single indicator of how strongly they agree. To ensure our method works and is practical, we validated it against industry data from a leading Canadian business-solutions provider. Despite the many challenges inherent with real-world data (e.g., missing, inconsistent, incorrect, biased, sparse, and limited data), our validation work indicates that our method can identify more potential effects than other traditional approaches, especially the more subtle weaker effects, which can serve as a trigger for further investigation. These results shall be of special interest to project managers in their efforts to deliver on successful projects., Ph.D. University of New Brunswick, Faculty of Computer Science, 2017.