A comparison of machine learning algorithms for zero-shot cross-lingual phishing detection

Staples, Dakota

Files

Dakota Staples - Thesis.pdf (751.37 KB)

Date

2023-08

Authors

Staples, Dakota

Publisher

University of New Brunswick

Abstract

Phishing is a major problem worldwide. Existing studies have focused mainly on detecting phishing emails in a single language, most often English. Detecting them across multiple languages is challenging because datasets are scarce: without ample data to learn from, models cannot accurately distinguish a benign email from a spam email, which results in false positives and false negatives. This research compares the performance of numerous machine learning models and transformers using zero-shot learning for multilingual phishing detection. In a zero-shot learning set-up, the model is trained on one language and tested on another. English, French, and Russian emails are used as the training and testing languages. My results show that, on average, XLM-Roberta performs best among the tested models in terms of accuracy, scoring 99% when tested on English, 99% when tested on French, and 95% when tested on Russian.
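The zero-shot set-up described in the abstract (train on one language, test on a different one) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the helper names (`zero_shot_pairs`, `evaluate_zero_shot`, `fit_token_model`), the toy emails in `TOY_DATA`, and the token-overlap classifier, which is only a stand-in for the machine learning models and transformers the thesis actually compares.

```python
from itertools import permutations
from typing import Callable, Dict, List, Tuple

Email = Tuple[str, int]               # (text, label): 1 = phishing, 0 = benign
Predictor = Callable[[str], int]


def zero_shot_pairs(languages: List[str]) -> List[Tuple[str, str]]:
    """Return every ordered (train, test) pair of two different languages."""
    return list(permutations(languages, 2))


def accuracy(predict: Predictor, test_set: List[Email]) -> float:
    """Fraction of emails in test_set whose predicted label matches the true label."""
    correct = sum(1 for text, label in test_set if predict(text) == label)
    return correct / len(test_set)


def evaluate_zero_shot(
    data: Dict[str, List[Email]],
    fit: Callable[[List[Email]], Predictor],
) -> Dict[Tuple[str, str], float]:
    """Train on one language, test on another, for every language pair."""
    results = {}
    for train_lang, test_lang in zero_shot_pairs(sorted(data)):
        predict = fit(data[train_lang])          # sees the training language only
        results[(train_lang, test_lang)] = accuracy(predict, data[test_lang])
    return results


def fit_token_model(train_set: List[Email]) -> Predictor:
    """Hypothetical stand-in model: flag tokens seen only in phishing emails."""
    phishing_tokens, benign_tokens = set(), set()
    for text, label in train_set:
        target = phishing_tokens if label == 1 else benign_tokens
        target.update(text.lower().split())
    suspicious = phishing_tokens - benign_tokens
    return lambda text: int(any(tok in suspicious for tok in text.lower().split()))


# Invented toy emails, for illustration only.
TOY_DATA: Dict[str, List[Email]] = {
    "en": [("verify your account at http://example.test", 1),
           ("lunch at noon tomorrow", 0)],
    "fr": [("vérifiez votre compte sur http://example.test", 1),
           ("déjeuner demain à midi", 0)],
}

if __name__ == "__main__":
    for pair, score in evaluate_zero_shot(TOY_DATA, fit_token_model).items():
        print(pair, score)
```

The point of the sketch is the evaluation loop: the model never sees labelled examples in the test language, so any score it achieves there comes from features that transfer across languages. Any classifier with the same `fit` interface could be substituted for the toy model.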

URI

https://unbscholar.lib.unb.ca/handle/1882/37716

Collections

Open Theses & Dissertations
