K-spectrum Support Vector Machine classifier for spam filtering
dc.contributor.advisor | Mahanti, Prabhat | |
dc.contributor.advisor | Kim, Dongmin | |
dc.contributor.author | Yang, Ming | |
dc.date.accessioned | 2023-03-01T16:39:44Z | |
dc.date.available | 2023-03-01T16:39:44Z | |
dc.date.issued | 2013 | |
dc.date.updated | 2016-12-14T00:00:00Z | |
dc.description.abstract | Traditional machine learning approaches to spam filtering, including the Support Vector Machine (SVM), use the bag-of-words text representation for their features. However, this technique does not take word order into account and is not suitable for languages that do not use white space as a word delimiter. It is therefore appealing to treat every email as a string of symbols by using a string-based approach. In this report, we implement a contiguous string-based approach, called the k-spectrum kernel, for use with an SVM in a discriminative approach to the spam classification problem. When using the k-spectrum SVM spam classifier, email texts are implicitly mapped into a high-dimensional feature space. The classifier produces a decision boundary in this feature space, and emails are classified based on whether they map to the positive (spam) or negative (non-spam) side of the boundary. Our experimental results demonstrate that the k-spectrum SVM spam classifier can offer an effective and accurate alternative to commonly used spam-filtering approaches, such as Naive Bayesian classifiers and SVM classifiers based on Bag-of-Words (BOW). | |
dc.description.copyright | © Ming Yang, 2013 | |
dc.description.note | A Report Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Computer Science. Electronic Only. (UNB thesis number) Thesis 9207. (OCoLC) 960951414 | |
dc.format | text/xml | |
dc.format.extent | ix, 61 pages | |
dc.format.medium | electronic | |
dc.identifier.oclc | (OCoLC) 960951414 | |
dc.identifier.other | Thesis 9207 | |
dc.identifier.uri | https://unbscholar.lib.unb.ca/handle/1882/14288 | |
dc.language.iso | en_CA | |
dc.publisher | University of New Brunswick | |
dc.rights | http://purl.org/coar/access_right/c_abf2 | |
dc.subject.discipline | Computer Science | |
dc.subject.lcsh | Support vector machines | |
dc.subject.lcsh | Supervised learning (Machine learning) | |
dc.subject.lcsh | Kernel functions | |
dc.subject.lcsh | Spam filtering (Electronic mail) | |
dc.title | K-spectrum Support Vector Machine classifier for spam filtering | |
dc.type | master thesis | |
thesis.degree.discipline | Computer Science | |
thesis.degree.fullname | Master of Computer Science | |
thesis.degree.grantor | University of New Brunswick | |
thesis.degree.level | masters | |
thesis.degree.name | M.C.S. |
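
The k-spectrum kernel named in the abstract can be illustrated with a short sketch. This is not the report's implementation; it only assumes the usual definition of the kernel as the inner product of the counts of every contiguous length-k substring. The function names, the example strings, and the weights and bias passed to `decision_value` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def spectrum_features(text, k):
    """Count every contiguous substring of length k in text."""
    return Counter(text[i:i + k] for i in range(len(text) - k + 1))


def spectrum_kernel(a, b, k):
    """Inner product of the k-spectrum count vectors of a and b."""
    fa, fb = spectrum_features(a, k), spectrum_features(b, k)
    # Iterate over the smaller map; a missing key in a Counter counts as 0.
    if len(fa) > len(fb):
        fa, fb = fb, fa
    return sum(count * fb[sub] for sub, count in fa.items())


def decision_value(email, support_emails, weights, bias, k):
    """Hypothetical SVM decision function: sum_i w_i * K(x_i, email) + bias.

    The weights and bias would come from SVM training, which is not
    shown here. A positive value maps to the spam side of the boundary,
    a negative value to the non-spam side.
    """
    return sum(
        w * spectrum_kernel(s, email, k)
        for s, w in zip(support_emails, weights)
    ) + bias


if __name__ == "__main__":
    # Made-up support strings and weights, for illustration only.
    supports = ["free money", "meeting notes"]
    weights = [1.0, -1.0]
    print(spectrum_kernel("abab", "bab", 2))
    print(decision_value("free cash", supports, weights, 0.0, 3))
```

Because the kernel works on raw character substrings, the sketch needs no tokenizer, which is the property the abstract cites for languages without white-space word delimiters. The feature space is never built explicitly; only the substrings that actually occur in the two emails are counted.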