An empirical study on machine learning for tweet sentiment analysis

dc.contributor.advisor: Zhang, Huajie
dc.contributor.author: Tao, Hao
dc.date.accessioned: 2023-03-01T16:45:48Z
dc.date.available: 2023-03-01T16:45:48Z
dc.date.issued: 2016
dc.date.updated: 2023-03-01T15:03:15Z
dc.description.abstract: Tweet sentiment analysis is an effective and valuable technique in the sentiment analysis domain. Machine learning algorithms, the most widely used approach to tweet sentiment analysis, perform well on sentiment classification, just as they have been applied successfully to many other tasks. In this thesis, we conduct a systematic and thorough empirical study of machine learning algorithms for tweet sentiment analysis, with the aim of providing a guideline for applying them to this task. In our experiments, the Support Vector Machine (SVM) and Random Forest (RF) perform better than Maximum Entropy (MaxEnt), Adaptive Boosting (AdaBoost), and Naive Bayes on tweet sentiment analysis. Among the pre-processing methods, stop-word removal noticeably improves classifier performance, and the combination of bi-grams + SentiWordNet + stop-word removal is the most effective pre-processing combination in our experiments.
dc.description.copyright: © Hao Tao, 2016
dc.format: text/xml
dc.format.extent: ix, 80 pages
dc.format.medium: electronic
dc.identifier.uri: https://unbscholar.lib.unb.ca/handle/1882/14446
dc.language.iso: en_CA
dc.publisher: University of New Brunswick
dc.rights: http://purl.org/coar/access_right/c_abf2
dc.subject.discipline: Computer Science
dc.title: An empirical study on machine learning for tweet sentiment analysis
dc.type: master thesis
thesis.degree.discipline: Computer Science
thesis.degree.fullname: Master of Computer Science
thesis.degree.grantor: University of New Brunswick
thesis.degree.level: masters
thesis.degree.name: M.C.S.

Files

Original bundle
Name: item.pdf
Size: 2.13 MB
Format: Adobe Portable Document Format