An empirical study on machine learning for tweet sentiment analysis
University of New Brunswick
Tweet sentiment analysis is an effective and valuable technique in the sentiment analysis domain. Machine learning algorithms, the most widely used approach to tweet sentiment analysis, perform well on sentiment classification, as they do in many other applications. In this thesis, we conduct a systematic empirical study of machine learning algorithms for tweet sentiment analysis, with the aim of providing guidelines for applying them to this task. In our experiments, the Support Vector Machine (SVM) and Random Forest (RF) outperform Maximum Entropy (MaxEnt), Adaptive Boosting (AdaBoost), and Naive Bayes on tweet sentiment analysis. Among the pre-processing methods, stop word removal clearly improves classifier performance, and the combination of bi-grams, SentiWordNet, and stop word removal is the most effective pre-processing combination in our experiments.
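
To make the pre-processing combination named above more concrete, the following is a minimal Python sketch of two of its three parts: stop word removal followed by unigram and bi-gram feature extraction. It is an illustration under stated assumptions, not the thesis's implementation. The stop word list, the tokenization rule, and the function names (`tokenize`, `remove_stop_words`, `bigrams`, `extract_features`) are all hypothetical, and the SentiWordNet scoring step and the classifier itself are left out.

```python
import re

# Tiny illustrative stop word list. This is an assumption: the abstract
# does not say which stop word list the experiments used.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "to", "of", "and", "i"}


def tokenize(tweet):
    """Lowercase the tweet and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", tweet.lower())


def remove_stop_words(tokens):
    """Drop every token that appears in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def bigrams(tokens):
    """Pair each token with the token that follows it."""
    return list(zip(tokens, tokens[1:]))


def extract_features(tweet):
    """Return unigram features followed by bi-gram features,
    both computed after stop word removal."""
    tokens = remove_stop_words(tokenize(tweet))
    return tokens + [" ".join(pair) for pair in bigrams(tokens)]


if __name__ == "__main__":
    print(extract_features("This is the best phone"))
```

In this sketch the bi-grams are built after stop words are dropped, so words separated only by stop words become adjacent. Whether the thesis applies the two steps in that order is not stated in the abstract. The resulting feature lists could then be vectorized and passed to any of the classifiers compared in the study.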