Sentiment analysis: transferring knowledge across domains
University of New Brunswick
This research focuses on text data mining, and more precisely on sentiment analysis algorithms. Sentiment analysis extracts sentiment knowledge from unstructured text documents. The task raises several important challenges, one of which is domain dependency: sentiment classification algorithms trained on one domain do not perform as well on another. The purpose of this thesis is to compare solutions to well-known challenges in sentiment analysis and to propose a solution to the domain dependency issue. To this end, the framework developed in this thesis is intended to transfer knowledge, either from a specific domain or of a general nature, to another specific domain using the Min-cut algorithm. This algorithm can combine the knowledge extracted using machine learning or general methods with the knowledge extracted using domain-dependent methods. The goal is to improve on the accuracy that usual sentiment analysis algorithms obtain on each domain. The proposed approach will be implemented in a general system that takes as input a corpus of unlabeled documents dealing with a specific topic, such as movies, and returns as output a grade of positivity for each document. So that solutions to other challenges, such as feature selection or negation handling, can also be compared, the framework will additionally take as input the method used to deal with each of these challenges.
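To illustrate how a minimum cut can combine two sources of sentiment knowledge, the following is a minimal sketch and not the thesis implementation. The abstract does not describe the graph construction, so the one used here is an assumption: each document is linked to a source node with its general "positive" score and to a sink node with its general "negative" score, and pairs of documents are linked with a domain-specific association weight. The function name `min_cut_labels`, its parameters, and all scores are hypothetical. The sketch also returns only a binary label per document, not the grade of positivity the proposed system is meant to output.

```python
from collections import deque


def min_cut_labels(pos_scores, neg_scores, assoc):
    """Return the set of document indices on the source (positive) side
    of a minimum s-t cut.

    pos_scores / neg_scores: hypothetical per-document scores from a
    general or out-of-domain classifier.
    assoc: dict mapping (i, j) pairs to a hypothetical domain-specific
    association weight between documents i and j.
    """
    n = len(pos_scores)
    s, t = n, n + 1  # source and sink node indices
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = pos_scores[i]  # cut if document i is labeled negative
        cap[i][t] = neg_scores[i]  # cut if document i is labeled positive
    for (i, j), weight in assoc.items():
        cap[i][j] += weight  # cut if i and j receive different labels
        cap[j][i] += weight

    def bfs_parents():
        # Breadth-first search over edges with remaining capacity.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Edmonds-Karp: augment along shortest paths until the sink is unreachable.
    while True:
        parent = bfs_parents()
        if parent[t] == -1:
            break
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u

    # Documents still reachable from the source lie on the positive side.
    return {i for i in range(n) if parent[i] != -1}


# Hypothetical scores: document 1 leans negative on its own (0.4 vs 0.6)
# but is strongly associated with the clearly positive document 0.
labels = min_cut_labels(
    pos_scores=[0.9, 0.4, 0.1],
    neg_scores=[0.1, 0.6, 0.9],
    assoc={(0, 1): 1.0},
)
print(sorted(labels))
```

In this toy input the association edge outweighs document 1's individual scores, so the cut places it with document 0. This is the sense in which domain-dependent information can override a general classifier under the assumed construction.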