An empirical study comparing transfer learning and semi-supervised learning
University of New Brunswick
Transfer learning and semi-supervised learning have attracted considerable attention because traditional machine learning methods yield insufficient performance in many practical applications where labeled data are scarce. In such cases, transferring knowledge from a related domain or extracting information from unlabeled data, if done properly, can significantly improve the classifier while avoiding costly labeling. These two branches of machine learning both use auxiliary data to make up for the shortage of labeled instances. In this study, a set of experiments is conducted on several typical algorithms from both transfer learning and semi-supervised learning to test whether such auxiliary data are in fact beneficial. The empirical results show that auxiliary instances are not always helpful compared to traditional learning methods. However, when only an extremely small number of labeled instances is available, the auxiliary data improve performance significantly. The internal characteristics influencing performance in each branch are also explored in this study.
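The comparison described above can be illustrated with a minimal sketch. The snippet below is not the study's actual experimental setup; it is a hypothetical example using scikit-learn that contrasts a purely supervised baseline, trained on a handful of labeled instances, with self-training, a typical semi-supervised method that also exploits the unlabeled pool. The dataset, label budget of 20, and choice of SVC base learner are all assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.metrics import accuracy_score

# Synthetic binary classification task (stand-in for a real dataset).
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Simulate scarce labels: keep only 20 labeled instances and mark the
# rest as unlabeled (-1 is scikit-learn's unlabeled sentinel).
rng = np.random.RandomState(0)
y_semi = np.full_like(y_train, -1)
labeled_idx = rng.choice(len(y_train), size=20, replace=False)
y_semi[labeled_idx] = y_train[labeled_idx]

# Supervised baseline: trained on the 20 labeled instances only.
supervised = SVC(random_state=0).fit(X_train[labeled_idx], y_train[labeled_idx])

# Semi-supervised: self-training iteratively pseudo-labels confident
# unlabeled instances and retrains, so it sees the whole training pool.
semi = SelfTrainingClassifier(SVC(probability=True, random_state=0))
semi.fit(X_train, y_semi)

acc_sup = accuracy_score(y_test, supervised.predict(X_test))
acc_semi = accuracy_score(y_test, semi.predict(X_test))
print(f"supervised (20 labels): {acc_sup:.3f}")
print(f"self-training:          {acc_semi:.3f}")
```

Consistent with the abstract's finding, self-training is not guaranteed to win: on some splits the pseudo-labels add noise and the supervised baseline does as well or better, while with very few labels the unlabeled pool often helps.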