Bursty topic detection using acceleration and user influence
University of New Brunswick
In this thesis, we present a system that detects bursty topics in real-time social media data. A bursty topic is one whose online mentions surge suddenly within a very short period. We detect such topics from the acceleration of keywords in the real-time data, where the acceleration of a bursty topic is measured as the increase in occurrences of its keywords over a small period. Along with acceleration, the system scores keywords by user influence, which improves detection by increasing the keyword precision of the detected bursty topics. Bursty topics are formed by taking the top-scoring keywords, ranked by acceleration and influence, and grouping them by the similarity of their term-document vectors. We use a soft frequent pattern mining approach to generate the topics. Each bursty topic is also linked to bursty topics detected in previous time windows by comparing the similarity of their keywords. Bursty topics detected using acceleration are evaluated with and without the user influence score against a baseline topic model, Latent Dirichlet Allocation. User influence improved the precision of the topic detector by 11% on average. The results suggest that user influence can add substantial value to bursty topic detection methods.
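
To make the keyword scoring idea concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes that acceleration is the change in a keyword's per-second mention rate between two consecutive time windows, and that influence enters as a multiplicative boost based on the mean influence of the users who mentioned the keyword. The function names, the post dictionary layout (`keywords`, `influence`), and the way the two signals are combined are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict


def count_keywords(posts):
    """Count how many posts in one time window mention each keyword."""
    counts = Counter()
    for post in posts:
        counts.update(set(post["keywords"]))  # one vote per post
    return counts


def acceleration(prev_counts, curr_counts, window_seconds):
    """Assumed definition: change in per-second mention rate per second."""
    keywords = set(prev_counts) | set(curr_counts)
    return {
        kw: (curr_counts.get(kw, 0) - prev_counts.get(kw, 0)) / window_seconds**2
        for kw in keywords
    }


def mean_influence(posts):
    """Mean influence (hypothetical score in [0, 1]) of posts per keyword."""
    totals = defaultdict(float)
    mentions = Counter()
    for post in posts:
        for kw in set(post["keywords"]):
            totals[kw] += post["influence"]
            mentions[kw] += 1
    return {kw: totals[kw] / mentions[kw] for kw in totals}


def score_keywords(prev_posts, curr_posts, window_seconds, use_influence=True):
    """Rank keywords whose mention rate is rising, optionally boosted by influence."""
    acc = acceleration(
        count_keywords(prev_posts), count_keywords(curr_posts), window_seconds
    )
    influence = mean_influence(curr_posts) if use_influence else {}
    scores = {}
    for kw, a in acc.items():
        if a <= 0:
            continue  # only accelerating keywords are burst candidates
        # Assumed combination: acceleration scaled by (1 + mean influence).
        scores[kw] = a * (1.0 + influence.get(kw, 0.0))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    previous = [
        {"keywords": ["storm"], "influence": 0.1},
        {"keywords": ["election"], "influence": 0.4},
    ]
    current = [
        {"keywords": ["storm", "flood"], "influence": 0.5},
        {"keywords": ["storm"], "influence": 0.9},
        {"keywords": ["flood"], "influence": 0.1},
    ]
    for keyword, score in score_keywords(previous, current, window_seconds=60):
        print(keyword, score)
```

Calling `score_keywords` with `use_influence=False` gives the acceleration-only ranking, which mirrors the with and without influence comparison described in the abstract. The top-scoring keywords from either ranking would then be passed to the topic formation step.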