dc.description.abstract | With the rise of social media, people are increasingly willing to state their positions, post comments, and share others’ posts on these platforms. Social media emphasizes the immediacy of information, so new content is generated in a constant stream. As a result, it is difficult for users to keep up with hot topics and the events that interest them. In response, “Topic Detection and Tracking” (TDT) has become a popular research area for social media. Traditional TDT research has focused mainly on highly structured articles, e.g., news articles. This research takes Facebook as its platform and applies TDT to the short-text posts on public fan pages.
The primary purpose of this research is to help users understand the events within a topic through data visualization, drawing on the five major TDT tasks: story segmentation, first story detection, topic tracking, topic detection, and link detection. The applications and capabilities of the resulting detection and tracking system are then discussed with respect to news and commercial uses. The system procedure is divided into three stages. The first is data collection, which retrieves post information from public fan pages through the Facebook Graph API and maps each post to a topic through keyword mapping. The second is data analysis, which extracts keywords from the posts with Incremental TF-IDF, avoiding the problem of excessive term dimensionality, and then applies the k-medoids document clustering technique together with an algorithm that automatically determines the number of clusters, so that events are distinguished without manual intervention. The third is data visualization, which combines the clustering analysis with data visualization techniques to present the results at a large scale. | en_US |