Abstract (English) |
The popularity of social networks has made them an ideal medium for promoting activities and advertising campaigns, and many people use Facebook pages to announce such events. The purpose of this study is to extract activity events from these posts by constructing two named entity recognition (NER) models, one for activity names and one for locations, using a Web NER model generation tool [1]. We enhance the tool by improving its tokenizer and alignment technique. In addition, we use a large database of Facebook check-in places to improve location name recognition. For entity relation extraction, we apply sequential pattern mining to find rules for coupling start date, end date, and location. We test activity event extraction performance on 1,300 Facebook posts. The experimental results show F1-scores of 0.727 and 0.694 for activity name and location recognition, respectively, and F1-scores of 0.865 and 0.72 for start date and end date extraction. The overall performance of activity event extraction is 0.708. |
References |
[1] Y. Y. Huang and C. H. Chung, "A Tool for Web NER Model Generation Based on Google Snippets", National Central University graduation paper, 2015.
[2] A. Ritter, O. Etzioni, and S. Clark. Open domain event extraction from Twitter. In Proc. SIGKDD, pages 1104–1112, 2012.
[3] Wang, W.: Chinese news event 5w1h semantic elements extraction for event ontology population. In: Proceedings of the 21st International Conference Companion on World Wide Web, pp. 197–202. ACM (2012)
[4] N. Kanhabua, S. Romano, and A. Stewart. Identifying relevant temporal expressions for real-world events. In Proceedings of the SIGIR 2012 Workshop on Time-aware Information Access (TAIA ’12), 2012.
[5] Suthasinee Kuptabut and Ponrudee Netisopakul. Event Extraction using Ontology Directed Semantic Grammar. Journal of Information Science and Engineering 32, 79–96 (2016)
[6] Wallach, H.M. (2004) Conditional Random Fields: An Introduction. University of Pennsylvania CIS Technical Report MS-CIS-04-21.
[7] N. Dalvi, M. Olteanu, M. Raghavan, and P. Bohannon. Deduplicating a places database. In Proceedings of the 23rd International Conference on World Wide Web, pages 409–418. International World Wide Web Conferences Steering Committee, 2014.
[8] Feng, Y., Huang, R., Sun, L.: Two Step Chinese Named Entity Recognition Based on Conditional Random Fields Models. In: Sixth SIGHAN Workshop on CLP, pp. 120–123. ACL Press, Hyderabad.
[9] T.-S. Chen, M.-C. Chen, C.-H. Chang, "基於頁面層級之快速網頁資料擷取與綱要驗證" [Page-level fast web data extraction and schema verification], Conference on Technologies and Applications of Artificial Intelligence, 2014.
[10] Y.-S. Su, Associated Information Extraction for Enabling Entity Search on Electronic Map, National Central University, 2012.
[11] J. Strötgen and M. Gertz. HeidelTime: High quality rule-based extraction and normalization of temporal expressions. In Proceedings of the 5th International Workshop on Semantic Evaluation, 2010.
[12] Yu-Yang Lin, Chia-Hui Chang, "網頁商家名稱擷取與地址配對之研究" [A study on store name extraction from web pages and address matching] (ROCLING 2014), Chung-li, Taiwan, September 91-93, 2014. |