Abstract (English) |
Dialogue systems have become an active research topic in recent years, and many companies have a demand for them. Dialogue systems can be divided into two categories according to their purpose. The first is the task-oriented dialogue system, such as a customer service agent that answers customer questions in a specific domain, or the personal assistant Siri, which integrates information (phone address book, weather, calendar, time, and other related information) and answers queries about it. The second is the non-task-oriented dialogue system, such as the chatbot Alice, whose main purpose is companionship through simple chat. Our research focuses on the latter, where the goal is to respond to what the user says. The user's sentence may be a question, a complaint, a sigh, a statement of fact, and so on. How a chatbot should respond is the key question of this paper.
Short text conversation (STC) systems can be divided into two categories: retrieval-based and generation-based. The former depends on the quality of its database; the latter requires a grammar-checking module. In this paper we aim at the problem addressed by generation-based STC but build on a retrieval-based approach, using the web as the database: candidate sentences are retrieved from Google abstracts (search-result summaries), so we do not need to collect a large text database in advance. The method consists of three steps: (1) query keyword generation; (2) preprocessing of punctuation and candidate sentences; and (3) ranking the sentences with SVMrank.
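The three steps can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation used in this work: the stopword list, the two features, the hand-set weights, and all function names are hypothetical, the retrieved abstracts are passed in as plain strings rather than fetched from the web, and a fixed linear scoring function stands in for a trained SVMrank model.

```python
import re

# Hypothetical stopword list; the actual keyword generation method is not given here.
STOPWORDS = {"the", "a", "an", "is", "are", "i", "to", "of", "and", "so", "my", "it", "today"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def generate_query_keywords(post, max_keywords=3):
    """Step 1 (assumed): keep the first few distinct non-stopword tokens."""
    keywords = []
    for tok in tokenize(post):
        if tok not in STOPWORDS and tok not in keywords:
            keywords.append(tok)
    return keywords[:max_keywords]


def preprocess_candidates(abstracts, min_words=3):
    """Step 2 (assumed): split at sentence punctuation, drop fragments and duplicates."""
    seen, sentences = set(), []
    for abstract in abstracts:
        for part in re.split(r"[.!?。！？…]+", abstract):
            sentence = " ".join(part.split())
            key = sentence.lower()
            if len(sentence.split()) >= min_words and key not in seen:
                seen.add(key)
                sentences.append(sentence)
    return sentences


def features(keywords, sentence):
    """Two illustrative features: keyword overlap ratio and a brevity term."""
    words = tokenize(sentence)
    overlap = sum(1 for k in keywords if k in words) / max(len(keywords), 1)
    brevity = 1.0 / max(len(words), 1)
    return [overlap, brevity]


def rank_candidates(keywords, sentences, weights=(1.0, 0.5)):
    """Step 3 stand-in: sort by a linear score w . x (weights here are hand-set, not learned)."""
    def score(sentence):
        return sum(w * x for w, x in zip(weights, features(keywords, sentence)))
    return sorted(sentences, key=score, reverse=True)
```

A linear ranking model of the kind SVMrank learns also orders candidates by a weighted sum of features, so in a real system the hand-set `weights` would be replaced by weights learned from labeled post-response pairs.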
We evaluate the system with the NTCIR STC2 response evaluation criteria. Three non-expert annotators rated the responses to 100 posts, and the average score was 0.713. |