How to Ask and Answer a Robot About a Story Book

Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89834

Title:	How to Ask and Answer a Robot About a Story Book
Authors:	高愷言;Kao, Kai-Yen
Contributors:	Department of Computer Science and Information Engineering
Keywords:	Question-Answer Pairs Generation;Question Answering;Information Extraction
Date:	2022-07-28
Issue Date:	2022-10-04 12:01:31 (UTC+8)
Publisher:	National Central University
Abstract:	For educators, generating high-quality, fluent question-answer pairs from a story text is time-consuming and labor-intensive. The goal is not to stump students but to use the important information in the text as answers and to generate the corresponding questions. This thesis first generates question-answer pairs with a pre-trained generative model, then extracts information from the text and generates further question-answer pairs from templates. Conversational question answering likewise uses a pre-trained model, fine-tuned on the target domain, to generate appropriate responses. The method has two parts. The first part is generative question-answer pair generation. Using an answer-aware approach, we first parse noun phrases and verb-related phrases out of the text, then add the answer type to the input when fine-tuning a BART model. A DistilBERT model then ranks the generated question-answer pairs. Including the answer type improves the quality of the generated pairs and increases their number. We also analyze a human evaluation of the generated question-answer pairs and score the pairs with ROUGE-L, a metric used in question answering. ROUGE-L correlates with the human evaluation more strongly than the ranking score does, so it can serve as a filter for question-answer pairs.
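As a rough illustration of the ROUGE-L filtering idea described above, the following Python sketch computes a ROUGE-L F1 score from the longest common subsequence of two token sequences and keeps only the pairs that score above a threshold. The abstract does not say which two texts are compared, so the sketch assumes each generated answer is compared against some reference answer. The whitespace tokenization, the function names, and the 0.5 threshold are all illustrative assumptions, not details taken from the thesis.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 over lowercased, whitespace-split tokens (an assumed tokenization)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


def filter_pairs(triples, threshold=0.5):
    """Keep (question, answer) pairs whose answer scores at least `threshold`
    against a reference answer. Both the triple layout and the threshold
    are hypothetical choices for this sketch."""
    return [
        (question, answer)
        for question, answer, reference in triples
        if rouge_l_f1(answer, reference) >= threshold
    ]
```

In practice the threshold would need to be tuned against human judgments, and an established ROUGE implementation may tokenize and stem differently from this sketch.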
The second part is template-based generation. Using a pipeline approach, we extract the entities, form entity pairs, and feed each pair, together with contextual information from the surrounding text, to an ALBERT-based relation extraction model. The extracted relations then serve as the elements of template-based question generation.
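The template-based pipeline described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the thesis implementation: the relation labels, the templates, the use of ordered entity pairs, and the `predict_relation` callable (a stand-in for the ALBERT-based relation extraction model) are all hypothetical.

```python
from itertools import permutations

# Hypothetical relation labels mapped to (question template, answer template).
TEMPLATES = {
    "lives_in": ("Where does {head} live?", "{tail}"),
    "friend_of": ("Who is a friend of {head}?", "{tail}"),
}


def generate_qa(entities, context, predict_relation):
    """Form ordered entity pairs, predict a relation for each pair,
    and fill the matching template when one exists.

    `predict_relation(head, tail, context)` is a placeholder for a trained
    relation extraction model; it returns a relation label or None.
    """
    qa_pairs = []
    for head, tail in permutations(entities, 2):
        relation = predict_relation(head, tail, context)
        template = TEMPLATES.get(relation)
        if template is None:
            continue  # no relation predicted, or no template for it
        question_tmpl, answer_tmpl = template
        qa_pairs.append(
            (
                question_tmpl.format(head=head, tail=tail),
                answer_tmpl.format(head=head, tail=tail),
            )
        )
    return qa_pairs


# Toy stand-in for the model: a fixed lookup table, for illustration only.
TOY_RELATIONS = {("the fox", "the forest"): "lives_in"}


def toy_predict(head, tail, context):
    return TOY_RELATIONS.get((head, tail))
```

Calling `generate_qa(["the fox", "the forest", "the crow"], "toy context", toy_predict)` yields a single pair, the question "Where does the fox live?" with the answer "the forest". Ordered pairs are used here on the assumption that relations are directional.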
Appears in Collections:	[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation
