Name 施庫瑪 (Sipun Kumar Pradhan)   Department Computer Science and Information Engineering
(A Rapid Deep Learning Model for Goal-Oriented Dialog)
Abstract (Chinese) Abstract

Abstract (English) Open-domain Question Answering (QA) systems aim to provide the exact answer(s) to questions formulated in natural language, without restricting the domain. My research goal in this thesis is to develop learning models that can automatically induce new facts, in particular their structure and meaning, without having to be re-trained, in order to solve multiple Open-domain QA tasks. The main advantage of this framework is that it requires little feature engineering and domain specificity whilst matching or surpassing state-of-the-art results. Furthermore, it can easily be trained for use with any kind of Open-domain QA task.

I investigate a new class of learning models called memory neural networks. Memory neural networks combine inference components with a long-term memory component and learn how to use the two jointly. The long-term memory can be read from and written to, with the goal of using it for prediction. I investigate these models in the context of question answering (QA), where the long-term memory effectively acts as a (dynamic) knowledge base and the output is a textual response. Finally, I show that an end-to-end dialog system based on memory neural networks can reach promising performance and learn to perform non-trivial operations. I confirm these results by comparing my system against various well-crafted baselines, and I discuss future work.
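The read/write behaviour of the long-term memory described above can be illustrated with a minimal sketch. This is not the thesis implementation: the class `TinyMemory`, its methods, and the toy vocabulary are hypothetical names chosen for illustration, and the sketch assumes fixed bag-of-words vectors where a real memory neural network would learn its embeddings and scoring function during training.

```python
import math


class TinyMemory:
    """Toy long-term memory (hypothetical): bag-of-words slots read by soft attention."""

    def __init__(self, vocab):
        self.index = {word: i for i, word in enumerate(vocab)}
        self.slots = []  # encoded memories, one vector per written sentence
        self.texts = []  # raw sentences, kept so results can be inspected

    def encode(self, sentence):
        # Assumption: fixed word counts stand in for learned embeddings.
        vec = [0.0] * len(self.index)
        for word in sentence.lower().split():
            if word in self.index:
                vec[self.index[word]] += 1.0
        return vec

    def write(self, sentence):
        # Writing appends a new slot; nothing is re-trained.
        self.slots.append(self.encode(sentence))
        self.texts.append(sentence)

    def read(self, question):
        # Score each slot by its dot product with the question,
        # then normalise the scores with a softmax.
        if not self.slots:
            return []
        q = self.encode(question)
        scores = [sum(a * b for a, b in zip(q, m)) for m in self.slots]
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]


vocab = ["john", "mary", "went", "to", "the", "kitchen", "garden", "where", "is"]
memory = TinyMemory(vocab)
memory.write("john went to the kitchen")
memory.write("mary went to the garden")

weights = memory.read("where is mary")
best = memory.texts[weights.index(max(weights))]
print(best)
```

In this sketch the memory acts as a small dynamic knowledge base: new facts are added with `write`, and `read` returns attention weights over the stored facts, with the highest weight falling on the sentence that shares the most words with the question.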
Keywords (Chinese) ★ 問題問答
★ 記憶神經網路
★ 長期記憶元件
Keywords (English) ★ Question Answering
★ Memory neural networks
★ Long-term memory component
Table of Contents
Chapter 1 Introduction
1.1 Overview
Out-of-Order Access
Long-Term Dependency
Unordered Set
1.2 Motivation
1.3 Brief Literature
1.4 Contributions
1.5 Thesis Organization
Chapter 2 Deep Learning Background
2.1 Deep Learning and Artificial Intelligence
2.2 Why Deep Learning?
2.2.1 Learning Representations
2.2.2 Distributed Representations
2.2.3 Learning Multiple Levels of Inference
2.3 Neural Networks: Definitions and Basics
2.3.1 Word Vector Representations
2.4 Recurrent Neural Network
2.4.1 Adaptive Context Features
2.4.2 Forward Pass
2.4.3 Backward Pass
2.5 Memory Networks
2.5.1 Long Short-Term Memory
2.5.2 The LSTM Architecture
2.5.3 Influence of Preprocessing
2.5.4 Gradient Calculation
2.5.5 Architectural Enhancements
2.5.6 LSTM Equations
2.6 Hashing Function
Chapter 3 Memory Neural Network
3.1 Memory Network Implementation
3.1.1 Memory Neural Network Model
3.1.2 Training a Memory Neural Network
3.1.3 Word Sequences as Input
3.1.4 Efficient Memory Via Hashing
3.1.5 Modelling Write Time
3.1.6 Modelling Previously Unseen Words
3.1.7 Exact Matches and Unseen Words
Chapter 4 Implementation
4.1 Single Layer
4.1.1 Memory Representation
4.1.2 Generating the Final Prediction
4.2 Multiple Layers
4.3 Synthetic Question and Answering Experiments
4.3.1 Sentence Representation
4.3.2 Temporal Encoding
4.3.3 Learning Time Invariance by Injecting Random Noise
Chapter 5 Experimental Results
5.1 Dataset
5.2 Preprocessing
5.3 Baselines
5.4 Results
5.5 QA With Previously Unseen Words
5.6 Combining Simulated Data and Large-Scale QA
5.7 Language Modeling Experiments
Chapter 6 Conclusion
6.1 Main Contributions
6.2 Future Work
Advisor 陳慶瀚 (Ching-Han Chen)   Approval date 2016-8-18
