Conversational agents are a common application of natural language generation and are regarded as multi-turn dialogue systems, that is, systems that sustain a conversation for more than one turn. Users expect such an agent to still remember information from earlier in the dialogue after several turns, and to generate meaningful, relevant, and consistent responses in the current turn. We therefore propose a multi-turn dialogue generation framework based on dialogue history and context information, which aims to learn whether history dialogue or context dialogue information has more impact on multi-turn dialogue. We adopt a hierarchical recurrent framework, which we regard as reflecting the real-world situation in multi-turn dialogue, and compare our models with previous studies on multi-turn dialogue generation tasks. We investigate the effectiveness of different models: the baselines Hierarchical Recurrent Encoder-Decoder (HRED) and Generative Pre-trained Transformer 2 (GPT-2), and our proposed models, the Hierarchal Recurrent framework with History Dialogue (HRHD) model and the Hierarchal Recurrent framework with Context Dialogue (HRCD) model. All models are evaluated with automatic evaluation metrics and with human evaluation. The HRHD model is conceptually simple and shows promising performance overall.
It obtains strong results on both the open-domain dataset and the task-oriented dataset. Furthermore, the HRCD model outperforms the baseline HRED model and comes close to the baseline GPT-2 model. The experiments and analysis show that information from both the history dialogue and the context dialogue improves the performance of multi-turn dialogue generation.