    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/86634


    Title: 改進自注意力機制於神經機器翻譯之研究 (A Study on Improving the Self-Attention Mechanism for Neural Machine Translation)
    Author: Chen, Ming-Hsuan (陳明萱)
    Contributor: Department of Information Management
    Keywords: Neural Machine Translation; Transformer; Self-Attention Mechanism; Gate Mechanism; Clustering Algorithms
    Date: 2021-08-02
    Upload time: 2021-12-07 13:02:39 (UTC+8)
    Publisher: National Central University
    Abstract: The goal of Neural Machine Translation (NMT) is to translate a source sentence into a target sentence with a deep learning model while preserving the semantics of the source sentence and producing correct syntax. In recent years, the Transformer has been one of the most widely used models: its Self-Attention Mechanism captures global information about a sentence, and it performs well on many Natural Language Processing (NLP) tasks. However, several studies have pointed out that the Self-Attention Mechanism tends to learn redundant information and cannot effectively learn local information in a text. We therefore modify the Self-Attention Mechanism in the Transformer by adding a Gate mechanism and the K-means clustering algorithm, yielding Gated Attention and Clustered Attention respectively; Gated Attention further comes in a Top-k% variant and a Threshold variant. These approaches centralize the Attention Map, strengthening the model's ability to capture local information and to learn more diverse relationships within a sentence, which in turn improves translation quality.
    We apply the Top-k% and Threshold variants of Gated Attention, as well as Clustered Attention, to a Chinese-to-English translation task, obtaining 25.30, 24.69, and 24.69 BLEU respectively. A hybrid model that combines both attention mechanisms reaches at best 24.88 BLEU, which does not surpass using a single mechanism alone. Our experiments confirm that the proposed models outperform the vanilla Transformer, and that using only one of the attention mechanisms helps the Transformer learn textual information better while still achieving the goal of centralizing the Attention Map.
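    The abstract does not give implementation details, but the Top-k% gating idea it describes can be illustrated with a minimal PyTorch sketch. All names and the k_percent parameter below are illustrative assumptions, not the author's actual code: for each query position, only the largest k% of the raw attention scores are kept before the softmax, which sparsifies ("centralizes") the Attention Map.

        import torch

        def topk_percent_gate(scores, k_percent=0.5):
            # scores: raw pre-softmax attention scores of shape (..., queries, keys)
            seq_len = scores.size(-1)
            k = max(1, int(seq_len * k_percent))          # number of keys kept per query
            topk_vals, _ = scores.topk(k, dim=-1)         # largest k scores per query
            cutoff = topk_vals[..., -1:]                  # smallest kept score, shape (..., 1)
            # mask everything below the cutoff so it gets zero weight after softmax
            gated = scores.masked_fill(scores < cutoff, float("-inf"))
            return torch.softmax(gated, dim=-1)

        # toy usage: one head, 4 queries x 4 keys
        scores = torch.randn(1, 4, 4)
        attn = topk_percent_gate(scores, k_percent=0.5)   # each query keeps its 2 largest scores

    The Threshold variant would analogously mask scores below a fixed cutoff, and Clustered Attention would restrict attention according to K-means clusters of the representations; the sketch above covers only the Top-k% case.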
    Appears in Collections: [Graduate Institute of Information Management] Theses and Dissertations

    Files in This Item:

    File          Description    Size    Format    Views
    index.html                   0Kb     HTML      132


    All items in NCUIR are protected by copyright, with all rights reserved.
