

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/88330


    Title: Learning Representations for Inter- and Intra-modality Data
    Authors: Hung, Chen-Hsuan (洪晨瑄)
    Contributors: Department of Information Management
    Keywords: cross-modal learning
    Date: 2022-06-10
    Date Uploaded: 2022-07-13 22:46:56 (UTC+8)
    Publisher: National Central University
    Abstract: Many studies have investigated representation learning within a single domain, such as Natural Language Processing or Computer Vision. Text can be viewed as a representation that stands for a certain object; in other words, natural language may share the same meaning as an image. Plenty of prior work learns from both text and images for tasks such as image captioning, visual question answering, and image-to-text retrieval. However, shared representations across multiple languages and images are seldom discussed. Hence, in this study we propose an encoder-decoder architecture that learns shared representations for inter- and intra-modality data. We regard the latent vector produced by the encoder as the shared representation, since it is learned from both modalities in a supervised way to capture their shared semantics.
    We further analyze the shared representations learned by our architecture. Through visualizations compared against single-modality representations, we demonstrate that the shared representations do learn from both image-modality and text-modality data. We also examine other factors that affect shared representation learning: adding synonyms to the training text yields a more distinct and condensed per-class distribution of shared representations while preserving the model's ability to reconstruct images and to generate text vectors. When an additional language is included in training, the shared representations can still be correctly decoded into the original images and the corresponding text vectors, and their distributions show the same distinct, condensed characteristic observed with synonyms. Lastly, we investigate the scalability of the shared representation learning process and discuss the limits of this approach.
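    The abstract does not include the implementation, so the following is a minimal sketch, assuming a PyTorch implementation, of the kind of encoder-decoder setup it describes: each modality is encoded into one shared latent space, each latent is decoded back into both modalities, and a supervised pairing loss pulls the latents of matching image/text pairs together. All module names, layer sizes, the 28x28 image shape, and the 300-dimensional text vectors are illustrative assumptions, not details from the thesis.

    ```python
    # Hypothetical sketch of a shared-latent encoder-decoder; not the
    # thesis's actual code. Dimensions and losses are assumptions.
    import torch
    import torch.nn as nn

    LATENT_DIM = 128   # assumed size of the shared representation
    TEXT_DIM = 300     # assumed word-vector size (e.g., pretrained embeddings)

    class SharedAutoencoder(nn.Module):
        def __init__(self):
            super().__init__()
            # Image encoder: 28x28 grayscale image -> shared latent
            self.image_encoder = nn.Sequential(
                nn.Flatten(), nn.Linear(28 * 28, 512), nn.ReLU(),
                nn.Linear(512, LATENT_DIM),
            )
            # Text encoder: word vector -> the same shared latent space
            self.text_encoder = nn.Sequential(
                nn.Linear(TEXT_DIM, 256), nn.ReLU(),
                nn.Linear(256, LATENT_DIM),
            )
            # Decoders map the shared latent back to each modality
            self.image_decoder = nn.Sequential(
                nn.Linear(LATENT_DIM, 512), nn.ReLU(),
                nn.Linear(512, 28 * 28), nn.Sigmoid(),
            )
            self.text_decoder = nn.Sequential(
                nn.Linear(LATENT_DIM, 256), nn.ReLU(),
                nn.Linear(256, TEXT_DIM),
            )

        def forward(self, image, text_vec):
            z_img = self.image_encoder(image)
            z_txt = self.text_encoder(text_vec)
            # Decode each latent into *both* modalities, so the latent
            # must carry semantics shared across modalities.
            return z_img, z_txt, {
                "img_from_img": self.image_decoder(z_img),
                "txt_from_img": self.text_decoder(z_img),
                "img_from_txt": self.image_decoder(z_txt),
                "txt_from_txt": self.text_decoder(z_txt),
            }

    def loss_fn(image, text_vec, z_img, z_txt, out):
        mse = nn.functional.mse_loss
        # Four reconstruction terms: each latent rebuilds both modalities
        recon = (mse(out["img_from_img"], image.flatten(1)) +
                 mse(out["img_from_txt"], image.flatten(1)) +
                 mse(out["txt_from_img"], text_vec) +
                 mse(out["txt_from_txt"], text_vec))
        # Supervised pairing: pull the two latents of a matching
        # image/text pair together in the shared space
        align = mse(z_img, z_txt)
        return recon + align
    ```

    At training time each matching (image, text vector) pair would pass through forward and loss_fn; at test time either encoder alone yields the shared representation, which is the vector the abstract's visualizations and scalability analysis would operate on.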
    Appears in Collections: [Graduate Institute of Information Management] Theses and Dissertations

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     37


    All items in NCUIR are protected by original copyright.

