

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89805


    Title: Two-Stage Self-Attention on High-Level Features for Action Recognition; Lightweight Informative Feature with Residual Decoupling Self-Attention for Action Recognition
    Authors: Cheng, Yuan (鄭媛)
    Contributors: Department of Computer Science and Information Engineering
    Keywords: action recognition; self-attention; lightweight
    Date: 2022-07-26
    Uploaded: 2022-10-04 12:00:32 (UTC+8)
    Publisher: National Central University
    Abstract: Action recognition is a fundamental area of computer vision research. Because it supports a wide range of downstream applications, it is a technology that still demands continual improvement. As deep learning has developed, methods for image recognition have kept advancing, and these techniques can be applied to action recognition to improve its accuracy and robustness. This thesis therefore applies several recently proposed methods to an existing base model and modifies its architecture in order to optimize that model.
    This thesis uses the SlowFast network proposed by Facebook AI Research as the base model to be modified. Drawing on the high-level-feature processing concept of the Actor-Context-Actor Relation Network (ACAR-Net), it proposes the Informative Feature with Residual Self-Attention module (IFRSA). The separable convolution introduced in MobileNet then replaces some of the module's convolution layers, yielding a lightweight version, Lightweight IFRSA (LIFRSA), and the self-attention in IFRSA is in turn replaced with a two-stage (decoupling) self-attention, giving the Lightweight Informative Feature with Residual Decoupling Self-Attention (LIFRDeSA). A sketch of the separable-convolution replacement follows below.
    According to the experimental results, the proposed method not only improves the accuracy of the base model but also takes the model's computational requirements into account, yielding an architecture that is both lightweight and more accurate.
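    For orientation, the MobileNet-style lightweighting step mentioned above factors a standard convolution into a per-channel (depthwise) pass followed by a 1x1 (pointwise) pass. The following is a minimal PyTorch sketch of such a separable convolution; the module and dimension names are illustrative assumptions, not the thesis code:

        import torch
        import torch.nn as nn

        class SeparableConv2d(nn.Module):
            """Depthwise separable convolution in the MobileNet style:
            a per-channel (depthwise) convolution followed by a 1x1
            (pointwise) convolution that mixes channels."""
            def __init__(self, in_ch, out_ch, kernel_size=3, padding=1):
                super().__init__()
                # groups=in_ch applies one filter per input channel
                self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size,
                                           padding=padding, groups=in_ch, bias=False)
                self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)

            def forward(self, x):
                return self.pointwise(self.depthwise(x))

        # A k x k standard convolution costs k*k*C_in*C_out multiplications per
        # output position; the separable form costs k*k*C_in + C_in*C_out,
        # roughly a (1/C_out + 1/k^2) fraction of the original.
        x = torch.randn(1, 64, 56, 56)
        print(SeparableConv2d(64, 128)(x).shape)  # torch.Size([1, 128, 56, 56])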
    Action recognition aims to detect and classify the actions of one or more people in a video. Because it connects to many different fields and supports numerous applications, the accuracy of this basic task is important to the research that builds on it. In this thesis we therefore focus on improving the accuracy of previous work while also reducing its computational cost.
    Our base model is the SlowFast network, formerly the state of the art. We draw on the high-level feature extraction concept of the Actor-Context-Actor Relation Network (ACAR-Net) and propose the Informative Feature with Residual Self-Attention module (IFRSA). Because its computational cost is high, we first use the separable convolution introduced in MobileNet to replace some of the convolutions in this module. Second, we substitute decoupling self-attention for the standard self-attention layer, yielding the Lightweight Informative Feature with Residual Decoupling Self-Attention (LIFRDeSA).
    Experiments on the AVA dataset show that the LIFRDeSA module improves the accuracy of the baseline while keeping the computational cost in check: the proposed model is more accurate than the baseline, and the added components are very lightweight.
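    The precise two-stage (decoupling) formulation is defined in the thesis itself; as a reference point, the sketch below shows only the standard residual self-attention block that such a module starts from. All names and dimensions are illustrative assumptions, not the thesis code:

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class ResidualSelfAttention(nn.Module):
            """Single-head scaled dot-product self-attention over a set of
            feature vectors (e.g., flattened actor/context features), with a
            residual connection. The thesis replaces the single softmax(QK^T)V
            pass shown here with its two-stage decoupled variant."""
            def __init__(self, dim):
                super().__init__()
                self.q = nn.Linear(dim, dim)
                self.k = nn.Linear(dim, dim)
                self.v = nn.Linear(dim, dim)
                self.scale = dim ** -0.5

            def forward(self, x):  # x: (batch, tokens, dim)
                q, k, v = self.q(x), self.k(x), self.v(x)
                attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
                return x + attn @ v  # residual connection

        x = torch.randn(2, 49, 256)  # e.g., a 7x7 feature map flattened to 49 tokens
        print(ResidualSelfAttention(256)(x).shape)  # torch.Size([2, 49, 256])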
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Master's and Doctoral Theses

    Files in This Item:

    File          Description    Size    Format    Views
    index.html                   0Kb     HTML      44


    All items in NCUIR are protected by original copyright.

