

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/86297


    Title: 以少量視訊建構台灣手語詞分類模型; Using a Small Video Dataset to Construct a Taiwanese-Sign-Language Word Classification Model
    Author: 翁浚銘; Wong, Jun-Ming
    Contributor: Graduate Institute of Software Engineering
    Keywords: Taiwanese sign language; sign language recognition; deep learning
    Date: 2021-08-04
    Upload time: 2021-12-07 12:29:01 (UTC+8)
    Publisher: National Central University
    Abstract: Sign languages (SL) are visual languages that use hand shapes,
    movements, and even facial expressions to convey information, and they
    serve as the primary means of communication for hearing-impaired people.
    Sign language recognition (SLR) based on deep learning has attracted
    much attention in recent years. However, training neural networks
    requires a large number of SL videos, which are time-consuming and
    cumbersome to produce. This research proposes a method for building
    deep learning training data from a small set of SL videos in order to
    recognize Taiwanese Sign Language (TSL) vocabulary in video frames.
    First, we collect a series of TSL teaching videos from a video-sharing
    platform. Then, Mask R-CNN [1] is used to extract segmentation masks of
    the hands and face in every frame. Next, spatial-domain data
    augmentation is applied to create training sets with varied content,
    and different temporal-domain sampling strategies are used to simulate
    the signing speeds of different signers. Finally, an attention-based
    3D-ResNet trained on the synthetic dataset classifies a variety of TSL
    words. The experimental results show that the generated synthetic
    dataset helps TSL word recognition.
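
The temporal-domain sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the thesis's actual sampling strategies, so everything here is an assumption made for illustration: the function names `resample_indices` and `speed_variants`, the speed factors, and the choice of evenly spaced resampling are all hypothetical.

```python
import numpy as np


def resample_indices(num_frames: int, speed: float) -> np.ndarray:
    """Return source-frame indices that replay a clip at `speed` times its rate.

    Hypothetical helper: evenly spaced resampling is an assumed strategy,
    not necessarily the one used in the thesis.
    """
    if num_frames < 1 or speed <= 0:
        raise ValueError("num_frames must be >= 1 and speed must be > 0")
    # A signer who is `speed` times faster finishes in 1/speed as many frames.
    target_len = max(1, int(round(num_frames / speed)))
    # Evenly spaced positions over the source clip, snapped to whole frames.
    positions = np.linspace(0, num_frames - 1, target_len)
    return np.round(positions).astype(int)


def speed_variants(frames: np.ndarray, speeds=(0.75, 1.0, 1.5)) -> dict:
    """Build one resampled copy of `frames` (shape T x H x W x C) per speed."""
    return {s: frames[resample_indices(len(frames), s)] for s in speeds}
```

Speeds below 1.0 repeat some frames to imitate a slower signer, while speeds above 1.0 skip frames to imitate a faster one. A fixed-length clip for a 3D network would still need to be cut or padded from each variant.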
    Appears in collection: [Graduate Institute of Software Engineering] Master's and Doctoral Theses


