NCU Institutional Repository (中大機構典藏) - theses and dissertations, past exam papers, journal articles, and research projects: Item 987654321/84690


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/84690


    Title: 以深度學習視覺辨識為基礎之台灣手語訓練系統;A Deep-Learning-Based Visual Recognition Scheme for Taiwan Sign Language Training
    Authors: 蘇柏齊;郭天穎
    Contributors: Department of Computer Science and Information Engineering (資訊工程學系)
    Keywords: Taiwan sign language;visual recognition;deep learning;attention model
    Date: 2020-12-08
    Issue Date: 2020-12-09 10:44:16 (UTC+8)
    Publisher: Ministry of Science and Technology (科技部)
    Abstract: A sign language is a visual language that combines gestures, body motions, and facial expressions, and it serves as the major communication tool for hearing-impaired people. For those who can hear but cannot speak normally, sign language also provides an effective way to express themselves. Many people are interested in learning a sign language, for reasons such as helping hearing-impaired people, working as a sign language interpreter, acquiring a second language or skill, or using signing to boost creativity and fine motor development in infants and children. Just like spoken languages, sign languages differ across regions and countries, so bringing hearing-impaired people and society closer together requires popularizing sign language learning with approaches suited to local circumstances.
    This research aims to develop visual recognition techniques for learning and training Taiwan Sign Language (TSL), giving interested learners a mechanism for self-study and practice. Basic TSL terms/sentences selected by experts are displayed on a monitor; the learner watches these examples and practices the corresponding expressions in front of a camera, and the captured video is analyzed to evaluate whether the learner signs correctly. The proposed visual analysis scheme is based on state-of-the-art deep learning techniques and consists of three main parts: (1) extracting human keypoints specifically for TSL visual recognition, (2) recognizing TSL terms/sentences with attention-based models, and (3) designing a lightweight deep learning architecture for the TSL learning/training system. First, we will employ Unity3D to construct a labeled human-keypoint dataset suited to TSL recognition and, together with existing related datasets, train a reliable keypoint extractor; working from keypoints frees learners from wearing sensors or gloves, and using an RGB camera instead of a depth camera broadens the scope of future applications. Next, we will implement and test attention-based models from the field of natural language processing to recognize the expert-selected TSL terms/sentences, and design a reasonable scoring mechanism that gives learners appropriate feedback. Finally, once the required accuracy is reached, we will pursue lightweight deep learning designs to further facilitate system implementation and integration. We hope the proposed visual processing system can correctly recognize and evaluate the expressions of TSL learners, contribute to the popularization of TSL, and help build a friendlier environment for hearing-impaired people in Taiwan. (A minimal illustrative sketch of the attention-based recognition step follows this record.)
    Relation: Science and Technology Policy Research and Information Center, National Applied Research Laboratories (財團法人國家實驗研究院科技政策研究與資訊中心)
    Appears in Collections:[Department of Computer Science and Information Engineering] Research Project
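
    The abstract's part (2) describes recognizing expert-selected TSL terms/sentences from camera footage with an attention-based model operating on extracted human keypoints. The sketch below is not the project's code; it only illustrates, in PyTorch, what such a recognizer could look like. The class name TSLSentenceClassifier, the keypoint count (67), the number of sentence classes (50), the clip length, and all hyperparameters are hypothetical assumptions.

    # A minimal sketch of attention-based TSL sentence recognition over
    # per-frame human keypoints. All names, shapes, and hyperparameters are
    # illustrative assumptions, not values from the project.
    import torch
    import torch.nn as nn

    class TSLSentenceClassifier(nn.Module):
        def __init__(self, num_keypoints=67, num_classes=50,
                     d_model=128, nhead=4, num_layers=2, max_frames=300):
            super().__init__()
            # Per frame, flatten the (x, y) coordinates of every keypoint.
            self.embed = nn.Linear(num_keypoints * 2, d_model)
            # Learned positional embedding so attention sees temporal order.
            self.pos = nn.Parameter(torch.zeros(1, max_frames, d_model))
            layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
            self.head = nn.Linear(d_model, num_classes)

        def forward(self, keypoints):
            # keypoints: (batch, frames, num_keypoints * 2), as would be
            # produced by the keypoint extractor of part (1).
            x = self.embed(keypoints) + self.pos[:, :keypoints.size(1)]
            x = self.encoder(x)        # self-attention over the frame axis
            x = x.mean(dim=1)          # temporal average pooling
            return self.head(x)        # logits over the candidate sentences

    model = TSLSentenceClassifier()
    clip = torch.randn(1, 90, 67 * 2)  # stub for a 3-second clip at 30 fps
    probs = torch.softmax(model(clip), dim=-1)
    print(probs.argmax(dim=-1), probs.max().item())

    Part (3), the lightweight design, could then shrink d_model and num_layers or apply standard compression such as distillation or quantization; the scoring mechanism could, for instance, use the softmax confidence for the expected sentence as one crude feedback signal. These are assumptions about how the pieces might fit together, not details stated in the record.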

    Files in This Item:

    File: index.html | Size: 0 KB | Format: HTML | Views: 227


    All items in NCUIR are protected by copyright, with all rights reserved.
