NCU Institutional Repository (中大機構典藏) – theses, past exam questions, journal articles, and research projects: Item 987654321/89935


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/89935


    Title: 基於Transformer及姿態辨識之即時手語翻譯系統;The Real-Time Sign Language Translation System Based on Transformer and Pose Estimation
    Authors: 余昌翰;Yu, Chang-Han
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Deep Learning; Natural Language Processing; Image Processing; Computer Vision
    Date: 2022-08-12
    Issue Date: 2022-10-04 12:05:12 (UTC+8)
    Publisher: National Central University
    Abstract: According to 2021 statistics from Taiwan's Ministry of Health and Welfare, about 1,198,000 people in Taiwan hold a disability certificate, roughly 5% of the total population; 125,764 of them have a hearing impairment. Because hearing loss from early childhood often causes difficulties with spoken pronunciation and learning, many hearing-impaired people use sign language as their primary means of communication.

    Today, when sign language users follow media that rely heavily on hearing, such as TV news, election debates, and live press conferences, they often have only subtitles to read. Some government-organized public broadcasts, such as election debates and epidemic-prevention press conferences, do provide a sign language interpreter, who renders the speaker's spoken Chinese into sign language so that sign language users can follow the content more easily. However, because sign language interpreters remain scarce, only a few occasions can be staffed. How to give hearing-impaired viewers an experience equal to that of other audiences is therefore a major challenge for modern media.

    This research combines techniques from two major areas of deep learning, natural language processing and pose estimation, to build a system that translates speech into sign language in real time and renders the signs with a virtual character. A 3D pose estimation model converts per-word sign-language videos into a gesture dataset; a third-party speech recognition service transcribes the user's speech into Chinese sentences; a natural language processing model converts each sentence into a sequence of sign-language words; the word sequence is matched against the gesture dataset; and the matched gestures are passed to the virtual character, which performs them. All stages are then chained into a complete user-facing system for real-time sign language translation.
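The staged pipeline described above can be sketched as follows. Every function name and the toy gesture dataset here are hypothetical stand-ins: the actual system uses a third-party speech recognition service, a Transformer-based language model, and a gesture dataset built from 3D pose estimation on sign-language videos, none of whose interfaces are given in the abstract.

```python
# Sketch of the speech -> sign-gesture pipeline (illustrative stubs only).
from typing import Dict, List, Tuple

# Toy gesture dataset: sign-language word (gloss) -> sequence of 3D pose frames.
# In the real system these come from running 3D pose estimation on
# per-word sign-language videos.
GESTURE_DATASET: Dict[str, List[Tuple[float, float, float]]] = {
    "我": [(0.1, 0.2, 0.3)],
    "愛": [(0.4, 0.5, 0.6)],
    "你": [(0.7, 0.8, 0.9)],
}

def recognize_speech(audio: bytes) -> str:
    """Stand-in for the third-party speech recognition service."""
    return "我愛你"  # pretend the ASR service returned this Chinese sentence

def sentence_to_glosses(sentence: str) -> List[str]:
    """Stand-in for the NLP model that maps a Chinese sentence to a
    sign-language word sequence; here, one character = one gloss."""
    return list(sentence)

def glosses_to_gestures(glosses: List[str]) -> List[List[Tuple[float, float, float]]]:
    """Match each gloss against the gesture dataset, skipping unknown words."""
    return [GESTURE_DATASET[g] for g in glosses if g in GESTURE_DATASET]

def translate(audio: bytes) -> List[List[Tuple[float, float, float]]]:
    """Chain all stages: the returned pose frames would drive the avatar."""
    sentence = recognize_speech(audio)
    glosses = sentence_to_glosses(sentence)
    return glosses_to_gestures(glosses)
```

Keeping the stages behind narrow function boundaries like this is what lets each one (ASR service, NLP model, gesture lookup, avatar renderer) be swapped or tested independently.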

    In addition, this study experiments with and applies several signal-smoothing techniques to mitigate the temporal jitter that is common in pose estimation, so that the virtual character's sign-language gestures appear closer to those of a real person.
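One common smoothing technique for jittery per-frame keypoints is an exponential moving average. The abstract does not specify which smoothing methods were tested, so this filter and its `alpha` value are illustrative only, a minimal sketch of the idea rather than the thesis's actual implementation.

```python
# Exponential moving average over pose frames to damp temporal jitter.
from typing import List, Sequence

def ema_smooth(frames: Sequence[Sequence[float]], alpha: float = 0.5) -> List[List[float]]:
    """Smooth a sequence of pose frames (each a flat list of coordinates).

    Smaller alpha = heavier smoothing (less jitter, but more lag), which is
    the usual trade-off when smoothing pose streams for an avatar.
    """
    smoothed: List[List[float]] = []
    prev: List[float] = []
    for frame in frames:
        if not prev:
            prev = list(frame)  # first frame passes through unchanged
        else:
            prev = [alpha * x + (1 - alpha) * p for x, p in zip(frame, prev)]
        smoothed.append(list(prev))
    return smoothed

# A jittery 1-D signal: the smoothed version oscillates far less.
noisy = [[0.0], [1.0], [0.0], [1.0]]
print(ema_smooth(noisy))  # [[0.0], [0.5], [0.25], [0.625]]
```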
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File        Description    Size    Format
    index.html                 0Kb     HTML


    All items in NCUIR are protected by copyright, with all rights reserved.
