NCU Institutional Repository (中大機構典藏): Item 987654321/86608


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/86608


    Title: Using Synthetic Data to Construct Deep Learning Datasets for Air-Writing Applications (利用虛擬資料建構深度學習訓練集以實現凌空書寫應用)
    Authors: 黃啟軒;Huang, Chi-Hsuan
    Contributors: Department of Computer Science and Information Engineering (資訊工程學系)
    Keywords: Fingertip detection; air-writing; synthetic datasets; character recognition
    Date: 2021-08-03
    Issue Date: 2021-12-07 13:01:17 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract: Air-writing is a novel human-computer interaction input method in which the user naturally writes characters in the air to be entered into a machine or device. Fingertip positions are detected in real time from camera frames, the detected coordinates are linked into a trajectory, and the character represented by that trajectory is then recognized. Air-writing can serve as a text-input method for devices such as smart glasses, and its touchless nature also suits hygiene-sensitive settings, for example reducing the risk that hospital users contract a virus by touching shared equipment. This research proposes deep-learning-based techniques for both first-person and third-person air-writing. Because deep learning relies on large amounts of labeled data, we use Unity3D to build the training datasets: a virtual hand model is composited onto randomly chosen images or single-color backgrounds, generating labeled synthetic data quickly and efficiently. We vary the hand model to simulate the rotation and movement that occur during writing, increasing data diversity. For the more complex third-person scenario, we further add randomly varied faces and torsos so that the synthetic data better approximate real conditions. An object detection model detects fingertip positions to form character trajectories, and the redundant strokes produced during writing are removed so that the processed trajectory more closely resembles the character itself. We combine handwritten and printed characters into a mixed dataset to train a character recognition model based on the ResNeSt architecture, covering nearly 5,000 Chinese characters. Experimental results show that the large volume of accurately labeled synthetic data can effectively train the models, enabling real-time first- and third-person air-writing.
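
    The abstract describes the compositing step only at a high level: a rendered virtual hand is pasted onto a random image or single-color background, and the known fingertip location becomes a detection label for free. The following Python sketch illustrates that idea with Pillow. The file names, canvas size, sampling ratio, and label format are assumptions made for illustration only, not the thesis's actual Unity3D implementation.

```python
import json
import random
from pathlib import Path
from PIL import Image

def composite_sample(hand_png, fingertip_xy, backgrounds, out_size=(640, 480)):
    """Paste a rendered hand cut-out onto a random or single-color background
    and return the composite image plus a fingertip bounding-box label.

    hand_png     : path to an RGBA render of the virtual hand (assumed input)
    fingertip_xy : (x, y) of the fingertip inside that render, assumed exported
                   alongside the render from the 3D model
    backgrounds  : list of background image paths; may be empty
    """
    hand = Image.open(hand_png).convert("RGBA")

    # Random background image, or a single-color canvas as a fallback.
    if backgrounds and random.random() < 0.8:
        bg = Image.open(random.choice(backgrounds)).convert("RGB").resize(out_size)
    else:
        bg = Image.new("RGB", out_size, tuple(random.randint(0, 255) for _ in range(3)))

    # Random placement loosely mimics hand movement across the frame.
    max_x = max(out_size[0] - hand.width, 0)
    max_y = max(out_size[1] - hand.height, 0)
    ox, oy = random.randint(0, max_x), random.randint(0, max_y)
    bg.paste(hand, (ox, oy), hand)  # the alpha channel acts as the paste mask

    # Fingertip label: a small box centered on the translated fingertip point.
    fx, fy = fingertip_xy[0] + ox, fingertip_xy[1] + oy
    r = 12  # half box size in pixels, an arbitrary choice
    label = {"bbox": [fx - r, fy - r, fx + r, fy + r], "class": "fingertip"}
    return bg, label

if __name__ == "__main__":
    bgs = [str(p) for p in Path("backgrounds").glob("*.jpg")]
    img, label = composite_sample("hand_render.png", (150, 40), bgs)
    img.save("sample_000.jpg")
    Path("sample_000.json").write_text(json.dumps(label))
```

    In the thesis itself, the hand renders, fingertip coordinates, rotations, and (for the third-person case) the random faces and torsos all come from Unity3D; this sketch only mirrors the compositing-and-labeling idea that lets the fingertip detector be trained on fully automatic annotations.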
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation

    Files in This Item:

    File: index.html | Size: 0 KB | Format: HTML


    All items in NCUIR are protected by copyright, with all rights reserved.

