Master's/Doctoral Thesis 109521054: Detailed Record
Author: Zhen-Yan Chen (陳蓁晏)    Department: Electrical Engineering
Title: Smartphone-Assisted Indoor Object Detection and Retrieval for the Visually Impaired
Related theses
★ Control of a hybrid power supply system based on direct methanol fuel cells
★ Water quality inspection for hydroponic plants using refractive-index measurement
★ A DSP-based automatic guidance and control system for a model car
★ Redesign of motion control for a rotary inverted pendulum
★ Fuzzy control decisions for freeway on-ramp and off-ramp signals
★ A study on the fuzziness of fuzzy sets
★ Further improvement of motion control performance for a dual-mass spring-coupled system
★ A machine vision system for air hockey
★ Robotic offense and defense control for air hockey
★ Attitude control of a model helicopter
★ Stability analysis and design of fuzzy control systems
★ A real-time recognition system for access-control monitoring
★ Air hockey: human versus robotic arm
★ A mahjong tile recognition system
★ Application of correlation-error neural networks to radiometric measurement of vegetation and soil moisture
★ Standing control of a three-link robot
  1. The full text of this electronic thesis is approved for immediate open access.
  2. The open-access electronic full text is licensed to users only for personal, non-profit retrieval, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Abstract (Chinese) At home or in the office we often need to fetch some object, but for a visually impaired person alone in an unfamiliar indoor environment, detecting and retrieving an object is very difficult. This thesis aims to help visually impaired people solve this problem. We develop a guidance mobile application (App) for smartphones: a visually impaired user operates the App to select the object to retrieve, and the App detects the object's position and distance, then uses audio and vibration feedback to guide the user to the object so that it can be picked up.
  This research consists of three parts. The first part is the training of the object detection model; the detected objects are items commonly found in indoor environments, drawn from several public datasets, and a system that automatically filters annotations is designed to avoid the heavy manual labeling effort. The second part modifies the object detection model architecture so that the model can run on a smartphone chip, and applies neural network quantization; after lightweight conversion across combinations of input image size, computation precision, and model architecture, the inference FPS (frames per second) of these model variants is on average three times faster. The third part is the design of the smartphone App, which uses the phone's built-in sensors to guide the direction and posture of the visually impaired user. It includes an automatic calibration algorithm so that it works on any phone without manual parameter calibration, and its interface is designed around the operating habits of visually impaired users to improve usability. The usage scenario is as follows: after the user selects the object to retrieve in the App and moves the phone until the camera sees that object, the App guides the user toward the object with Chinese speech and prompt tones until the user is close enough to pick it up. Unlike previous studies that required wearable aids, this system runs with nothing more than the smartphone App, eliminating the cumbersome donning process, so that visually impaired users can meet their daily object-retrieval needs in the most lightweight way.
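The automatic annotation filter is described here only at the level of its two rules (drop group-object annotations, drop undersized ones). As a reference point, the following is a minimal sketch of those rules over COCO-format label files; the file names and the area threshold are hypothetical, not values taken from the thesis:

```python
import json

# Hypothetical threshold: drop boxes smaller than 0.2% of the image area.
MIN_AREA_RATIO = 0.002

def filter_annotations(src_path, dst_path):
    """Apply the two filter rules to a COCO-format annotation file."""
    with open(src_path) as f:
        coco = json.load(f)

    images = {img["id"]: img for img in coco["images"]}
    kept = []
    for ann in coco["annotations"]:
        # Rule 1: drop group-object annotations (COCO marks them iscrowd=1).
        if ann.get("iscrowd", 0) == 1:
            continue
        # Rule 2: drop annotations whose box is too small relative to the image.
        img = images[ann["image_id"]]
        _, _, w, h = ann["bbox"]
        if w * h < MIN_AREA_RATIO * img["width"] * img["height"]:
            continue
        kept.append(ann)

    coco["annotations"] = kept
    with open(dst_path, "w") as f:
        json.dump(coco, f)

filter_annotations("instances_train.json", "instances_train_filtered.json")
```

COCO-format files mark group objects (e.g. a shelf of books annotated as one box) with iscrowd = 1, which is what the first rule keys on here.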
Abstract (English) We often need to reach for certain objects at home or in the office, but it is a difficult task for the visually impaired, especially in an unfamiliar indoor environment. This thesis focuses on assisting the visually impaired in reaching for objects in indoor environments. We develop a Mobile Application (App) on a cell phone that visually impaired people can operate to select the objects to be taken. The cell phone with the App can detect the location and distance of the objects, and guide the visually impaired person to take the objects by using sound and vibration feedback from the cell phone.
The first part of this study is the training of the object detection model. The detected objects are items for daily use in the indoor environment and are adopted from various open datasets. A system is designed to automatically filter the labels, saving a large amount of human labeling effort. The second part aims to modify the object detection model architecture so that the model can be computed by the cell phone chip, and uses neural network quantization to carry out lightweight conversion across a variety of input image sizes, computation precisions, and model architectures. As a result, the inference FPS of the various model architectures is on average three times faster. The third part is to design the App on cell phones to guide the visually impaired in the correct direction and posture by using the sensors inside the cell phone. We also design an automatic calibration algorithm that applies to any cell phone without manual parameter calibration, and an operation interface that fits the habits of the visually impaired to improve convenience of use. The usage scenario is as follows. The visually impaired person selects the specified object in the App and lets the cell phone search for the selected object. Then the App guides him or her in the right direction with Chinese voice and prompt tones to get close enough to the object and take it. Unlike previous studies that required wearing computing devices, this system only requires a cell phone with the App, so that the visually impaired can meet their daily needs in the most convenient way.
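The "lightweight conversion" is described only at this level of detail in the record. One common way to realize it on Android is TensorFlow Lite post-training quantization; treating that as the thesis's toolchain is an assumption here, and the model path is hypothetical:

```python
import tensorflow as tf

# Hypothetical path: assumes the detector was exported as a TF SavedModel.
converter = tf.lite.TFLiteConverter.from_saved_model("detector_saved_model")

# Post-training float16 quantization: weights are stored as fp16, roughly
# halving model size; mobile GPU delegates can then run fp16 kernels directly.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]

tflite_model = converter.convert()
with open("detector_fp16.tflite", "wb") as f:
    f.write(tflite_model)
```

Repeating such a conversion for each input image size and precision (fp32, fp16, int8) would yield the grid of model variants whose average threefold FPS gain the abstract reports.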
Keywords (Chinese) ★ Assistive devices for the visually impaired
★ Mobile application
★ Voice navigation
★ Real-time computation
★ Deep learning
★ Object detection
★ Semi-automatic annotation
★ Lightweight models
Keywords (English) ★ Visually impaired people
★ Mobile Application
★ Object detection
★ Cell phone
★ Neural networks
Table of Contents
Abstract (Chinese) i
Abstract (English) ii
Acknowledgments iii
Table of Contents iv
List of Figures vi
List of Tables viii
Chapter 1 Introduction 1
1.1 Research Motivation and Background 1
1.2 Literature Review 1
1.3 Thesis Objectives 3
1.4 Thesis Organization 4
Chapter 2 System Architecture and Hardware/Software Overview 5
2.1 System Architecture and Development Process 5
2.2 Hardware Overview 6
2.2.1 Smartphone Specifications 6
2.2.2 Smartphone System Architecture 8
2.3 Software Overview 9
Chapter 3 A Semi-Automatic Annotation System for Object Detection Data 11
3.1 Dataset Overview 11
3.2 Unifying Class Names across Multiple Datasets 12
3.3 Automatic Annotation Filtering 15
3.3.1 Annotation Format Conversion 15
3.3.2 Removing Group-Object Annotations 17
3.3.3 Filtering Out Undersized Annotations 18
3.3.4 The Automatic Filtering System 19
3.4 Automatic Annotation of Single Objects 20
3.5 Semi-Automatic Annotation Workflow 23
3.5.1 Reorganizing Annotation Class IDs 24
Chapter 4 Object Detection Network 26
4.1 Training Data Preprocessing 26
4.1.1 Fair Splitting of Training Data 26
4.1.2 Training Data Augmentation 27
4.2 Comparison and Selection of Object Detection Models 28
4.3 Improvements to the Model Architecture 29
4.4 Deploying the Object Detection Model to a Smartphone 30
4.4.1 Truncating the Model's Prediction Layers 30
4.5 Lightweight Models 31
Chapter 5 Development with Smartphone Components 34
5.1 On-Device Image Processing 34
5.2 Built-in Smartphone Sensors 36
5.2.1 Sensor Coordinate Systems 36
5.2.2 Derivation of the Rotation Matrix (see the sketch after this outline) 37
5.2.3 3-D Spatial Transformation of Sensor Readings 39
Chapter 6 Announcement and Decision Mechanisms for Guided Object Retrieval 41
6.1 App System Control 41
6.1.1 TalkBack Operation and the Start Interface 42
6.2 Audio Announcement Control 42
6.3 Direction Guidance Control 43
6.4 On-Screen Guidance Mechanism 47
6.4.1 Guiding the Object to the Central Region of the Frame 49
6.4.2 Deciding When the User Can Grasp the Object 49
Chapter 7 Experimental Results 51
7.1 Effectiveness of Automatic Annotation Filtering on Public Datasets 51
7.1.1 Effect of Annotation Filtering and Fair Splitting with the Same Classes and Architecture 51
7.1.2 Tests in Real Environments 54
7.1.3 Training Accuracy at Different Input Image Sizes 55
7.2 Model Results Combined with Automatic Single-Object Annotation 55
7.3 Performance of Different Object Detection Architectures 58
7.4 Results of Lightweight Model Conversion 61
7.4.1 Model Results Before and After Lightweight Conversion 61
7.4.2 Speedup of Lightweight Models on Various Phones 63
7.5 The Guided Object-Retrieval App 64
Chapter 8 Conclusions and Future Work 67
8.1 Conclusions 67
8.2 Future Work 67
References 69
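Sections 5.2.2 and 5.2.3 of the outline derive the rotation matrix that turns raw sensor readings into the phone's orientation for direction guidance; the derivation itself is not part of this record. As a reference point, here is a minimal NumPy sketch of the standard accelerometer-plus-magnetometer construction (the same one behind Android's SensorManager.getRotationMatrix(); the example readings are made up):

```python
import numpy as np

def rotation_matrix(accel, mag):
    """Device-to-world rotation matrix from accelerometer and magnetometer
    readings, following the construction used by Android's
    SensorManager.getRotationMatrix()."""
    a = np.asarray(accel, dtype=float)   # at rest, points "up" in device axes
    e = np.asarray(mag, dtype=float)     # Earth's magnetic field, device axes
    h = np.cross(e, a)                   # points roughly East
    h /= np.linalg.norm(h)
    a = a / np.linalg.norm(a)
    m = np.cross(a, h)                   # completes the basis, points North
    return np.vstack([h, m, a])          # rows: East, North, Up

# Azimuth (compass heading), as in Android's getOrientation():
R = rotation_matrix([0.0, 0.0, 9.81], [0.0, 22.0, -40.0])
azimuth = np.degrees(np.arctan2(R[0, 1], R[1, 1]))  # 0 deg: y-axis faces North
```

A guidance loop can compare an azimuth computed this way against the detected object's bearing to decide which turn instruction to announce.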
Advisor: Wen-June Wang (王文俊)    Approval date: 2022-09-15