Abstract (English)
In this thesis, deep learning techniques are applied to outdoor obstacle identification and obstacle distance detection. In addition, an outdoor wearable guide device is designed to provide a safer and more portable guidance system for visually impaired people walking outdoors.
Obstacle distance detection proceeds in two ways. The first uses a monocular camera: a monocular depth-estimation neural network produces a disparity image from the input image, and regression analysis converts the disparity image into a depth image. Within each detected obstacle region, a histogram statistic over the depth values yields the output obstacle distance, from which the obstacle's height and position are then computed. Obstacle detection itself uses either a semantic segmentation network or an object detection network, and the obstacle distance is calculated from the recognition results. The object detection network is modified to additionally predict the rotation angle of the bounding box, so that the box fits the obstacle more tightly. Based on these distance-detection results, semantic segmentation paired with monocular depth is applied to the guide robot, while object detection paired with monocular depth is applied to the wearable guide device for subsequent obstacle-avoidance control. The second way uses a stereo camera to compute the depth image of the input image, with object detection identifying the obstacles; the depth values of the pixels inside each detected obstacle region are sorted in ascending order, and the first quartile is taken as the obstacle distance. Stereo depth paired with object detection is likewise used in the wearable guide device for obstacle-avoidance control; it runs faster and can additionally be integrated with the signboard tracking system.
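The first-quartile rule for the stereo branch can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name, the box format (x1, y1, x2, y2), and the handling of invalid depth pixels are assumptions.

```python
import numpy as np

def obstacle_distance(depth_map, box):
    """Estimate obstacle distance as the first quartile (25th
    percentile) of the valid depth values inside a detection box.

    Hypothetical helper illustrating the quartile rule described
    in the abstract; names and box format are assumptions.
    """
    x1, y1, x2, y2 = box
    roi = depth_map[y1:y2, x1:x2]
    # Stereo depth maps contain holes; keep only finite, positive values.
    valid = roi[np.isfinite(roi) & (roi > 0)]
    if valid.size == 0:
        return None
    return float(np.percentile(valid, 25))

# Example: a 4x4 depth patch in meters, with one invalid (zero) pixel
# and a far background column at 9.0 m that the quartile ignores.
depth = np.array([[2.0, 2.1, 2.2, 9.0],
                  [2.0, 2.1, 2.3, 9.0],
                  [0.0, 2.2, 2.4, 9.0],
                  [2.1, 2.2, 2.5, 9.0]])
d = obstacle_distance(depth, (0, 0, 4, 4))
```

Taking the first quartile rather than the minimum or mean makes the estimate robust to both stereo-matching outliers and background pixels caught inside the box.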
For safety, visually impaired people need to walk on the right side of the road. The thesis therefore designs a keep-to-the-right algorithm: given the camera mounting height and the camera intrinsics, perspective projection determines where a point at a given real-world depth and lateral width projects onto the image plane. With this method, a reference line is drawn at the road-to-user width, together with a reference line at the width of the user's left half body, forming reference lines on both sides. Combined with semantic segmentation to locate the road area, the relative relation between the two reference lines and the road edge is used to remind the visually impaired user to correct course by going straight, moving left or right, or rotating. At the same time, based on the obstacle information, an obstacle-avoidance control method is designed to perform actions such as avoiding an obstacle, stepping over it, or stopping, according to the obstacle's height, orientation, and distance. Combining all of these algorithms, the system leads the visually impaired user to the destination.
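The reference lines above follow from the standard pinhole perspective-projection model: a ground-plane point at forward depth Z and lateral offset X from a camera mounted at height H projects to pixel coordinates through the intrinsics. A minimal sketch under that model, assuming a camera whose optical axis is parallel to the ground; the intrinsic values, camera height, and lateral clearance below are illustrative assumptions, not the thesis's calibration.

```python
def project_ground_point(X, Z, H, fx, fy, cx, cy):
    """Project a ground-plane point to pixel coordinates with a
    pinhole camera at height H above the ground, optical axis
    parallel to the ground.

    Camera frame: x right, y down, z forward; the ground point
    sits at (X, H, Z) because the ground lies H below the
    optical center.
    """
    u = fx * X / Z + cx   # horizontal pixel coordinate
    v = fy * H / Z + cy   # vertical pixel coordinate (below image center)
    return u, v

# Draw one keep-right reference line: project the desired lateral
# clearance at several depths and connect the resulting pixels.
fx = fy = 700.0           # focal lengths in pixels (assumption)
cx, cy = 320.0, 240.0     # principal point (assumption)
H = 1.4                   # camera height above ground in meters (assumption)
lateral = 0.5             # desired lateral clearance in meters (assumption)
line = [project_ground_point(lateral, Z, H, fx, fy, cx, cy)
        for Z in (2.0, 4.0, 8.0)]
```

Nearer depths project lower and farther from the image center, so connecting the sampled pixels yields the slanted reference line that the road edge is compared against.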