Abstract (English)
In this thesis, deep learning techniques are applied to outdoor obstacle identification and obstacle distance detection. In addition, an outdoor wearable guide device is designed to provide a safer and more portable guidance system for visually impaired people walking outdoors.
Obstacle distance detection proceeds in two ways. The first uses a monocular camera: a monocular depth-estimation neural network produces a disparity image from the input image, and regression analysis converts the disparity image into a depth image. Within each detected obstacle region, a histogram statistic over the depth values yields the output obstacle distance, from which the obstacle's height and position are then computed. Obstacle detection itself uses either a semantic segmentation network or an object detection network, and the obstacle distance is calculated from the recognition results. The object detection network is modified to additionally predict the rotation angle of the bounding box, so that the box fits the obstacle more tightly. Based on these distance-detection results, semantic segmentation paired with monocular depth is applied to the guide robot, while object detection paired with monocular depth is applied to the wearable guide device for subsequent obstacle-avoidance control. The second way uses a stereo camera to compute the depth image of the input image, with object detection identifying the obstacles; the depth values of the pixels inside each detected obstacle region are sorted in ascending order, and the first quartile is taken as the obstacle distance. Stereo depth paired with object detection is likewise used in the wearable guide device for obstacle-avoidance control; it runs faster and can additionally be integrated with the signboard tracking system.
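The first-quartile rule for the stereo branch can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function name, the box format (x1, y1, x2, y2), and the handling of invalid depth pixels are assumptions.

```python
import numpy as np

def obstacle_distance(depth_map, box):
    """Estimate obstacle distance as the first quartile (25th
    percentile) of the valid depth values inside a detection box.

    Hypothetical helper illustrating the quartile rule described
    in the abstract; names and box format are assumptions.
    """
    x1, y1, x2, y2 = box
    roi = depth_map[y1:y2, x1:x2]
    # Stereo depth maps contain holes; keep only finite, positive values.
    valid = roi[np.isfinite(roi) & (roi > 0)]
    if valid.size == 0:
        return None
    return float(np.percentile(valid, 25))

# Example: a 4x4 depth patch in meters, with one invalid (zero) pixel
# and a far background column at 9.0 m that the quartile ignores.
depth = np.array([[2.0, 2.1, 2.2, 9.0],
                  [2.0, 2.1, 2.3, 9.0],
                  [0.0, 2.2, 2.4, 9.0],
                  [2.1, 2.2, 2.5, 9.0]])
d = obstacle_distance(depth, (0, 0, 4, 4))
```

Taking the first quartile rather than the minimum or mean makes the estimate robust to both stereo-matching outliers and background pixels caught inside the box.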
For safety, visually impaired people need to walk on the right side of the road. The thesis therefore designs a keep-to-the-right algorithm: given the camera mounting height and the camera intrinsics, perspective projection determines where a point at a given real-world depth and lateral width projects onto the image plane. With this method, a reference line is drawn at the road-to-user width, together with a reference line at the width of the user's left half body, forming reference lines on both sides. Combined with semantic segmentation to locate the road area, the relative relation between the two reference lines and the road edge is used to remind the visually impaired user to correct course by going straight, moving left or right, or rotating. At the same time, based on the obstacle information, an obstacle-avoidance control method is designed to perform actions such as avoiding an obstacle, stepping over it, or stopping, according to the obstacle's height, orientation, and distance. Combining all of these algorithms, the system leads the visually impaired user to the destination.
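The reference lines above follow from the standard pinhole perspective-projection model: a ground-plane point at forward depth Z and lateral offset X from a camera mounted at height H projects to pixel coordinates through the intrinsics. A minimal sketch under that model, assuming a camera whose optical axis is parallel to the ground; the intrinsic values, camera height, and lateral clearance below are illustrative assumptions, not the thesis's calibration.

```python
def project_ground_point(X, Z, H, fx, fy, cx, cy):
    """Project a ground-plane point to pixel coordinates with a
    pinhole camera at height H above the ground, optical axis
    parallel to the ground.

    Camera frame: x right, y down, z forward; the ground point
    sits at (X, H, Z) because the ground lies H below the
    optical center.
    """
    u = fx * X / Z + cx   # horizontal pixel coordinate
    v = fy * H / Z + cy   # vertical pixel coordinate (below image center)
    return u, v

# Draw one keep-right reference line: project the desired lateral
# clearance at several depths and connect the resulting pixels.
fx = fy = 700.0           # focal lengths in pixels (assumption)
cx, cy = 320.0, 240.0     # principal point (assumption)
H = 1.4                   # camera height above ground in meters (assumption)
lateral = 0.5             # desired lateral clearance in meters (assumption)
line = [project_ground_point(lateral, Z, H, fx, fy, cx, cy)
        for Z in (2.0, 4.0, 8.0)]
```

Nearer depths project lower and farther from the image center, so connecting the sampled pixels yields the slanted reference line that the road edge is compared against.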