Thesis 106522051: Detailed Record




Author  Pei-Ying Lee (李佩瑩)    Department  Computer Science and Information Engineering
Thesis title  Collision warning for car door opening with a light convolutional neural network
(Chinese title: 輕量化卷積神經網路的車門開啟防撞警示)
Related theses
★ A video error concealment method for large corrupted areas and scene changes
★ Force feedback correction and rendering in a virtual haptic system
★ Multispectral satellite image fusion and infrared image synthesis
★ A laparoscopic cholecystectomy simulation system
★ Dynamically loaded multiresolution terrain modeling in a flight simulation system
★ Wavelet-based multiresolution terrain modeling and texture mapping
★ Multiresolution optical flow analysis and depth computation
★ Volume-preserving deformation modeling for laparoscopic surgery simulation
★ Interactive multiresolution model editing techniques
★ Wavelet-based multiresolution edge tracking for edge detection
★ Multiresolution modeling based on quadric error and attribute criteria
★ Progressive image compression based on integer wavelet transform and grey theory
★ Tactical simulation built on dynamically loaded multiresolution terrain modeling
★ Face detection and feature extraction using spatial relations of multilevel segmentation
★ Wavelet-based image watermarking and compression
★ Appearance-preserving and view-dependent multiresolution modeling
  1. The electronic full text of this thesis is approved for immediate open access.
  2. The released electronic full text is authorized only for personal, non-profit retrieval, reading, and printing for academic research purposes.
  3. Please comply with the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese)  As the numbers of cars and motorcycles in Taiwan climb year by year, and because of high population density, narrow roads, and insufficient parking, pedestrians and vehicles constantly compete for road space and double parking is common. Accidents in which someone stopping at the roadside opens the car door without noticing traffic approaching from behind occur again and again, so preventing careless door opening from causing collisions has become an important research topic. In this thesis, we propose a car-door-opening collision warning system based on a lightweight convolutional neural network (CNN). Using a camera as the sensor, the system monitors pedestrians, bicycles, motorcycles, and cars approaching from behind and warns the driver before a possible collision, protecting the driver, the passengers, and other road users.
  The thesis has two parts. The first part is the lightweight convolutional neural network: MobileNet V2 with width multiplier 1.6 replaces the original Darknet-53 in YOLOv3 as the feature extractor, reducing the computation and the parameter storage required at run time; the FPN-like (feature pyramid networks, FPN) structure in YOLOv3 then detects and recognizes rear moving objects on feature maps of three different scales. The second part uses the object coordinates and classes output by the first part: a top-view transform maps the original image onto a virtual image plane parallel to the ground, from which the longitudinal and lateral distances and the estimated time to collision (TTC) are computed as the basis for the warning.
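As a rough illustration of the TTC-based warning criterion described above, the rule can be sketched as follows; the function names, the fixed frame interval, and the 1.5-second threshold are illustrative assumptions, not values taken from the thesis:

```python
def time_to_collision(dist_m, prev_dist_m, dt_s):
    """Estimate TTC from two successive longitudinal distance measurements."""
    closing_speed = (prev_dist_m - dist_m) / dt_s  # m/s; positive when approaching
    if closing_speed <= 0:
        return float("inf")  # object holding distance or moving away
    return dist_m / closing_speed

def should_warn(ttc_s, threshold_s=1.5):
    """Warn the driver when the estimated TTC drops below the threshold."""
    return ttc_s < threshold_s

# An object measured at 11 m in the previous frame and 10 m now,
# with a 0.1 s frame interval, closes at 10 m/s, giving a TTC of 1.0 s.
ttc = time_to_collision(10.0, 11.0, 0.1)
print(should_warn(ttc))  # prints: True
```

In the system described above, the distances would come from the top-view coordinates of the detected objects; here they are plain numbers for clarity.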
  In the experiments, our YOLOv3-MobileNet V2 width-1.6 architecture reduces the number of parameters by about 2.45 times and the amount of computation by about 3.24 times compared with YOLOv3. Tested on 960×540 video, the average execution speed is 28 frames per second, and the mAP of the object detection system reaches 88.43%.
Abstract (English)  In most cities, traffic is crowded and chaotic. Drivers sometimes find it difficult to stop their cars clear of the moving traffic stream, and inconsiderate drivers stop arbitrarily to get out. In these situations, an abruptly opened car door may be struck by following cars or motorcycles.
To avoid such collisions, we propose a car-door-opening warning system based on a lightweight convolutional neural network combined with a location estimator. First, the lightweight convolutional neural network detects and recognizes moving objects in the images, including approaching cars, motorcycles, pedestrians, and other moving objects. A modified MobileNet V2 replaces the original Darknet-53 in YOLOv3 to shrink the amount of computation and the number of network parameters. Then the locations of the detected objects are transformed from the coordinate system of the captured images into that of the transformed top-view images to estimate the relative locations of the approaching objects.
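The image-to-ground coordinate transform described here is a planar homography. The sketch below shows only the mechanics of applying a 3×3 homography to the bottom-center point of a detected bounding box; the matrix values are placeholders, since a real H must be derived from the calibrated camera parameters:

```python
def apply_homography(H, x, y):
    """Map an image point (x, y) onto the ground plane with a 3x3 homography H."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    gx = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    gy = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return gx, gy

# Placeholder homography (a pure translation) used only for illustration.
H = [[1.0, 0.0, -480.0],
     [0.0, 1.0, -540.0],
     [0.0, 0.0, 1.0]]

# Bottom-center of a detected bounding box in a 960x540 image.
gx, gy = apply_homography(H, 500.0, 520.0)
print(gx, gy)  # prints: 20.0 -20.0
```

The bottom-center of the box is the natural point to project, since it is where the object touches the ground plane that the homography assumes.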
To evaluate the performance of the proposed system, several experiments and comparisons were conducted and reported. On 3164 test images, the mAP of the object detection system reaches 88.43%, and the average execution speed on 960×540 images is 28 frames per second. Compared with the original YOLOv3, the parameter count and the amount of computation are reduced by 2.45 times and 3.24 times, respectively.
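Much of the reported parameter saving comes from MobileNet V2's depthwise separable convolutions replacing standard ones. The back-of-the-envelope comparison below uses an illustrative layer shape, not the thesis's actual configuration:

```python
def standard_conv_params(k, c_in, c_out):
    """Weight count of a standard k x k convolution."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weight count of a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

std = standard_conv_params(3, 256, 256)   # 589,824 weights
sep = separable_conv_params(3, 256, 256)  # 67,840 weights
print(f"{std / sep:.1f}x")  # prints: 8.7x
```

A single layer shrinks far more than the network-wide 2.45× reported above, because the detection head and the pointwise convolutions that dominate MobileNet V2 see smaller savings.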
Keywords (Chinese)  ★ car-door-opening collision warning system
★ convolutional neural network
Keywords (English)  ★ DOW
★ CNN
Table of contents  Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of contents
List of figures
List of tables
Chapter 1  Introduction
  1.1 Motivation
  1.2 System architecture
  1.3 Thesis organization
Chapter 2  Related work
  2.1 Development of CNN-based object detection systems
  2.2 Lightweight convolutional neural networks
Chapter 3  Detection and recognition of moving objects
  3.1 The YOLOv3 architecture
  3.2 The architecture based on YOLOv3 and MobileNet V2 width 1.6
Chapter 4  Distance and time-to-collision estimation for moving objects
  4.1 Camera calibration
  4.2 Top-view transform
  4.3 Distance and time-to-collision estimation
Chapter 5  Experiments and results
  5.1 Experimental equipment
  5.2 Training the convolutional neural network
  5.3 Comparison and evaluation of network architectures
  5.4 Time-to-collision estimation for moving objects
  5.5 Demonstration of the door-opening collision warning system
Chapter 6  Conclusions and future work
References
Advisor  Ding-Chang Tseng (曾定章)    Date of approval  2019-7-24
