Electronic Theses and Dissertations — Record 111522043 (Detailed Information)




Author: Chih-Chia Chen (陳治嘉)   Department: Computer Science and Information Engineering
Title: BB-YOLO: Design and Implementation of Hardware Accelerator for YOLO Object Detection Network Based on Binary Backbone
(Chinese title: BB-YOLO:基於二值主幹網路之 YOLO 物件偵測硬體加速器設計與實現)
Related Theses
★ An Intelligent Controller Development Platform Integrating the GRAFCET Virtual Machine
★ Design and Implementation of a Distributed Industrial Electronic Kanban Network System
★ Design and Implementation of a Dual-Touch Screen Based on a Two-Camera Vision System
★ An Embedded Computing Platform for Intelligent Robots
★ An Embedded System for Real-Time Moving-Object Detection and Tracking
★ A Multiprocessor Architecture and Distributed Control Algorithm for Solid-State Drives
★ A Human-Computer Interaction System Based on Stereo-Vision Gesture Recognition
★ Robot System-on-Chip Design Integrating Bio-Inspired Intelligent Behavior Control
★ Design and Implementation of an Embedded Wireless Image Sensor Network
★ A License Plate Recognition System Based on a Dual-Core Processor
★ Continuous 3D Gesture Recognition Based on Stereo Vision
★ Design and Hardware Implementation of a Miniature, Ultra-Low-Power Wireless Sensor Network Controller
★ Real-Time Face Detection, Tracking, and Recognition in Streaming Video: An Embedded System Design
★ Embedded Hardware Design of a Fast Stereo Vision System
★ Design and Implementation of a Real-Time Continuous Image Stitching System
★ An Embedded Gait Recognition System Based on a Dual-Core Platform
Full Text: available in the system after 2029-7-22 (embargoed)
Abstract
Deep learning has demonstrated remarkable performance in object detection. However, its heavy computational and memory demands make it difficult to deploy on edge devices that have limited computing resources yet require real-time operation. To address this, this thesis proposes a YOLO object detection network built on a binary backbone: the backbone is ReActNet, which relies mainly on binary convolution and thus greatly reduces the parameter count and model size. Using a hierarchical modular design method, the network is further realized as BB-YOLO, a hardware accelerator with a flexible architecture, incorporating a pipelined design and fixed-point arithmetic in place of floating-point operations to increase inference speed and reduce hardware resource usage. Experimental results show that the accelerator processes a single image in 9.0967 µs, an excellent speedup over a GPU-equipped host platform. The proposed BB-YOLO hardware accelerator combines a flexible architecture with real-time inference, offering a practical solution for real-time object detection in hardware-constrained environments.
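The binary convolution at the heart of a ReActNet-style backbone can be illustrated with a small sketch. This is not the thesis's implementation; it only shows, under assumed {-1, +1} quantization, how a dot product reduces to an XNOR followed by a popcount — the identity that makes binary backbones cheap in hardware. All function names here are hypothetical.

```python
import numpy as np

def binarize(x):
    """Map real values to {-1, +1} via the sign function (0 maps to +1)."""
    return np.where(np.asarray(x) >= 0, 1, -1).astype(np.int8)

def binary_dot(a_bits, w_bits):
    """Dot product of two {0,1} bit vectors interpreted as {-1,+1} values.

    With n elements, matching bits (XNOR == 1) contribute +1 and mismatches
    contribute -1, so the result is 2 * popcount(XNOR) - n.
    """
    n = a_bits.size
    xnor = ~(a_bits ^ w_bits) & 1  # XNOR restricted to the low bit
    return 2 * int(xnor.sum()) - n

def binary_conv1d(activations, weights):
    """Valid-mode 1-D binary convolution (sliding dot product, no kernel flip)."""
    a_bits = (binarize(activations) > 0).astype(np.uint8)  # {-1,+1} -> {0,1}
    w_bits = (binarize(weights) > 0).astype(np.uint8)
    k = w_bits.size
    out = [binary_dot(a_bits[i:i + k], w_bits)
           for i in range(a_bits.size - k + 1)]
    return np.array(out)

# Cross-check: the XNOR-popcount result equals the plain +/-1 dot product.
rng = np.random.default_rng(0)
a = rng.standard_normal(16)
w = rng.standard_normal(5)
ref = np.array([int(binarize(a)[i:i + 5].astype(int) @ binarize(w).astype(int))
                for i in range(12)])
assert np.array_equal(binary_conv1d(a, w), ref)
```

In hardware, the XNOR and popcount map to simple gates and an adder tree, which is why replacing full-precision multiply-accumulate with this pattern cuts both logic area and memory traffic.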
Keywords
★ Binary convolution
★ Object detection network
★ FPGA
★ Hardware accelerator
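The fixed-point substitution mentioned in the abstract can likewise be sketched. The format below (32-bit storage with 12 fractional bits, Q-format style) is an assumption for illustration, not the word length used in the thesis; it shows how scaled-integer multiply-accumulate avoids floating-point units.

```python
import numpy as np

FRAC_BITS = 12  # assumed number of fractional bits (f in a Qm.f format)

def to_fixed(x):
    """Quantize a float array to integers carrying FRAC_BITS fractional bits."""
    return np.round(np.asarray(x) * (1 << FRAC_BITS)).astype(np.int32)

def fixed_mul(a, b):
    """Multiply fixed-point values; rescale the double-width product."""
    return (a.astype(np.int64) * b) >> FRAC_BITS

def to_float(x):
    """Convert fixed-point back to float for inspection."""
    return x / (1 << FRAC_BITS)

a = to_fixed([0.5, -1.25, 3.0])
b = to_fixed([2.0, 0.5, -0.75])
prod = to_float(fixed_mul(a, b))
# Exact here because all operands are exactly representable in 12 fractional bits.
assert np.allclose(prod, [1.0, -0.625, -2.25])
```

The shift after the multiply is the only "normalization" step needed, so on an FPGA each multiply-accumulate becomes an integer DSP operation plus a wire shift, with no floating-point hardware involved.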
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Research Background
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2: Literature Review
2.1 Binary Convolutional Neural Networks
2.1.1 Binarized Neural Network
2.1.2 XNOR-Net
2.1.3 ReActNet
2.2 YOLO
2.2.1 YOLOv5
2.2.2 YOLOv8
Chapter 3: Design of the Binary-Backbone Object Detection Hardware Accelerator
3.1 System Design Methodology
3.1.1 IDEF0 Hierarchical Modular Design
3.1.2 GRAFCET Discrete-Event Modeling
3.2 Architecture and IDEF0 Model of the Binary-Backbone Object Detection Hardware Accelerator
3.2.1 BB-YOLO Hardware Accelerator Architecture
3.2.2 BB-YOLO Hardware Accelerator IDEF0 Model
3.3 GRAFCET Models of the Binary-Backbone Object Detection Hardware Accelerator
3.3.1 GRAFCET of the Backbone Module
3.3.2 GRAFCET of the Neck and First-Half Head Modules
3.3.3 GRAFCET of the Second-Half Head Module
3.4 Pipelined Design of the Hardware Accelerator
Chapter 4: Experimental Results
4.1 Software and Hardware Development Environment
4.2 Software Object Detection Experiments
4.2.1 Object Detection Dataset
4.2.2 Reduced Architecture of the BB-YOLO Binary-Backbone Object Detection Network
4.2.3 Software Object Detection Results
4.3 Hardware Synthesis and Verification
4.3.1 Pipeline Controller Module
4.3.2 G1 Module
4.3.3 G2_1 Module
4.3.4 G2_2 Module
4.3.5 G3 Module
4.3.6 Hardware Synthesis Resource Usage
4.4 Analysis of Software and Hardware Experimental Results
Chapter 5: Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
References
Advisor: Ching-Han Chen (陳慶瀚)   Approval Date: 2024-7-23
