Thesis title BB-YOLO: Design and Implementation of Hardware Accelerator for YOLO Object Detection Network Based on Binary Backbone
(Original title: BB-YOLO:基於二值主幹網路之 YOLO 物件偵測硬體加速器設計與實現)
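Abstract (Chinese) Deep learning has delivered outstanding results in object detection, but its heavy computation and memory footprint make it difficult to use on edge devices that have limited computing resources and require real-time operation. To address this problem, this thesis proposes a YOLO object detection network built on a binary backbone: the backbone is ReActNet, which consists mainly of binary convolution operations, a strategy that greatly reduces the number of model parameters and the model size. The network is then developed further, through a hierarchical modular design method, into BB-YOLO, a binary-backbone object detection hardware accelerator with a flexible architecture. The accelerator adds pipelining and replaces floating-point arithmetic with fixed-point arithmetic to raise neural network inference speed and reduce hardware resource usage. According to the experimental results, the hardware accelerator needs 9.0967 µs to run detection on a single image, a marked inference speedup over a host platform equipped with a graphics processing unit. The proposed BB-YOLO hardware accelerator thus combines a flexible architecture with real-time inference, offering one solution for real-time object detection where hardware resources are limited.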
Abstract (English) Deep learning has demonstrated remarkable performance in object detection. However, its heavy computational and memory requirements make it difficult to deploy on edge devices that have limited computing resources and must operate in real time. To address this issue, this thesis proposes a YOLO object detection network built on a binary backbone. The backbone is ReActNet, which consists mainly of binary convolution operations, and this choice significantly reduces the number of model parameters and the model size. Using a hierarchical modular design approach, the network is then developed into BB-YOLO, a hardware accelerator with a flexible architecture. The accelerator is pipelined and replaces floating-point operations with fixed-point arithmetic to increase neural network inference speed and reduce hardware resource usage. Experimental results show that the hardware accelerator requires 9.0967 µs for single-image detection, a substantial inference speedup over a host platform equipped with a graphics processing unit. The proposed BB-YOLO hardware accelerator therefore combines a flexible architecture with real-time inference and offers a viable solution for real-time object detection in hardware-constrained environments.
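The two ideas the abstract relies on, binary convolution and fixed-point arithmetic, can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the bit-packing convention (bit 1 encodes +1, bit 0 encodes -1), and the 8 fractional bits are assumptions chosen for illustration. A binary dot product, the inner operation of a binary convolution, reduces to XNOR followed by a population count, and a fixed-point multiply reduces to an integer multiply and a shift.

```python
def binarize(values):
    """Pack the signs of real values into an int: bit i is 1 if values[i] >= 0 (+1), else 0 (-1)."""
    bits = 0
    for i, v in enumerate(values):
        if v >= 0:
            bits |= 1 << i
    return bits


def binary_dot(a_bits, w_bits, n):
    """Dot product of two length-n {-1, +1} vectors stored as packed bits."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask      # 1 wherever the two signs agree
    matches = bin(xnor).count("1")        # population count
    return 2 * matches - n                # agreements minus disagreements


def to_fixed(x, frac_bits=8):
    """Convert a real number to a fixed-point integer (frac_bits is an assumed format)."""
    return int(round(x * (1 << frac_bits)))


def fixed_mul(a, b, frac_bits=8):
    """Multiply two fixed-point integers and rescale back to frac_bits."""
    return (a * b) >> frac_bits


activations = binarize([0.5, -1.2, 0.3, -0.7])   # signs: +, -, +, -
weights = binarize([1.0, 0.4, -0.9, -0.2])       # signs: +, +, -, -
dot = binary_dot(activations, weights, 4)        # (+1) + (-1) + (-1) + (+1)
product = fixed_mul(to_fixed(1.5), to_fixed(0.25))
```

Because the dot product needs only bitwise logic and a counter, and the fixed-point multiply needs only integer hardware, both map onto far fewer logic resources than floating-point multiply-accumulate units, which is the motivation the abstract gives for these choices.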
Keywords (Chinese) ★ Binary convolution
★ Object detection network
★ Hardware accelerator
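Table of contents
Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1 Research Background
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2. Literature Review
2.1 Binary Convolutional Neural Networks
2.1.1 Binarized Neural Network
2.1.2 XNOR-Net
2.1.3 ReActNet
2.2 YOLO
2.2.1 YOLOv5
2.2.2 YOLOv8
Chapter 3. Design of the Binary-Backbone Object Detection Hardware Accelerator
3.1 System Design Methodology
3.1.1 IDEF0 Hierarchical Modular Design
3.1.2 GRAFCET Discrete-Event Modeling
3.2 Architecture and IDEF0 of the Binary-Backbone Object Detection Hardware Accelerator
3.2.1 Architecture of the BB-YOLO Binary-Backbone Object Detection Hardware Accelerator
3.2.2 IDEF0 of the BB-YOLO Binary-Backbone Object Detection Hardware Accelerator
3.3 GRAFCET of the Binary-Backbone Object Detection Hardware Accelerator
3.3.1 GRAFCET of the Backbone Module
3.3.2 GRAFCET of the Neck and First-Half Head Module
3.3.3 GRAFCET of the Second-Half Head Module
3.4 Pipelined Design of the Hardware Accelerator
Chapter 4. Experimental Results
4.1 Software and Hardware Development Environment
4.2 Software Object Detection Experiments
4.2.1 Object Detection Dataset
4.2.2 Reduced Architecture of the BB-YOLO Binary-Backbone Object Detection Network
4.2.3 Software Object Detection Experimental Results
4.3 Hardware Synthesis and Verification
4.3.1 Pipeline Controller Module
4.3.2 G1 Module
4.3.3 G2_1 Module
4.3.4 G2_2 Module
4.3.5 G3 Module
4.3.6 Hardware Synthesis Resources
4.4 Analysis of Software and Hardware Experimental Results
Chapter 5. Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References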
