Thesis/Dissertation 111552022: Detailed Record

Name 陳彥廷 (Yen-Ting Chen)
Thesis title ERCNet: Enhancing ReActNet with a Compact ECA Branch
(ERCNet:以精簡的ECA分支增強ReActNet)
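Abstract (Chinese) Since Courbariaux pioneered binary neural networks in 2016, greatly reducing the parameter count and computation cost of convolutional neural networks, subsequent research has steadily narrowed the capability gap with floating-point networks. Among the many binary models, ReActNet stands out.
Experiments show that ERCNet achieves 2.39% higher Top-1 accuracy on CIFAR100 than the original ReActNet, while reducing memory footprint and computation by about 10% and 8%, respectively. In the object detection experiments, ERCNet is ported into the YOLOv8 backbone. On the KITTI dataset, ERCNet outperforms the floating-point YOLOv8, reaching 94.8% mAP50 and surpassing YOLOv8-L and YOLOv8-N by 1.9% and 11.2%, respectively.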
Abstract (English) Since Courbariaux pioneered the Binary Neural Network (BNN) in 2016 to dramatically reduce the storage and computation cost of CNNs for lightweight applications, researchers have made continued efforts to drive that cost down while minimizing the loss of representation capacity and the accuracy gap relative to real-valued counterparts. Among these efforts, ReActNet, which achieves 62.16% Top-1 accuracy on CIFAR100, sets a new benchmark. In this thesis, we aim to improve its performance further at an even lower overall cost.
We redesign the General Building block of ReActNet (GBR) to raise accuracy on the CIFAR100 image classification dataset, the PASCAL VOC 07+12 object detection dataset, and the KITTI vision benchmark suite, while lowering memory footprint and computation cost. The GBR comprises a single Down-sampling Block (DB) and a plurality of Common Blocks (CBs). First, we eliminate all the 1x1 Binary Convolutional (BConv) layers of the CBs to reduce the number of weight parameters and the network size. Second, the duplicated 1x1 BConv of the DB is replaced by Efficient Channel Attention (ECA) to enrich the representation capacity. Third, a Batch Normalization (BN) unit is added right after the Concatenator of the DB so that the data distribution is better suited to optimization. Finally, the shortcut connection is placed after the RPReLU activation unit to balance information preservation on the shortcut path against information transformation on the residual path. Our experiments show that the enhanced network (ERCNet) delivers 2.39% higher Top-1 accuracy on CIFAR100 than the original ReActNet, with around 10% lower memory and 8% fewer FLOPs. Under the YOLOv8 framework it reaches 81.8% mAP50 on the PASCAL VOC 07+12 dataset, surpassing ReActNet by 0.8%. Furthermore, on the KITTI dataset ERCNet outperforms every model that uses the official YOLOv8 backbone, reaching 94.8% mAP50 and exceeding YOLOv8-L and YOLOv8-N by 1.9% and 11.2%, respectively. On the other hand, ERCNet performs slightly worse than the default YOLOv8 backbone when both are compared on PASCAL VOC 07+12.
Our experiments indicate that ERCNet outperforms real-valued CNNs on certain datasets such as KITTI, at lower memory and computation cost. ERCNet therefore makes BNNs more suitable for dataset-specific applications on lightweight devices.
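As an illustration of the Efficient Channel Attention (ECA) operation named in the abstract, the following is a minimal pure-Python sketch of generic ECA: global average pooling per channel, a 1D convolution across neighbouring channels, a sigmoid gate, and channel-wise rescaling. It is not the thesis's implementation. The function name `eca`, the nested-list tensor layout, and the example kernel weights are assumptions made for illustration, and the sketch does not show how the ECA branch is wired into the Down-sampling Block.

```python
import math


def sigmoid(x):
    """Logistic function used as the attention gate."""
    return 1.0 / (1.0 + math.exp(-x))


def eca(feature_map, kernel):
    """Generic ECA over a feature map (illustrative sketch, hypothetical API).

    feature_map: list of C channels, each an H x W list of lists of floats.
    kernel: odd-length list of 1D convolution weights shared by all channels.
    Returns (rescaled feature map, per-channel gates).
    """
    # 1. Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    # 2. 1D convolution across the channel axis with zero padding, so each
    #    channel's logit depends only on its k nearest channel neighbours.
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    logits = [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(pooled))
    ]

    # 3. Sigmoid turns each logit into a gate in (0, 1).
    gates = [sigmoid(z) for z in logits]

    # 4. Rescale every value of a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    # Three channels of size 1 x 2; the kernel weights are arbitrary.
    x = [[[2.0, 2.0]], [[4.0, 4.0]], [[0.0, 0.0]]]
    _, gates = eca(x, [0.25, 0.5, 0.25])
    print(gates)
```

The only learnable parameters in this operation are the k kernel weights, which is why ECA is usually described as a lightweight way to add channel attention; whether that fully accounts for the savings reported above is not stated in the abstract.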
Keywords (Chinese) ★ Binarized convolutional neural network
★ Efficient channel attention mechanism
★ Image recognition
★ Object detection
Keywords (English) ★ Binary neural network
★ Efficient channel attention
★ Classification
★ Object detection
Advisor 陳慶瀚 (Ching-Han Chen) Approval date 2024-6-12
