A New Lightweight Object Detection System for Edge Computing

Name

張育珉 (Yu-Min Zhang)

Department

Department of Computer Science and Information Engineering

Thesis Title

CSL-YOLO: A New Lightweight Object Detection System for Edge Computing
(original title: 用於邊緣計算的全新輕量化物件偵測系統)

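Related Theses

★ A Neural Architecture Search Method Based on a Targeted Training Strategy and a Strong Predictor
★ A 3D Point Cloud Classification Network Based on Self-Attention and Fitted-Plane-Aware Local Geometry
★ Improving the Performance of Transformer-Based Object Detection Algorithms with ε-greedy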

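The author has authorized this electronic thesis for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.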

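Abstract (translated from Chinese)

Because high-end GPUs carry barriers such as high prices and large physical size, developing lightweight object detectors is essential. To avoid wasting computing resources, reducing redundant computation plays an important role. This thesis proposes a new lightweight convolution method, the Cross-Stage Lightweight (CSL) Module, which uses cheap operations to generate the more redundant feature maps. In the intermediate depth-expansion stage, we replace the pointwise convolution used in previous work with depthwise convolution to generate candidate feature maps. The proposed CSL-Module significantly reduces computation cost, and experiments on CIFAR-10 show that it can approximate the fitting ability of a 3x3 convolution. Finally, we build a new lightweight object detector, CSL-YOLO, on the CSL-Module and its derived modules. Experiments on MS-COCO show that, compared with Tiny-YOLOv4, CSL-YOLO achieves better detection performance with only 43% of the FLOPs and 52% of the parameters, reaching a state-of-the-art level.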

Abstract (English)

The development of lightweight object detectors is essential because computation resources are limited. To reduce the computation cost, how redundant features are generated plays a significant role. This paper proposes a new lightweight convolution method, the Cross-Stage Lightweight (CSL) Module, which generates redundant features from cheap operations. In the intermediate expansion stage, we replace pointwise convolution with depthwise convolution to produce candidate features. The proposed CSL-Module reduces the computation cost significantly. Experiments conducted on MS-COCO show that the proposed CSL-Module can approximate the fitting ability of a 3x3 convolution. Finally, we use the module to construct a lightweight detector, CSL-YOLO, which achieves better detection performance than Tiny-YOLOv4 with only 43% of its FLOPs and 52% of its parameters.
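The cost argument behind swapping pointwise for depthwise convolution can be illustrated with standard parameter and multiply-accumulate (MAC) counts. The sketch below is generic cost accounting, not the CSL-Module itself; the 32x32 feature map, the 64 channels, stride 1, "same" padding, and the absence of bias terms are all assumptions chosen only for illustration, and the function names are hypothetical.

```python
def conv_cost(h, w, c_in, c_out, k):
    """Parameters and MACs of a standard k x k convolution
    (assumed: stride 1, 'same' padding, no bias)."""
    params = k * k * c_in * c_out
    macs = params * h * w  # every output position reuses all weights once
    return params, macs


def pointwise_cost(h, w, c_in, c_out):
    """A pointwise convolution is a standard convolution with k = 1."""
    return conv_cost(h, w, c_in, c_out, 1)


def depthwise_cost(h, w, c, k):
    """A depthwise convolution applies one k x k filter per channel,
    so there is no mixing across channels."""
    params = k * k * c
    macs = params * h * w
    return params, macs


# Illustrative sizes (assumptions, not values taken from the thesis).
H, W, C, K = 32, 32, 64, 3

costs = {
    "standard 3x3": conv_cost(H, W, C, C, K),
    "pointwise 1x1": pointwise_cost(H, W, C, C),
    "depthwise 3x3": depthwise_cost(H, W, C, K),
}

for name, (params, macs) in costs.items():
    print(f"{name:>14}: {params:>6} params, {macs:>9} MACs")
```

With equal input and output channel counts, the depthwise-to-pointwise cost ratio is k*k / C, so under these assumptions the depthwise layer is cheaper whenever the channel count exceeds 9.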

Keywords (Chinese)

★ Lightweight object detector

Keywords (English)

★ YOLO
★ MS-COCO

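Table of Contents

1. Introduction
2. Literature Review
2.1. Lightweight Convolution Methods
2.2. Lightweight Object Detectors
2.2.1. SSD Family
2.2.2. YOLO Family
2.2.3. Main Differences Between SSD and YOLO
3. Methodology
3.1. CSL-Module
3.1.1. Comparison with Other Lightweight Convolution Methods
3.1.2. Theoretical Speed Analysis
3.2. Building Lightweight Components
3.2.1. Lightweight Backbone Network CSL-Bone
3.2.2. Lightweight Feature Pyramid Network CSL-FPN
4. Implementation Details and Component-Level Experiments
4.1. Implementation Details of CSL-Module
4.2. Implementation Details of CSL-Bone
4.3. Implementation Details of CSL-FPN
4.4. Implementation Details of CSL-YOLO
4.4.1. Anchors Constraint
4.4.2. Non-Exponential Prediction
4.4.3. Loss Function
5. Experimental Results
5.1. Datasets
5.2. Testing CSL-YOLO on MS-COCO
6. Conclusion
6.1. Contributions
6.2. Future Work
7. References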

Advisors

范國清 (Kuo-Chin Fan), 謝君偉 (Jun-Wei Hsieh)

Approval Date

2021-8-2
