Master's and Doctoral Theses: Detailed Record for 111522092




Author: Zi-Qing Liang (梁字清)    Department: Computer Science and Information Engineering
Thesis Title: 利用權重標準分流二進位神經網路做邊緣計算之影像辨識
(Weight Standardization Fractional Binary Neural Network (WSFracBNN) for Image Recognition in Edge Computing)
Files: Full text viewable in the system (available after 2029-7-2)
Abstract (Chinese): To achieve better accuracy, modern models are designed with ever larger networks, and their computational load grows exponentially, which makes them difficult to deploy for edge computing. Binary Neural Networks (BNNs) quantize the convolution filter weights and activations to 1 bit, which makes them well suited to small chips such as ARM and FPGA and to other edge computing devices. To design a model that is friendlier to edge devices, reducing the amount of floating-point computation plays a key role. Batch normalization (BN) is an important tool for BNNs; however, once the convolution layers are quantized to 1 bit, the floating-point cost of the BN layers becomes comparatively expensive. This thesis reduces the floating-point operations by removing the BN layers from the model, introduces the Scaled Weight Standardization Convolution (WS-Conv) method to avoid the large accuracy drop that follows the removal of BN, and improves the model's performance through a series of optimizations. Specifically, our model keeps its computational cost and accuracy competitive even without BN layers; with the addition of a series of training methods, its accuracy on CIFAR-100 is still 0.6% higher than the baseline, while the total computational load is only 46% of the baseline. With BOPs unchanged, FLOPs are reduced to nearly zero, making the model better suited to embedded platforms such as FPGA.
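For context on the 1-bit quantization described above, the following is a minimal PyTorch sketch of the standard sign-function binarizer with a clipped straight-through estimator (STE), the common building block of BNNs such as XNOR-Net [15] and Bi-Real Net [1]. It is illustrative only, not the exact implementation used in WSFracBNN; the class and function names are assumptions.

```python
import torch


class BinarySign(torch.autograd.Function):
    """Sign binarizer with a clipped straight-through estimator (STE)."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Map every element to -1 or +1 (0 is sent to +1).
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Pass the gradient through inside [-1, 1], zero it outside.
        return grad_output * (x.abs() <= 1).to(grad_output.dtype)


def binarize(x):
    return BinarySign.apply(x)


# Example: binarize a weight tensor and an activation map.
w = torch.randn(64, 64, 3, 3, requires_grad=True)
a = torch.randn(1, 64, 32, 32, requires_grad=True)
w_bin, a_bin = binarize(w), binarize(a)  # values are now only -1 / +1
```

With both weights and activations restricted to ±1, the convolution can be realized with XNOR and popcount operations, which is why such work accounts for binary operations (BOPs) separately from the remaining floating-point operations (FLOPs).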
Abstract (English): In order to achieve better accuracy, modern models have become increasingly large, leading to an exponential increase in computational load and making them challenging to apply to edge computing. Binary Neural Networks (BNNs) are models that quantize the filter weights and activations to 1 bit. These models are highly suitable for small chips such as ARM and FPGA and for other edge computing devices. To design a model that is friendlier to edge computing devices, it is crucial to reduce the floating-point operations (FLOPs). Batch normalization (BN) is an essential tool for binary neural networks; however, when the convolution layers are quantized to 1 bit, the floating-point computation cost of the BN layers becomes significant. This thesis reduces the floating-point operations by removing the BN layers from the model, introduces the Scaled Weight Standardization Convolution (WS-Conv) method to avoid the significant accuracy drop caused by the absence of BN layers, and enhances model performance through a series of optimizations. Specifically, our model maintains competitive computational cost and accuracy even without BN layers. Furthermore, by incorporating a series of training methods, the model's accuracy on CIFAR-100 is 0.6% higher than the baseline, while the total computational load is only 46% of the baseline. With unchanged BOPs, the FLOPs are reduced to nearly zero, making it more suitable for embedded platforms such as FPGA.
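To make the WS-Conv idea concrete, below is a minimal PyTorch sketch of a Scaled Weight Standardization convolution in the spirit of Brock et al. [13][21]: weights are standardized per output channel, then rescaled by a fan-in factor, a learnable gain, and a nonlinearity-dependent constant, so the layer can stand in for a Conv+BN pair in a normalizer-free network. The class name, the gamma default, and the epsilon value are illustrative assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class WSConv2d(nn.Conv2d):
    """Scaled Weight Standardization convolution (illustrative sketch).

    Weights are standardized per output channel (zero mean, unit variance),
    then scaled by 1/sqrt(fan_in), a learnable gain, and a constant gamma,
    so the layer can replace a Conv+BN pair in a normalizer-free network.
    """

    def __init__(self, in_channels, out_channels, kernel_size,
                 gamma=1.0, eps=1e-4, **kwargs):
        super().__init__(in_channels, out_channels, kernel_size, **kwargs)
        self.gain = nn.Parameter(torch.ones(out_channels, 1, 1, 1))
        self.gamma = gamma  # nonlinearity-dependent, e.g. ~1.7139 for ReLU
        self.eps = eps

    def standardized_weight(self):
        w = self.weight
        fan_in = w[0].numel()                        # (in_channels/groups) * kH * kW
        mean = w.mean(dim=(1, 2, 3), keepdim=True)   # statistics per output channel
        var = w.var(dim=(1, 2, 3), keepdim=True)
        w_hat = (w - mean) / torch.sqrt(var * fan_in + self.eps)
        return self.gamma * self.gain * w_hat

    def forward(self, x):
        return F.conv2d(x, self.standardized_weight(), self.bias,
                        self.stride, self.padding, self.dilation, self.groups)


# Example: a drop-in replacement for a 3x3 Conv+BN block.
conv = WSConv2d(64, 128, 3, padding=1)
y = conv(torch.randn(1, 64, 32, 32))   # -> shape (1, 128, 32, 32)
```

In a normalizer-free design of this kind, the normalization acts on the weights (which can be standardized once for deployment) rather than on the activations (which BN must normalize at every inference), which is consistent with the abstract's claim that FLOPs drop to nearly zero while BOPs remain unchanged.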
Keywords (Chinese) ★ Artificial Intelligence (人工智慧)
★ Model Recognition (模型辨識)
★ Edge Computing (邊緣計算)
★ Deep Learning (深度學習)
★ Binary Neural Networks (二進制神經網路)
★ Image Recognition (影像辨識)
★ Model Compression (模型壓縮)
★ Network Quantization (網路量化)
Keywords (English) ★ Artificial Intelligence
★ Model Recognition
★ Edge Computing
★ Deep Learning
★ Binary Neural Networks
★ Image Recognition
★ Model Compression
★ Network Quantization
Table of Contents
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
1. Introduction
1-1 Research Motivation
1-2 Objectives
2. Literature Review
2-1 Lightweight Convolution Methods
2-2 Binary Neural Network (BNN) Quantized Models
2-2-1 The BiRealNet Model
2-2-2 The ReActNet Model
2-2-3 The FracBNN Model
3. Research Method
3-1 Model Architecture
3-2 Scaled Weight Standardization Convolution
3-3 Adaptive Gradient Clipping
3-4 Knowledge Distillation
4. Experimental Results
4-1 Experimental Environment
4-2 Datasets
4-3 Training Procedure
4-4 Optimizer Selection
4-5 Testing WSFracBNN on CIFAR-100
4-6 Runtime Efficiency Analysis
5. Analysis and Discussion of Experimental Results
6. Conclusion
6-1 Contributions
6-2 Future Work
References
References [1] Liu, Z., Wu, B., Luo, W., Yang, X., Liu, W., & Cheng, K. T. (2018). Bi-Real Net: Enhancing the performance of 1-bit CNNs with improved representational capability and advanced training algorithm. In Proceedings of the European Conference on Computer Vision (ECCV) (pp. 722-737).
[2] Liu, Z., Shen, Z., Savvides, M., & Cheng, K. T. (2020). ReActNet: Towards precise binary neural network with generalized activation functions. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XIV (pp. 143-159). Springer International Publishing.
[3] Zhang, Y., Pan, J., Liu, X., Chen, H., Chen, D., & Zhang, Z. (2021, February). FracBNN: Accurate and FPGA-efficient binary neural networks with fractional activations. In The 2021 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (pp. 171-182).
[4] Iandola, F. N., Han, S., Moskewicz, M. W., Ashraf, K., Dally, W. J., & Keutzer, K. (2016). SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360.
[5] Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., ... & Adam, H. (2017). MobileNets: Efficient convolutional neural networks for mobile vision applications. arXiv preprint arXiv:1704.04861.
[6] Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 6848-6856).
[7] Chollet, F. (2017). Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1251-1258).
[8] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L. C. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4510-4520).
[9] He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 770-778).
[10] Krizhevsky, A., & Hinton, G. (2010). Convolutional deep belief networks on CIFAR-10. Unpublished manuscript, 40(7), 1-9.
[11] Ioffe, S., & Szegedy, C. (2015, June). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning (pp. 448-456). PMLR.
[12] Liu, Z., Shen, Z., Li, S., Helwegen, K., Huang, D., & Cheng, K. T. (2021, July). How do Adam and training strategies help BNNs optimization? In International Conference on Machine Learning (pp. 6936-6946). PMLR.
[13] Brock, A., De, S., Smith, S. L., & Simonyan, K. (2021, July). High-performance large-scale image recognition without normalization. In International Conference on Machine Learning (pp. 1059-1071). PMLR.
[14] Tu, Z., Chen, X., Ren, P., & Wang, Y. (2022, October). AdaBin: Improving binary neural networks with adaptive binary sets. In European Conference on Computer Vision (pp. 379-395). Cham: Springer Nature Switzerland.
[15] Rastegari, M., Ordonez, V., Redmon, J., & Farhadi, A. (2016, September). XNOR-Net: ImageNet classification using binary convolutional neural networks. In European Conference on Computer Vision (pp. 525-542). Cham: Springer International Publishing.
[16] Summers, C., & Dinneen, M. J. (2019). Four things everyone should know to improve batch normalization. arXiv preprint arXiv:1906.03548.
[17] Martinez, B., Yang, J., Bulat, A., & Tzimiropoulos, G. (2020). Training binary neural networks with real-to-binary convolutions. arXiv preprint arXiv:2003.11535.
[18] Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the knowledge in a neural network. arXiv preprint arXiv:1503.02531.
[19] Zhuang, B., Shen, C., Tan, M., Liu, L., & Reid, I. (2018). Towards effective low-bitwidth convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7920-7928).
[20] Merity, S., Keskar, N. S., & Socher, R. (2017). Regularizing and optimizing LSTM language models. arXiv preprint arXiv:1708.02182.
[21] Brock, A., De, S., & Smith, S. L. (2021). Characterizing signal propagation to close the performance gap in unnormalized ResNets. arXiv preprint arXiv:2101.08692.
[22] Deng, J., Dong, W., Socher, R., Li, L. J., Li, K., & Fei-Fei, L. (2009, June). ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition (pp. 248-255). IEEE.
[23] Krizhevsky, A. (2009). Learning multiple layers of features from tiny images. CIFAR-10 and CIFAR-100 datasets. Retrieved from https://www.cs.toronto.edu/~kriz/cifar.html
Advisors: Kuo-Chin Fan, Chih-Lung Lin (范國清、林志隆)    Approval Date: 2024-7-2
