Abstract
The COVID-19 pandemic, which erupted in late 2019, prompted a widespread shift from manual checks to facial recognition technology at entrances and exits. However, as masks became part of daily life, facial occlusion frequently caused recognition errors, undermining the effectiveness of facial recognition systems. In response to these challenges, this study focuses on facial occlusion in the context of rapid facial recognition, which necessitates model compression. It evaluates the impact of compression strategies, specifically pruning, on accuracy and execution efficiency under these conditions.
Model compression is crucial for running efficiently on constrained devices such as smartphones and embedded systems, where computational power and storage are limited. The three main compression methods are pruning, quantization, and knowledge distillation; this paper focuses specifically on pruning.
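To make the pruning idea concrete, the following is a minimal NumPy sketch of magnitude-based unstructured pruning — zeroing the smallest-magnitude fraction of a weight tensor. This is an illustrative example of the general technique, not the thesis implementation; the function name and parameters are invented for this sketch.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # k-th smallest magnitude becomes the removal threshold.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
pruned = magnitude_prune(w, 0.9)
print(np.mean(pruned == 0))  # close to 0.9 of the weights are zeroed
```

In practice the surviving weights are stored in a sparse format (or the zeroed filters are physically removed, in structured pruning) to realize the size and speed gains reported below.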
Using ResNet50, VGG16, and MobileNet models and a VGGFace2 database containing 30,000 images, the study runs tests on 100 random cases, each comprising 15 images, for 1,500 balanced samples in total. The goal is to maintain accuracy above 90% while reducing model size across the three networks. The experiment comprises three main parts:
The first part validates the effectiveness of sequential (iterative) pruning compared with single-shot pruning. Sequential pruning, particularly unstructured pruning, proves more effective at reducing model size without significantly compromising accuracy. An analysis of eigenface feature maps yields pruning and accuracy improvements for ResNet50 and VGG16, both reaching 99% accuracy, while MobileNet loses some weight information and reaches 80%.
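The sequential scheme above can be sketched as a loop that ramps the sparsity up over several rounds, with a fine-tuning step between rounds, instead of removing everything at once. This is a toy NumPy sketch under assumed names (`iterative_prune`, the `finetune` callback), not the thesis code; a real fine-tuning step would retrain the network on data.

```python
import numpy as np

def iterative_prune(weights, target_sparsity, rounds, finetune):
    """Reach the target sparsity gradually: prune a little each round,
    then 'fine-tune' the survivors, rather than pruning in one shot."""
    w = weights.copy()
    for r in range(1, rounds + 1):
        sparsity = target_sparsity * r / rounds            # ramp up per round
        k = int(sparsity * w.size)
        threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
        mask = np.abs(w) > threshold
        w = finetune(w * mask) * mask                      # keep pruned weights at zero
    return w

rng = np.random.default_rng(1)
w0 = rng.normal(size=(32, 32))
# Stand-in "fine-tuning": a small rescale of the surviving weights.
w_final = iterative_prune(w0, target_sparsity=0.8, rounds=4, finetune=lambda w: w * 1.01)
print(round(float(np.mean(w_final == 0)), 2))  # ~0.8 of the weights end up zero
```

Because already-pruned weights have magnitude zero, each round's threshold naturally keeps them pruned while extending the cut to the next-smallest survivors.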
The second part analyzes in detail how each layer's accuracy decline varies with its pruning ratio (sensitivity). Pruning ratios are then set according to sensitivity, yielding accuracies of 93.93% for ResNet50 and 92.19% for VGG16, with model sizes of 414,428 KB and 14,142 KB and inference times of 3.87 ms and 1.9 ms, respectively.
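A per-layer sensitivity scan of this kind can be sketched as follows: for one layer, try a ladder of pruning ratios, measure the accuracy drop against the unpruned baseline, and keep the largest ratio whose drop stays within budget. The function name, the ratio ladder, and the toy `evaluate` proxy (weight energy retained, standing in for a real validation run) are all assumptions of this sketch, not the thesis procedure.

```python
import numpy as np

def pick_ratio_by_sensitivity(layer_weights, evaluate, baseline_acc,
                              ratios=(0.1, 0.3, 0.5, 0.7, 0.9),
                              max_drop=0.05):
    """Return the largest pruning ratio for this layer whose accuracy
    drop (vs. the unpruned baseline) stays within `max_drop`."""
    best = 0.0
    for r in ratios:
        k = int(r * layer_weights.size)
        if k == 0:
            continue
        threshold = np.partition(np.abs(layer_weights).ravel(), k - 1)[k - 1]
        pruned = layer_weights * (np.abs(layer_weights) > threshold)
        if baseline_acc - evaluate(pruned) <= max_drop:
            best = r
    return best

# Deterministic toy layer: weights 1..100, so the energy arithmetic is exact.
w = np.arange(1.0, 101.0).reshape(10, 10)
total_energy = float(np.sum(w * w))
# Toy accuracy proxy: baseline accuracy scaled by the weight energy retained.
evaluate = lambda p: 0.95 * float(np.sum(p * p)) / total_energy
best_ratio = pick_ratio_by_sensitivity(w, evaluate, baseline_acc=0.95)
print(best_ratio)  # 0.3 — pruning half the layer would cost too much "accuracy"
```

Running this scan per layer gives a sensitivity profile; insensitive layers get aggressive ratios, sensitive ones are pruned lightly.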
The third part addresses masked facial recognition, introducing a test dataset of simulated mask images and partial facial features. An attention mechanism is incorporated to further improve accuracy over multiple pruning cycles for VGG16 and ResNet50: accuracy rises from 85% and 84% to 95.13% and 96.81%, and then to 95.37% and 96.89%, respectively. The resulting model sizes are 444,001 KB and 16,225 KB, with inference times of 1.51 ms and 1.28 ms.
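As one concrete form such an attention mechanism can take, the following is a squeeze-and-excitation-style channel attention sketch in NumPy: pool each channel to a descriptor, pass it through a small two-layer gate, and rescale the channels by the resulting weights so informative (unoccluded) features are emphasized. This illustrates the general channel-attention idea only; it is not the specific module used in the thesis, and all names and shapes here are invented.

```python
import numpy as np

def channel_attention(feature_map, w1, w2):
    """Squeeze-and-excitation style gate over a (C, H, W) feature map."""
    squeeze = feature_map.mean(axis=(1, 2))          # (C,) per-channel descriptor
    hidden = np.maximum(0.0, w1 @ squeeze)           # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # sigmoid weights in (0, 1)
    return feature_map * gate[:, None, None], gate

rng = np.random.default_rng(3)
fmap = rng.normal(size=(8, 4, 4))                    # 8 channels, 4x4 spatial
w1 = rng.normal(size=(2, 8)) * 0.5                   # channel reduction 8 -> 2
w2 = rng.normal(size=(8, 2)) * 0.5                   # expansion 2 -> 8
out, gate = channel_attention(fmap, w1, w2)
print(out.shape, gate.shape)                         # (8, 4, 4) (8,)
```

In a mask scenario, the gate can learn to down-weight channels dominated by the occluded lower face and up-weight those encoding the eyes and forehead.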
In conclusion, this paper validates the effectiveness of multiple pruning cycles over a single cycle without sensitivity settings, improves model efficiency by determining pruning ratios from fixed sensitivity, and introduces an attention mechanism that raises accuracy, especially in facial recognition scenarios with masks. The trained models effectively reduce model size, speed up inference, and significantly improve overall accuracy.
References
1. LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). "Gradient-Based Learning Applied to Document Recognition," Proceedings of the IEEE, 86(11), 2278-2324.
2. Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). "ImageNet Classification with Deep Convolutional Neural Networks," Advances in Neural Information Processing Systems, 25, 1097-1105.
3. Simonyan, K., & Zisserman, A. (2014). "Very Deep Convolutional Networks for Large-Scale Image Recognition," arXiv preprint arXiv:1409.1556.
4. He, K., Zhang, X., Ren, S., & Sun, J. (2016). "Deep Residual Learning for Image Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770-778.
5. Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., & Adam, H. (2017). "MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications," arXiv preprint arXiv:1704.04861.
6. LeCun, Y., Denker, J. S., & Solla, S. A. (1990). "Optimal Brain Damage," Advances in Neural Information Processing Systems, 2, 598-605.
7. Hassibi, B., & Stork, D. G. (1993). "Second order derivatives for network pruning: Optimal Brain Surgeon," Advances in Neural Information Processing Systems, 5, 164-171.
8. Han, S., Mao, H., & Dally, W. J. (2016). "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding," International Conference on Learning Representations (ICLR).
9. Molchanov, P., Tyree, S., Karras, T., Aila, T., & Kautz, J. (2017). "Pruning Convolutional Neural Networks for Resource Efficient Inference," International Conference on Learning Representations (ICLR).
10. He, Y., Zhang, X., & Sun, J. (2017). "Channel Pruning for Accelerating Very Deep Neural Networks," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 1389-1397.
11. Liu, Z., Li, J., Shen, Z., Huang, G., Yan, S., & Zhang, C. (2017). "Learning Efficient Convolutional Networks Through Network Slimming," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2736-2744.
12. Chin, T.-W., Ding, R., Zhang, C., & Marculescu, D. (2020). "Towards Efficient Model Compression via Learned Global Ranking," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
13. He, Y., Liu, P., Wang, Z., Hu, Z., & Yang, Y. (2019). "Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 4340-4349.
14. 黃慶昀 (2020). "Model Pruning Evaluation for Fine-Grained Image Classification with Convolutional Neural Networks" [Master's thesis, National Taipei University]. National Digital Library of Theses and Dissertations in Taiwan.
15. Li, H., Kadav, A., Durdanovic, I., Samet, H., & Graf, H. P. (2017). "Pruning Filters for Efficient ConvNets," arXiv preprint arXiv:1608.08710.
16. Luo, J.-H., Wu, J., & Lin, W. (2017). "ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression," Proceedings of the IEEE International Conference on Computer Vision (ICCV), 5058-5066.