    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93607


    Title: Fast Facial Recognition System using Machine Learning (使用機器學習的快速人臉辨識系統)
    Author: Chih, Yu-Fei (池宇非)
    Contributor: Department of Electrical Engineering (電機工程學系)
    Keywords: Model Compression; Model Pruning; Vgg16; Resnet50; Deep Learning
    Date: 2024-01-24
    Upload Date: 2024-03-05 17:55:26 (UTC+8)
    Publisher: National Central University (國立中央大學)
    Abstract:
    The COVID-19 pandemic, which broke out in October 2019, prompted a widespread shift to facial recognition at entrances and exits, replacing manual checks. However, once masks became part of daily life, facial occlusion frequently caused recognition errors and undermined the effectiveness of these systems. In response, this study targets facial recognition under occlusion, where the demand for fast recognition makes model compression necessary, and evaluates how compression strategies, specifically pruning, affect accuracy and execution efficiency under such conditions.

    To run models efficiently on constrained devices such as smartphones and embedded systems, where computational power and storage are limited, model compression is essential. The three main compression methods are pruning, quantization, and distillation; this paper focuses on pruning.
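    As a concrete reference point for the pruning technique studied here, the sketch below uses PyTorch's torch.nn.utils.prune utilities on one convolution of a Resnet50 backbone; the chosen layer and the 30% ratio are illustrative assumptions, not the settings used in the thesis.

```python
# Minimal, hypothetical sketch of magnitude-based (L1) pruning; layer and
# ratio are illustrative only, not the thesis configuration.
import torch.nn.utils.prune as prune
from torchvision.models import resnet50

model = resnet50(weights=None)      # untrained backbone, for illustration
conv = model.layer1[0].conv1        # one convolution inside the first stage

# Unstructured pruning: zero out the 30% smallest-magnitude weights.
prune.l1_unstructured(conv, name="weight", amount=0.3)

# Fold the pruning mask back into the weight tensor so the layer can be
# saved and deployed without the pruning re-parametrization.
prune.remove(conv, "weight")
```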

    Using the Resnet50, Vgg16, and Mobilenet models with 30,000 images from the VggFace2 database, the study measures accuracy on 100 randomly selected identities, 15 images each, for a total of 1,500 balanced test samples. The goal is to obtain better accuracy (maintained above 90%) while reducing model size across the three networks. The experiment comprises three main parts:

    The first part compares sequential (iterative) pruning with one-shot pruning. Sequential pruning, applied here as unstructured pruning that does not disturb the remaining weights, proves more effective at reducing model size without significantly compromising accuracy; together with an analysis of the eigenface feature maps, it yields pruning and accuracy gains for Resnet50 and Vgg16, both reaching 99%, while Mobilenet loses some weight information and reaches 80%. This confirms that sequential pruning recovers more accuracy than one-shot pruning for typical and more complex models. A sketch of the two schemes follows.
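    The sketch below contrasts the two schemes compared in this part; it assumes a generic PyTorch model and a caller-supplied fine_tune routine (hypothetical), with illustrative pruning ratios.

```python
# Hypothetical contrast of one-shot vs. sequential (iterative) pruning.
import torch.nn as nn
import torch.nn.utils.prune as prune

def one_shot_prune(model: nn.Module, amount: float = 0.5) -> None:
    """Remove `amount` of each Conv2d's weights in a single pass."""
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            prune.l1_unstructured(m, name="weight", amount=amount)

def sequential_prune(model: nn.Module, fine_tune, steps: int = 5,
                     amount_per_step: float = 0.13) -> None:
    """Prune a small fraction per round and fine-tune in between.

    Five rounds of ~13% of the remaining weights remove roughly half of
    the weights overall, but accuracy can recover after every round.
    """
    for _ in range(steps):
        for m in model.modules():
            if isinstance(m, nn.Conv2d):
                prune.l1_unstructured(m, name="weight", amount=amount_per_step)
        fine_tune(model)
```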

    The second part analyzes, layer by layer, how the accuracy drop relates to the pruning-ratio difference (sensitivity) and uses this sensitivity to set structured pruning ratios. At a sensitivity of -0.07, the corresponding ratios of 17.57% for Resnet50 and 21.54% for Vgg16 give accuracy rates of 93.93% and 92.19%, model sizes of 414,428 KB and 14,142 KB, and inference times of 3.87 ms and 1.9 ms, respectively.
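    The sensitivity scan itself is not specified in the abstract beyond the -0.07 threshold, so the sketch below is an assumed reading: prune one layer at a time at increasing structured ratios, measure the accuracy change per unit of pruning, and keep the largest ratio whose slope stays above the threshold. The `evaluate` accuracy function is hypothetical.

```python
# Hypothetical per-layer sensitivity scan; the exact definition used in the
# thesis may differ. `evaluate(model) -> accuracy in [0, 1]` is assumed.
import copy
import torch.nn as nn
import torch.nn.utils.prune as prune

def sensitivity_scan(model: nn.Module, evaluate,
                     ratios=(0.1, 0.2, 0.3, 0.4, 0.5),
                     threshold: float = -0.07) -> dict:
    base_acc = evaluate(model)
    chosen = {}
    for name, module in model.named_modules():
        if not isinstance(module, nn.Conv2d):
            continue
        best = 0.0
        for r in ratios:
            trial = copy.deepcopy(model)                 # probe a copy only
            layer = dict(trial.named_modules())[name]
            # Structured pruning: remove whole output channels by L2 norm.
            prune.ln_structured(layer, name="weight", amount=r, n=2, dim=0)
            slope = (evaluate(trial) - base_acc) / r     # accuracy lost per unit pruned
            if slope >= threshold:                       # e.g. no worse than -0.07
                best = max(best, r)
        chosen[name] = best
    return chosen   # per-layer ratios for the final structured pruning pass
```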

    The third part addresses masked facial recognition, using test sets of simulated mask images and partial facial (eye-region) features. With the sensitivity fixed, an attention-based mechanism is added to further improve accuracy under multiple pruning cycles for Vgg16 and Resnet50: accuracy rises from 85% and 84% to 95.13% and 96.81%, and to 95.37% and 96.89%, respectively. The resulting model sizes are 444,001 KB and 16,225 KB, with inference times of 1.51 ms and 1.28 ms.
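    The abstract does not name the exact attention design, so the sketch below uses a squeeze-and-excitation style channel-attention block as an illustrative stand-in; in practice such a block would be inserted after selected convolutional stages of the pruned Vgg16/Resnet50 backbones before fine-tuning on the masked-face test conditions.

```python
# Hypothetical attention block (squeeze-and-excitation style); the thesis'
# actual attention mechanism may differ.
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        # Re-weight channels so features from unoccluded regions (e.g. the
        # eye area) can dominate the embedding when the mouth/nose is masked.
        return x * w
```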

    In conclusion, this paper first validates the effectiveness of multiple pruning cycles against a single cycle without a preset sensitivity, then improves model efficiency by determining pruning ratios from a fixed sensitivity, and finally adds an attention mechanism to improve recognition of masked faces. The trained models effectively reduce model size, speed up inference, and also improve accuracy.
    Appears in Collections: [Graduate Institute of Electrical Engineering] Master's and Doctoral Theses


