dc.description.abstract | Abstract
The COVID-19 pandemic, which emerged at the end of 2019, prompted a widespread shift from manual identity checks to facial recognition at entrances and exits. However, as masks became part of daily life, facial occlusion frequently caused recognition errors and undermined the effectiveness of facial recognition systems. In response to these challenges, this study focuses on facial occlusion in the context of fast facial recognition, where model compression is required, and evaluates how one compression strategy, pruning, affects accuracy and execution efficiency under these conditions.
For models to run efficiently on constrained devices such as smartphones and embedded systems, where computational power and storage are limited, model compression is crucial. The three main compression methods are pruning, quantization, and knowledge distillation; this paper focuses specifically on pruning.
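As an illustration only (the abstract does not name a framework), the following minimal sketch shows magnitude-based unstructured pruning using PyTorch's torch.nn.utils.prune; the 30% ratio and the choice of ResNet50 are assumptions for demonstration.

    # Minimal sketch: zero out the 30% of weights with the smallest L1
    # magnitude in every convolutional and fully connected layer.
    # Framework (PyTorch) and the 0.3 ratio are illustrative assumptions.
    import torch.nn as nn
    import torch.nn.utils.prune as prune
    from torchvision.models import resnet50

    model = resnet50()

    for module in model.modules():
        if isinstance(module, (nn.Conv2d, nn.Linear)):
            prune.l1_unstructured(module, name="weight", amount=0.3)
            prune.remove(module, "weight")  # make the pruning permanent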
Using ResNet50, VGG16, and MobileNet models and a VGGFace2 subset of 30,000 images, the study runs tests on 100 random cases of 15 images each, for a total of 1,500 balanced samples. The goal is to keep accuracy above 90% while reducing model size for all three networks. The experiment comprises three main parts:
The first part validates the effectiveness of sequential (multi-cycle) pruning compared with one-shot pruning. Sequential pruning, particularly unstructured pruning, proves more effective at reducing model size without significantly compromising accuracy. An analysis of eigenface feature maps guides the pruning and improves accuracy for ResNet50 and VGG16, which both reach 99%, whereas MobileNet loses some weight information and reaches only 80%. A sketch of the sequential-pruning loop follows.
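The sketch below contrasts the sequential idea with one-shot pruning: remove a small fraction of weights per round and fine-tune between rounds rather than pruning everything at once. The round count, per-round ratio, and the fine_tune/evaluate helpers are placeholders, not values from the thesis.

    # Hypothetical sequential (iterative) pruning loop, assuming PyTorch.
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    def prune_round(model, amount):
        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                prune.l1_unstructured(m, name="weight", amount=amount)

    def sequential_prune(model, rounds=5, amount_per_round=0.1,
                         fine_tune=None, evaluate=None):
        history = []
        for _ in range(rounds):
            prune_round(model, amount_per_round)   # prune a little ...
            if fine_tune is not None:
                fine_tune(model)                   # ... then recover accuracy
            if evaluate is not None:
                history.append(evaluate(model))
        return history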
The second part analyzes, layer by layer, how accuracy declines as the pruning ratio changes (sensitivity). Per-layer pruning ratios are then set according to this sensitivity, yielding accuracies of 93.93% for ResNet50 and 92.19% for VGG16, with model sizes of 414,428 KB and 14,142 KB and inference times of 3.87 ms and 1.9 ms, respectively.
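A rough sketch of such a layer-wise sensitivity scan is shown below: each layer is pruned in isolation over a range of ratios, and the largest ratio whose accuracy drop stays within a tolerance is kept for that layer. The evaluate() helper, the ratio grid, and the 1% tolerance are assumptions for illustration, not the thesis settings.

    # Hypothetical per-layer sensitivity scan, assuming PyTorch.
    import copy
    import torch.nn as nn
    import torch.nn.utils.prune as prune

    def layer_sensitivity(model, evaluate, ratios=(0.1, 0.3, 0.5, 0.7),
                          max_drop=0.01):
        baseline = evaluate(model)
        chosen = {}
        for layer_name, module in model.named_modules():
            if not isinstance(module, (nn.Conv2d, nn.Linear)):
                continue
            best = 0.0
            for r in ratios:
                trial = copy.deepcopy(model)            # prune a copy only
                m = dict(trial.named_modules())[layer_name]
                prune.l1_unstructured(m, name="weight", amount=r)
                if baseline - evaluate(trial) <= max_drop:
                    best = r
            chosen[layer_name] = best                   # per-layer pruning ratio
        return chosen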
The third part addresses masked facial recognition, introducing a simulated-mask image set and a partial-facial-feature test set. An attention-based mechanism is incorporated to further improve accuracy over multiple pruning cycles for VGG16 and ResNet50: accuracy rates increase from 85% and 84% to 95.13% and 96.81%, and to 95.37% and 96.89%, respectively. Model sizes are 444,001 KB and 16,225 KB, with inference times of 1.51 ms and 1.28 ms.
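The abstract does not specify the form of the attention mechanism; a squeeze-and-excitation style channel attention block, sketched below, is one common choice and is shown purely as an illustrative stand-in for the module added to the pruned networks.

    # Hypothetical channel-attention block (SE-style), assuming PyTorch.
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels),
                nn.Sigmoid(),
            )

        def forward(self, x):
            b, c, _, _ = x.shape
            w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
            return x * w  # reweight channels, e.g. to emphasize unoccluded regions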
In conclusion, this paper validates the effectiveness of multiple pruning cycles compared with a single cycle without sensitivity settings, improves model efficiency by determining per-layer pruning ratios from a fixed sensitivity threshold, and introduces an attention mechanism that improves accuracy, especially for masked facial recognition. The resulting models are smaller, faster at inference, and markedly more accurate. | en_US |