dc.description.abstract | As technology has developed, artificial intelligence algorithms have continued to evolve. Various artificial intelligence methods have been proposed since the 1950s, followed by the rise of machine learning algorithms in the 1980s. Algorithms such as decision forests, support vector machines, and neural networks have been proposed and further improved to enhance their performance. In the past decade, with the rapid growth of deep learning and the use of GPUs or other accelerator hardware, deep neural networks have achieved significant improvements in various tasks.
A practical deep learning face recognition system can be divided into four main tasks: face detection, face alignment, feature extraction, and feature matching. The pipeline can be time-consuming if each task is executed separately with the original image as input. With an optimized deep neural network, the face detection and face alignment tasks can be integrated into a single detection network that localizes faces using a feature pyramid combined with anchor boxes and aligns the face position by training a regression layer of the network. The feature extraction and feature matching tasks can then be combined by using convolutional layers to extract face features and a fully connected layer with a softmax function to match the person's identity.
In this paper, we propose a multi-task training method based on a feature pyramid and triplet loss to train a single-stage deep neural network for face detection and face recognition. The data for every task passes through the same backbone network, which avoids duplicate computation by sharing weights and computations. The whole network uses a feature pyramid and anchor boxes to localize the face position, uses triplet loss to train the feature extractor, and finally matches features through a simple mathematical function. On an NVIDIA 2080 Ti GPU, the system achieves 212 FPS for 640x640 resolution input. | en_US |
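The abstract names triplet loss and a "simple math function" for matching without giving formulas. The sketch below illustrates one common reading, and every detail in it is an assumption: a squared Euclidean distance inside the triplet loss, a margin of 0.2, cosine similarity as the matching function, and a 0.5 acceptance threshold. The function names and the toy gallery are hypothetical and are not taken from the work itself.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: pull anchor toward positive, push it from negative.

    The margin value is an illustrative assumption.
    """
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


def cosine_similarity(a, b):
    """Cosine similarity, used here as an assumed matching function."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def match_identity(query, gallery, threshold=0.5):
    """Return the best-matching gallery name, or None if below the threshold."""
    best_name, best_score = None, -1.0
    for name, embedding in gallery.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None


if __name__ == "__main__":
    # Toy 2-D embeddings; real face embeddings have far more dimensions.
    gallery = {"alice": [0.9, 0.1], "bob": [0.0, 1.0]}
    print(triplet_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]))
    print(match_identity([1.0, 0.0], gallery))
```

With a matching function of this kind, recognition at inference time reduces to a nearest-neighbor lookup over stored embeddings, so new identities can be added to the gallery without retraining the network.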