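Abstract: | Abstract (translated from the Chinese)
Face detection is a key step in many face-related applications, such as face alignment, face verification, face recognition, and crowd behavior analysis. However, small face size, occlusion, lighting, pose deformation, expression, and other adverse factors are common in real-world images and make face detection very challenging. In addition, computation cost is a major obstacle for face detection in real-time applications.
Traditional methods detect face locations with hand-designed operations applied over sliding windows, which requires more computation and hurts accuracy, especially when detecting small faces. Recently, generic object detection methods based on deep convolutional neural networks (CNNs) have achieved great success. Modern object detectors include one-stage methods (e.g., YOLO, SSD) and two-stage methods (e.g., Faster RCNN, RFCN). One-stage methods broadly use a single feed-forward fully convolutional neural network to directly predict the class and corresponding bounding box of each proposal, unlike two-stage methods, which need a separate classification and bounding-box refinement for each proposal. As a result, one-stage methods have lower computation cost, while two-stage methods usually achieve higher accuracy.
In this research, I deploy RetinaNet for face detection, addressing both the small-face problem and computation cost; in particular, it improves on both one-stage and two-stage methods. ;Abstract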
Face detection is a critical step for many face-related applications, such as face alignment, face verification, face identification, and crowd behavior analysis. However, small size, occlusion, illumination, pose deformation, expression, and other adverse factors often appear in real-world images and pose great challenges to face detection. In addition, computation cost is a major challenge for face detection in real-time applications.
Traditional approaches use hand-crafted operations with sliding windows to scan for face locations, which costs considerable computation and hurts accuracy, especially for small faces. Recently, generic object detection based on deep convolutional neural networks (CNNs) has achieved great success. Modern object detectors include one-stage methods (e.g., YOLO, SSD) and two-stage methods (e.g., Faster RCNN, RFCN). One-stage methods refer broadly to architectures that use a single feed-forward fully convolutional neural network to directly predict each proposal's class and corresponding bounding box, without requiring a second-stage per-proposal classification and box refinement. Therefore, one-stage methods have lower computation cost, whereas two-stage methods usually achieve higher accuracy.
In this research, I deployed RetinaNet for face detection. It addresses the small-face problem as well as computation cost; in particular, it combines the benefits of both one-stage and two-stage methods. |