Face Detection with RetinaNet

Name

柯金泉 (Kha Kim Tuyen)

Department

Department of Computer Science and Information Engineering (In-Service Master's Program)

Thesis title

Face Detection with RetinaNet

Full-text access

Never open to the public

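Abstract (translated from Chinese)

Face detection is a key step in many face-related applications, such as face alignment, face verification, face identification, and crowd behavior analysis. However, small face sizes, occlusion, lighting, pose deformation, expression, and other adverse factors appear frequently in real-world images and pose major challenges for face detection. Computational cost is a further obstacle to using face detection in real-time applications.

Traditional methods use hand-designed operations with sliding windows to locate faces, which requires more computation and hurts accuracy, especially for small faces. Recently, generic object detection methods based on deep convolutional neural networks (CNNs) have achieved great success. Modern object detectors comprise one-stage methods (e.g., YOLO, SSD) and two-stage methods (e.g., Faster R-CNN, R-FCN). One-stage methods generally use a single feed-forward fully convolutional network to directly predict the class and corresponding bounding box of each proposal, unlike two-stage methods, which need a separate classification and box-refinement step for each proposal. One-stage methods therefore have lower computational cost, while two-stage methods usually achieve higher accuracy.

In this study, I release a RetinaNet for face detection that addresses both the small-face problem and the computational cost; in particular, it improves on both one-stage and two-stage methods.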

Abstract (English)

Face detection is a critical step for many face-related applications, such as face alignment, face verification, face identification, and crowd behavior analysis. However, small face size, occlusion, illumination, pose deformation, expression, and other adverse factors often appear in real-world images and make face detection challenging. Computational cost is also a major challenge for face detection in real-time applications.

Traditional approaches use hand-crafted operations with sliding windows to scan for and locate faces, which costs considerable computation and hurts accuracy, especially for small faces. Recently, generic object detection based on deep convolutional neural networks (CNNs) has achieved great success. Modern object detectors include one-stage methods (e.g., YOLO, SSD) and two-stage methods (e.g., Faster R-CNN, R-FCN). One-stage methods refer broadly to architectures that use a single feed-forward fully convolutional neural network to directly predict each proposal's class and corresponding bounding box, without requiring a second-stage per-proposal classification and box refinement. One-stage methods therefore win on computational cost, whereas two-stage methods win on accuracy.

In this research, I deployed RetinaNet for face detection. It addresses the small-face problem as well as the computational cost; in particular, it combines the benefits of both one-stage and two-stage methods.
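As an illustration of the one-stage idea described in the abstract, the sketch below turns dense per-anchor outputs (one face score and one box offset per anchor) into detections in a single pass, with no per-proposal second stage. It is not taken from the thesis: the function name `decode_one_stage`, the `[dx, dy, dw, dh]` offset parameterization, and the 0.5 score threshold are assumptions chosen for illustration, and non-maximum suppression is left out for brevity.

```python
import numpy as np


def decode_one_stage(anchors, scores, deltas, score_threshold=0.5):
    """Decode dense one-stage outputs into face boxes (illustrative sketch).

    anchors: (N, 4) reference boxes as [x1, y1, x2, y2].
    scores:  (N,) face probability predicted for each anchor.
    deltas:  (N, 4) predicted offsets [dx, dy, dw, dh] (assumed parameterization).
    Returns the decoded boxes and scores whose score passes the threshold.
    """
    anchors = np.asarray(anchors, dtype=float)
    scores = np.asarray(scores, dtype=float)
    deltas = np.asarray(deltas, dtype=float)

    # Anchor sizes and centers.
    widths = anchors[:, 2] - anchors[:, 0]
    heights = anchors[:, 3] - anchors[:, 1]
    cx = anchors[:, 0] + 0.5 * widths
    cy = anchors[:, 1] + 0.5 * heights

    # Shift the center by a fraction of the anchor size; scale the size
    # through an exponential so it always stays positive.
    new_cx = cx + deltas[:, 0] * widths
    new_cy = cy + deltas[:, 1] * heights
    new_w = widths * np.exp(deltas[:, 2])
    new_h = heights * np.exp(deltas[:, 3])

    boxes = np.stack(
        [
            new_cx - 0.5 * new_w,
            new_cy - 0.5 * new_h,
            new_cx + 0.5 * new_w,
            new_cy + 0.5 * new_h,
        ],
        axis=1,
    )

    # Classification and box regression are read off together for every
    # anchor; only a score threshold is applied afterwards.
    keep = scores >= score_threshold
    return boxes[keep], scores[keep]
```

In a two-stage detector, by contrast, the surviving boxes would be passed to a second network for per-proposal classification and refinement, which is where the extra computation comes from.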

Keywords

★ Face Detection with RetinaNet

Table of contents

Chapter 1 Introduction
1.1 Motivation
1.2 Thesis overview
Chapter 2 Background
2.1 Face detection
2.1.1 Definition
2.1.2 Evaluation
2.1.2.1 Detection representation
2.1.2.2 Mean IoU
2.1.2.3 Detection evaluation
2.2 Related work
2.2.1 Hand-engineered approaches
2.2.2 Single-stage approaches
2.2.3 Two-stage approaches
2.3 Popular approaches
2.3.1 A Convolutional Neural Network Cascade for Face Detection
2.3.1.1 Introduction
2.3.1.2 Model's structure
2.3.2 MTCNN
2.3.2.1 Introduction
2.3.2.2 Model's structure
2.3.3 HyperFace
Chapter 3 Face-RetinaNet
3.1 Feature Pyramid Network
3.2 RetinaNet
3.2.1 Structure
3.2.2 Anchors
3.3 Proposed approach: Face-RetinaNet
3.4 Backbone structure
3.4.1 ResNet
3.4.1.1 Structure
3.4.1.2 Configuration
Chapter 4 Experiment and Results
4.1 Baseline model
4.2 Dataset benchmark
4.2.1 AFW dataset
4.2.1.1 Dataset introduction
4.2.1.2 Results
a) Comparison with baseline
b) Comparison on AFW benchmark
4.2.2 WiderFace dataset
4.2.2.1 Dataset introduction
4.2.2.2 Results
a) Comparison with baseline
b) Comparison on WiderFace benchmark
4.3 Runtime efficiency
Chapter 5 Conclusions and future work
Bibliography

Advisor

王家慶 (Jia-Ching Wang)

Approval date

2019-5-31
