Low-shot Face Recognition using Hierarchical Deep Convolutional Neural Networks

Name

Ming-Hsin Shen (沈明訢)

Department

Department of Computer Science and Information Engineering

Thesis title

Low-shot Face Recognition using Hierarchical Deep Convolutional Neural Networks
(以階層式深度卷積網路實現少樣本的人臉辨識系統)

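This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.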

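Abstract (Chinese, translated)

Machine learning has advanced rapidly in recent years, and face detection and face recognition are now widely used in practical, commercial, and entertainment systems such as access control, surveillance, identity-authentication login, and smart devices. However, both tasks are often affected by factors including varying illumination, varying facial expressions, face rotation, occlusion, and a small number of samples. We therefore use convolutional neural networks to overcome these problems and raise the recognition rate of the face recognition system.
This thesis has two parts. The first part uses the Faster Region Convolutional Neural Network (Faster R-CNN) to overcome lighting changes, blur and noise, and face rotation when samples are sufficient, and it detects and recognizes faces in real time. The second part uses a Siamese neural network to raise the recognition rate of classes with few samples when samples are insufficient. Facial features are learned through a hierarchical convolutional neural network architecture, trained on a face database we collected ourselves that covers a variety of light sources, angles, and sharpness levels.
In the experiments, we test on videos we recorded ourselves, which contain face images under different lighting changes and at different angles. For detection, depending on the parameter settings, the detection rate reaches 96.84% with a false positive rate of 0%. The Faster R-CNN designed in this work achieves a recognition rate of 99.65%. Its average speed is 12.76 frames per second on 1920×1080 video and 24.03 frames per second on 960×540 video. The Siamese network, which uses feature differences as the classification network for the low-shot classes, achieves an overall recognition rate of 98.17% and a recognition rate of 92.4% on the low-shot classes. This alleviates the insufficient-sample problem and yields a better classifier.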
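The two throughput figures reported in the abstract can be restated as per-frame latency, which makes the resolution trade-off easier to read. The following is only arithmetic on the reported numbers; the names `REPORTED_FPS`, `frame_latency_ms`, and `summarize` are illustrative and do not come from the thesis.

```python
# Average throughput reported in the abstract, in frames per second,
# keyed by (width, height) of the test video.
REPORTED_FPS = {
    (1920, 1080): 12.76,
    (960, 540): 24.03,
}


def frame_latency_ms(fps: float) -> float:
    """Average processing time for one frame, in milliseconds."""
    return 1000.0 / fps


def summarize(reported):
    """Build one summary row per resolution, in insertion order."""
    rows = []
    for (width, height), fps in reported.items():
        rows.append({
            "resolution": f"{width}x{height}",
            "pixels": width * height,
            "fps": fps,
            "latency_ms": frame_latency_ms(fps),
        })
    return rows


if __name__ == "__main__":
    rows = summarize(REPORTED_FPS)
    for row in rows:
        print(f"{row['resolution']}: {row['fps']:.2f} FPS, "
              f"{row['latency_ms']:.2f} ms per frame")
    full, quarter = rows
    # Compare the reduction in pixel count with the gain in speed.
    print(f"pixel ratio: {full['pixels'] / quarter['pixels']:.1f}x")
    print(f"speedup:     {quarter['fps'] / full['fps']:.2f}x")
```

Halving each image dimension cuts the pixel count by a factor of four, while the reported speed rises by less than a factor of two, which suggests that part of the per-frame cost does not scale with image size.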

Abstract (English)

In recent years, machine learning has driven rapid progress in face detection and face recognition, which are widely used in a variety of applications such as access control, monitoring, identity authentication, and smart devices. However, face detection and face recognition often encounter difficult conditions, such as varying lighting, varying facial expressions, face rotation, occlusion, and a small number of samples. Under these conditions, the detection and recognition results of traditional methods are not acceptable. In this study, we therefore use convolutional neural networks to overcome these problems and improve the recognition rate of the face recognition system.
The proposed system consists of two parts. In the first part, we use Faster R-CNN (Faster Region Convolutional Neural Network), trained with sufficient samples, to recognize faces under varying lighting conditions, blur, and varying views. In the second part, we use a Siamese neural network to recognize faces in the minor classes, which have only a few samples.
In the experiments, we use our own videos to test face detection and recognition in various environments, with different lighting conditions, face sizes, and face directions. In the detection stage, the detection rate reaches 96.84% and the false positive rate (misjudgment ratio) is almost 0%. For face recognition on 1920×1080 images, the recognition rate is 99.65% at 12.76 frames per second (FPS). On 960×540 images, the speed is 24.03 FPS. With the Siamese network, which compares pairs of face images, we achieve a recognition rate of 92.4%.
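The idea of classifying a low-shot face by comparing it against a few stored examples can be sketched as follows. This is a minimal illustration and not the thesis's network: it assumes that some shared embedding network (not shown) has already turned each face into a feature vector, and it scores a pair with a sigmoid over a weighted absolute feature difference, in the spirit of Siamese one-shot recognition. The embeddings, `weights`, and `bias` are hypothetical stand-ins for learned values.

```python
import math
from typing import Dict, List, Sequence

Vector = Sequence[float]


def feature_difference(a: Vector, b: Vector) -> List[float]:
    """Element-wise absolute difference between two embeddings."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return [abs(x - y) for x, y in zip(a, b)]


def similarity(a: Vector, b: Vector, weights: Vector, bias: float = 0.0) -> float:
    """Score in (0, 1); larger means the two faces look more alike.

    Assumption: non-negative weights, so a larger feature difference
    always lowers the score. In a trained network these are learned.
    """
    diff = feature_difference(a, b)
    if len(weights) != len(diff):
        raise ValueError("weights must match the embedding length")
    score = bias - sum(w * d for w, d in zip(weights, diff))
    return 1.0 / (1.0 + math.exp(-score))


def classify(query: Vector, support: Dict[str, List[Vector]],
             weights: Vector, bias: float = 0.0) -> str:
    """Return the label whose stored example is most similar to the query."""
    best_label, best_score = None, -1.0
    for label, examples in support.items():
        for example in examples:
            s = similarity(query, example, weights, bias)
            if s > best_score:
                best_label, best_score = label, s
    if best_label is None:
        raise ValueError("support set is empty")
    return best_label


if __name__ == "__main__":
    # Made-up 3-dimensional embeddings, one stored example per person.
    support = {
        "alice": [[0.9, 0.1, 0.0]],
        "bob": [[0.1, 0.8, 0.3]],
    }
    weights = [1.0, 1.0, 1.0]
    print(classify([0.85, 0.15, 0.05], support, weights))
```

Because each class needs only one or a few stored embeddings, a new person can be enrolled without retraining a full classifier, which is the property that makes pairwise comparison attractive when samples are scarce.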

Keywords

★ Deep learning
★ Face recognition
★ Real-time system
★ Low-shot learning

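Table of contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Motivation
1.2 System architecture
1.3 Thesis organization
Chapter 2 Related work
2.1 Detection systems based on convolutional neural networks
2.2 Class-imbalance learning
2.3 One-shot learning
Chapter 3 Faster R-CNN
3.1 Overview of Faster R-CNN
3.2 Convolutional neural network architecture
3.3 Region proposal network
3.4 Fast R-CNN
Chapter 4 Low-shot learning
4.1 Overview of the Siamese network
4.2 Siamese network architecture
Chapter 5 Experiments and results
5.1 Experimental equipment
5.2 Faster R-CNN experiments and results
5.3 Low-shot experiments and results
Chapter 6 Conclusions and future work
References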

Advisor

曾定章 (Din-Chang Tseng)

Approval date

2017-8-16
