Data Augmentation for 3D Face Recognition

Name

葉千瑋 (Chien-Wei Yeh)

Department

Department of Computer Science and Information Engineering

Thesis Title

Data Augmentation for 3D Face Recognition
(對於三維人臉識別的資料擴充應用)

Full-text access: never open to the public

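Abstract (Chinese)

Face recognition has been one of the most closely watched technologies in recent years. With the help of deep learning and better hardware, its practical value has grown and its recognition accuracy has risen. The amount of training data is highly correlated with the accuracy of deep learning: most well-known face recognition models are trained on more than a million face images, and data quality and dataset distribution bias also affect how well a model learns. Compared with 2D face recognition, however, deep learning for 3D face recognition has developed more slowly, largely because 3D face datasets are scarce. In this thesis, we try to improve the robustness of 3D face recognition with a data augmentation method designed for 3D faces. Using a large amount of synthesized virtual 3D face data, we vary facial expression and face pose to increase data diversity, and our experiments ask whether synthetic data can make 3D face recognition more robust. We confirm that synthetic face data effectively helps a 3D face recognition system.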

Abstract (English)

In recent years, deep learning has significantly improved the performance of 2D face recognition systems through the use of large-scale labeled image data. Deep neural networks can closely approach human-level performance, but they depend heavily on the amount and quality of facial training data. In contrast with 2D face recognition, however, training discriminative deep features for 3D face recognition is very difficult because large training datasets are unavailable, and recognition accuracy has already saturated on existing 3D face datasets because of their small gallery sizes. Unlike 2D photographs, large, annotated, high-quality 3D facial scans cannot be sourced from the web. In this thesis, we show that synthetically generated data can work effectively as a CNN training dataset for 3D face recognition when the CNN is fine-tuned with real-world data. We propose a 3D augmentation method for enlarging 3D facial data: using a 3D morphable face model, we can generate 3D facial data with an arbitrary number of facial identities and with facial expression and pose variations. Finally, our experiments use two real-world 3D facial datasets for comparison. Our method outperforms a 3D face recognition system trained only on real-world data, and we observe a significant accuracy improvement with the help of synthetic 3D facial data.
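
The augmentation idea in the abstract can be sketched with a linear 3D morphable model: a face is the mean shape plus weighted identity and expression bases, and the result is rotated to a new pose. The sketch below is illustrative only. It assumes random toy bases in place of a learned model, and the names `ToyMorphableModel`, `synthesize_face`, `augment_identity`, and `rotation_matrix` are hypothetical; it is not the thesis's actual model or pipeline.

```python
import numpy as np


def rotation_matrix(yaw, pitch, roll):
    """Build a 3x3 rotation from Euler angles in radians (R = Rz @ Ry @ Rx)."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    rot_y = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])  # yaw about y
    rot_x = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])  # pitch about x
    rot_z = np.array([[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]])  # roll about z
    return rot_z @ rot_y @ rot_x


class ToyMorphableModel:
    """Linear model: vertices = mean + id_basis @ alpha + expr_basis @ beta.

    The bases are random placeholders (an assumption for illustration);
    a real model would load bases learned from registered 3D scans.
    """

    def __init__(self, n_vertices=500, n_id=10, n_expr=5, seed=0):
        rng = np.random.default_rng(seed)
        self.n_vertices = n_vertices
        self.mean = rng.normal(size=3 * n_vertices)
        self.id_basis = rng.normal(size=(3 * n_vertices, n_id)) * 0.1
        self.expr_basis = rng.normal(size=(3 * n_vertices, n_expr)) * 0.05

    def synthesize_face(self, alpha, beta, yaw=0.0, pitch=0.0, roll=0.0):
        """Return an (n_vertices, 3) array for one identity, expression, and pose."""
        flat = self.mean + self.id_basis @ alpha + self.expr_basis @ beta
        vertices = flat.reshape(self.n_vertices, 3)
        # Rotate every vertex: v' = R v, written row-wise as V @ R.T.
        return vertices @ rotation_matrix(yaw, pitch, roll).T


def augment_identity(model, alpha, n_samples, rng, max_angle_deg=30.0):
    """Generate variants of one identity with random expression and pose."""
    samples = []
    for _ in range(n_samples):
        beta = rng.normal(size=model.expr_basis.shape[1])
        angles = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg, size=3))
        samples.append(model.synthesize_face(alpha, beta, *angles))
    return np.stack(samples)


rng = np.random.default_rng(42)
model = ToyMorphableModel()
alpha = rng.normal(size=model.id_basis.shape[1])  # coefficients of one synthetic identity
faces = augment_identity(model, alpha, n_samples=8, rng=rng)
print(faces.shape)  # (8, 500, 3)
```

Holding `alpha` fixed while resampling expression coefficients and pose angles gives many samples that share one identity label, which is what lets a synthetic set grow to an arbitrary size. The table of contents lists a face depth map synthesizer; projecting these vertices to depth images is omitted here.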

Keywords (Chinese)

★ 3D face recognition
★ 3D morphable face model
★ Data augmentation
★ Synthetic data
★ 3D face reconstruction

Keywords (English)

★ 3D Face Recognition
★ 3D Morphable Model
★ Data Augmentation
★ Data Synthesis
★ 3D Face Reconstruction

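Table of Contents

Chinese Abstract
Abstract
List of Figures
List of Tables
Table of Contents
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Method and Chapter Overview
Chapter 2 Related Work
2.1 Data Augmentation
2.2 Synthetic Data
2.3 3D Face Reconstruction
2.3.1 Generic Face Model
2.3.2 3D Morphable Face Model
2.3.3 Deep-Learning-Based 3D Face Reconstruction
Chapter 3 Deep Learning and Face Recognition
3.1 Overview of Deep Learning
3.1.1 Artificial Neural Networks
3.1.2 Deep Learning
3.1.3 Convolutional Neural Networks
3.2 2D Face Recognition
3.2.1 Eigenfaces
3.2.2 Local Binary Patterns
3.2.3 DeepFace
3.2.4 FaceNet
3.3 3D Face Recognition
Chapter 4 Experimental Framework
4.1 Face Depth Map Synthesizer
4.1.1 Face Shape
4.1.2 Facial Expression
4.1.3 3D Face Pose Variation
4.2 Feature Vector Learner
4.3 Experimental Validation
Chapter 5 Experimental Design and Results
5.1 Hardware and Software Configuration
5.2 Datasets
5.2.1 Real Face Datasets
5.2.2 Synthetic Face Datasets
5.3 Experimental Design
5.3.1 Training Parameters
5.3.2 Evaluation Metrics
5.3.3 Baseline: Gap Between Synthetic and Real Datasets
5.4 Results and Comparison
5.4.1 Controlling Synthesis Conditions
5.4.2 Assistance from Real Datasets
5.4.3 Drastically Reducing Real-Dataset Assistance
5.4.4 Increasing the Number of Synthetic Identities
5.5 Extended Experiments
Chapter 6 Conclusion and Future Work
References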

Advisor

王家慶 (Jia-Ching Wang)

Approval Date

2019-8-20
