Object Detection and Recognition Using a Two-level Cascade Classifier and a Single Convolutional Feature Map

Name

Gang-feng Ho (何岡峯)

Department

Department of Computer Science and Information Engineering

Thesis Title

Object Detection and Recognition Using a Two-level Cascade Classifier and a Single Convolutional Feature Map
(兩階段串聯式分類器與單一迴積特徵映射之物件偵測與辨識)

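Related Theses

★ Real-time online identity recognition using viseme and voice biometric features	★ An image-based alignment system for SMD packaging carrier tape
★ A study on detecting forged or altered content on handheld mobile devices and recovering deleted data	★ License plate verification based on the SIFT algorithm
★ Local pattern features based on dynamic linear decision functions applied to face recognition	★ A GPU-based SAR database simulator: a parallel architecture for SAR echo signal and image databases (PASSED)
★ Personal identity verification using palmprints	★ Video indexing using color statistics and camera motion
★ Form document classification using field clustering features and four-directional adjacency trees	★ Stroke features for offline Chinese character recognition
★ Image motion vector estimation using adaptive block matching combined with multi-frame information	★ Color image analysis and its application to color-quantized image retrieval and face detection
★ Extraction and recognition of logos on Chinese and English business cards	★ Chinese signature verification using virtual stroke features
★ Face detection, face pose classification, and face recognition based on triangle geometry and color features	★ A skin-color-based complementary face detection strategy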

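This electronic thesis is authorized for immediate open access.
The open-access electronic full text is licensed only for users to search, read, and print for personal, non-profit academic research purposes.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization, to avoid violating the law.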

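Abstract (Chinese, translated)

In recent years, object detection and recognition have seen extensive research and application in computer vision, and the detection and recognition of faces and license plates are among the most common application areas. For detection, a classifier is usually trained first to locate the object. Common classifiers include neural networks and support vector machines, but their computational cost is usually high. Researchers later proposed a cascade classifier trained with Adaboost that uses Haar features together with the integral image to reject background regions quickly and achieve real-time detection, but it requires a very long training time.
In this study, a feature map based on a convolutional neural network and a two-level cascade classifier are proposed for object detection. First, the convolution and sub-sampling operations reduce the effect of illumination, rotation, and distortion on the object. The two classifiers that follow then filter out a large amount of background using a coarse-to-fine mechanism. Because the input to the coarse-level classifier is the sub-sampled feature map, the number of windows to be checked shrinks to a quarter of that for the original image size. The remaining windows are then examined further by the fine-level classifier. Besides improving the detection pipeline, the proposed architecture also greatly speeds up training. The small number of features generated from the smaller window is used to train the coarse-level classifier. In addition, when training the fine-level classifier, a feature ranking algorithm keeps only a few discriminative features from the large feature pool, speeding up training without reducing detection performance. In the final experiments, the proposed detection algorithm achieved better results than the other well-known algorithms it was compared with.
For face recognition, this study proposes orthogonal nearest neighbour feature line embedding (ONNFLE). Nearest feature line embedding (NFLE) suffers from interpolation and extrapolation errors, and its computational cost grows sharply as the number of samples increases. To address these shortcomings, we first select neighbouring sample points when generating the nearest feature lines, and then generate the feature lines from these neighbouring points. This reduces both the interpolation/extrapolation errors and the growth in computation mentioned above. The final experimental results show that the improved recognition algorithm is markedly effective.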

Abstract (English)

In recent years, research on and applications of object detection and recognition have grown rapidly in computer vision. Among these, the detection and recognition of human faces and license plates are typical applications. For detection, an object is first located by a trained classifier. Commonly used classifiers include neural networks and support vector machines, but their computational load is very heavy. To remedy this drawback, researchers proposed a cascade classifier trained by Adaboost that combines Haar-like features with integral images to quickly filter out background regions and achieve real-time detection. However, training such classifiers is still time-consuming.
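The integral-image idea mentioned above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows; the function names (`integral_image`, `rect_sum`, `haar_two_rect_horizontal`) are hypothetical and are not taken from the dissertation. It shows why any rectangular pixel sum, and therefore a Haar-like feature, costs a constant number of lookups.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels inside a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_horizontal(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half (width must be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


# Toy 4x4 image, purely illustrative.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = integral_image(image)
total = rect_sum(ii, 0, 0, 4, 4)                     # 136
feature = haar_two_rect_horizontal(ii, 0, 0, 4, 4)   # 60 - 76 = -16
```

Once the table is built in a single pass, the cost of each feature no longer depends on the rectangle size, which is what makes scanning many windows affordable.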
In this dissertation, an object detector is proposed based on a convolution/sub-sampling feature map and a two-level cascade classifier. First, a convolution/sub-sampling operation is proposed to alleviate the effects of illumination, rotation, and distortion variations. Then, two classifiers are concatenated to check a large number of windows using a coarse-to-fine strategy. Since the sub-sampled feature map with enhanced pixels is fed into the coarse-level classifier, the size of the feature map is drastically reduced to a quarter of the original image. The few surviving windows are further checked, with detailed data, by the fine-level classifier. In addition to improving the detection process, the proposed mechanism also speeds up the training process. A few features generated from the prototypes within the small window are selected and used to train the coarse-level classifier. Moreover, a feature ranking algorithm is proposed to reduce the huge feature pool to a small set, speeding up the training process without losing the generality of the feature pool. Finally, experiments were conducted to show the feasibility of the proposed method.
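The coarse-to-fine flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dissertation's implementation: the kernel is a placeholder rather than a learned one, the two classifiers are simple threshold rules standing in for trained classifiers, the fine stage is assumed to read the full-resolution convolved map, and every function name is hypothetical.

```python
def convolve_valid(img, kernel):
    """Plain 2-D 'valid' correlation of img with kernel (no padding)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(img) - kh + 1, len(img[0]) - kw + 1
    return [[sum(img[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def subsample_2x2(fmap):
    """Average each non-overlapping 2x2 block, halving both dimensions."""
    h, w = len(fmap) // 2 * 2, len(fmap[0]) // 2 * 2
    return [[(fmap[y][x] + fmap[y][x + 1]
              + fmap[y + 1][x] + fmap[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]


def windows(grid, size):
    """Yield (top, left, patch) for every size x size window, stride 1."""
    for top in range(len(grid) - size + 1):
        for left in range(len(grid[0]) - size + 1):
            yield top, left, [row[left:left + size] for row in grid[top:top + size]]


def mean(patch):
    return sum(map(sum, patch)) / float(len(patch) * len(patch[0]))


def detect(img, kernel, coarse, fine, coarse_size):
    """Scan the sub-sampled map with `coarse`; re-check survivors with `fine`."""
    full = convolve_valid(img, kernel)
    small = subsample_2x2(full)          # a quarter as many positions as `full`
    hits = []
    for top, left, patch in windows(small, coarse_size):
        if not coarse(patch):
            continue                     # most background windows stop here
        # Assumption: the fine stage sees the matching full-resolution region.
        t, l, s = 2 * top, 2 * left, 2 * coarse_size
        detail = [row[l:l + s] for row in full[t:t + s]]
        if fine(detail):
            hits.append((t, l, s))
    return hits


# Toy input: an 8x8 image with a bright 4x4 square at rows 2-5, columns 2-5.
img = [[9 if 2 <= y <= 5 and 2 <= x <= 5 else 0 for x in range(8)]
       for y in range(8)]
identity = [[1]]                                  # placeholder kernel, not learned
coarse = lambda p: mean(p) > 4.0                  # stand-in for the coarse classifier
fine = lambda p: min(map(min, p)) > 4.0           # stand-in for the fine classifier
hits = detect(img, identity, coarse, fine, coarse_size=2)  # [(2, 2, 4)]
```

In this toy run the coarse rule passes five of the nine candidate windows and the fine rule keeps only the one aligned with the bright square, so the more expensive check runs on a small fraction of positions.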
As for recognition, a novel manifold learning algorithm for face recognition, called orthogonal nearest neighbour feature line embedding (ONNFLE), is also proposed. ONNFLE resolves two drawbacks of our earlier nearest feature line embedding (NFLE) method: the extrapolation/interpolation error and the high computational load. The extrapolation error occurs when the distance from a specified point to a line is small even though that line passes through two points far from it. The scatter matrix generated by the invalid discriminant vectors does not efficiently preserve the local topological structure, which results in incorrect selection and reduces recognition performance. To remedy this problem, the nearest neighbour (NN) selection strategy is used in the proposed method. The same selection strategy also reduces the high computational load. Finally, experiments were conducted to demonstrate the effectiveness of the proposed algorithm.
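The extrapolation error and the effect of nearest-neighbour selection can be illustrated with a small sketch. It covers only the point-to-feature-line distance and the restriction of candidate lines to the k nearest samples; it does not implement the embedding or the orthogonalization step of ONNFLE. The function names and the parameter `k` are assumptions made for illustration.

```python
import math
from itertools import combinations


def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def feature_line_projection(q, x1, x2):
    """Project q onto the line through x1 and x2.

    Returns (t, distance). 0 <= t <= 1 means the projection falls between
    the two samples (interpolation); otherwise it is an extrapolation.
    """
    d = [b - a for a, b in zip(x1, x2)]
    t = sum((qi - a) * v for qi, a, v in zip(q, x1, d)) / sum(v * v for v in d)
    p = [a + t * v for a, v in zip(x1, d)]
    return t, euclid(q, p)


def nearest_feature_line(q, samples, k=None):
    """Closest feature line to q as (distance, t, x1, x2).

    If k is given, lines are built only from the k nearest neighbours of q,
    which also cuts the number of pairs to examine.
    """
    pool = samples
    if k is not None:
        pool = sorted(samples, key=lambda s: euclid(q, s))[:k]
    best = None
    for x1, x2 in combinations(pool, 2):
        t, dist = feature_line_projection(q, x1, x2)
        if best is None or dist < best[0]:
            best = (dist, t, x1, x2)
    return best


q = (0.0, 0.1)
samples = [(1.0, 1.0), (0.0, 1.0), (10.0, 0.0), (20.0, 0.0)]

all_pairs = nearest_feature_line(q, samples)        # picks the far pair, t = -1.0
nn_only = nearest_feature_line(q, samples, k=2)     # picks the two near samples
```

Without selection, the line through the two distant samples wins with a distance of about 0.1 and `t = -1.0`, an extrapolation well outside the segment. With `k=2`, only the two samples near `q` form a line, giving a distance of about 0.9 and `t = 0.0`.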

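Keywords (Chinese)

★ Face detection (人臉偵測)
★ License plate detection (車牌偵測)
★ Face recognition (人臉辨識)
★ Convolutional neural network (迴積類神經網路)

Keywords (English)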

★ Face Detection
★ Plate Detection
★ Face Recognition
★ Convolutional Neural Network

Table of Contents

Chinese Abstract
Abstract
Chapter 1: Introduction
1.1 Motivation
1.2 Organization of the Dissertation
Chapter 2: Review of the Related Works
2.1 Previous Methods for Object Detection
2.1.1 Convolutional Neural Network (CNN)
2.1.2 Adaboost
2.1.3 Forward Feature Selection (FFS)
2.2 Previous Methods for Face Recognition
2.2.1 Locally Preserving Projection Algorithm
2.2.2 Modified Nearest Feature Line Method
2.2.3 Nearest Feature Line Embedding
Chapter 3: Two-Level Classifier
3.1 Proposed System Overview
3.2 Single Convolution-Subsampling Feature Map
3.3 Feature Ranking
3.4 Post Process
3.4.1 Post Process for Face Detection
3.4.2 Post Process for Plate Detection
Chapter 4: Face Recognition
4.1 Nearest Neighbour Feature Line Embedding (NNFLE)
4.2 Orthogonal Nearest Neighbour Feature Line Embedding (ONNFLE)
Chapter 5: Experimental Results
5.1 Face Detection
5.2 License Plate Detection
5.3 Face Recognition
Chapter 6: Conclusions and Future Works
References

Advisor

Kuo-chin Fan (范國清)

Approval Date

2013-7-24
