Master's/Doctoral Thesis 106522116: Detailed Record




Name: 黃裕庭 (Yu-Ting Huang)    Department: Computer Science and Information Engineering
Thesis Title: 最近特徵線嵌入網路之影像物件辨識系統 (Image Object Recognition System Based on Nearest Feature Line Embedding Network)
Related Theses
★ MFNet: A Multi-Level Feature Fusion Neural Network for 3D Vehicle Detection Based on Point Clouds and RGB Images
★ Face and Behavior Analysis Using Bag-of-Words Features
★ Multi-Proxy Loss: A Metric-Learning-Based Loss Function for Fine-Grained Image Retrieval
Files: full text available in the repository after 2024-12-23
Abstract (Chinese) Technology advances rapidly. As computer hardware keeps improving, so does image recognition: for a computer, deciding whether a picture shows a dog or a cat has become a simple task. Achieving high recognition accuracy, however, still requires many conditions: a GPU with strong computing power, datasets of thousands to tens of thousands of samples, and time to train and tune parameters. As artificial intelligence spreads, every field wants machine learning and deep learning to deliver the results it needs, yet when AI is applied outside academia or in less popular domains, insufficient data is the first problem to surface. In addition, many of the machines used in industry are not equipped with GPUs whose computing performance can assist training.
Today's mainstream high-accuracy image recognition methods are still mostly CNN-based architectures, so strong GPU computing power and considerable training time are needed to train them successfully. PCANet combines traditional feature extraction with a neural-network-like architecture, but there is still substantial room for improvement. This thesis adopts an architecture similar to PCANet, with filters likewise designed by a traditional method, but replaces the PCA step with the nearest feature line (NFL) strategy, whose characteristic is that it maintains very good accuracy when the amount of data is small. The core of this thesis is to analyze and process images with a PCANet-like architecture, extract the necessary features with NFL, and finally classify the images with an SVM.
The experimental results show that when trained on small datasets of roughly 500 to 1,000 samples, NFLENet achieves 5% to 10% higher recognition accuracy than PCANet, and because less data is used, the training time is also greatly reduced.
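For readers unfamiliar with the nearest feature line idea invoked above, the sketch below shows the point-to-feature-line distance on which the strategy rests: a query is projected onto the line through each pair of same-class prototypes, and the smallest residual decides the class. This is a generic illustration rather than the thesis's own code; `feature_line_distance` and `nfl_classify` are hypothetical names.

```python
import numpy as np
from itertools import combinations

def feature_line_distance(x, xi, xj):
    """Distance from query x to the feature line through prototypes xi and xj."""
    d = xj - xi
    t = np.dot(x - xi, d) / np.dot(d, d)       # projection parameter along the line
    return np.linalg.norm(x - (xi + t * d))    # residual between x and its projection

def nfl_classify(x, prototypes, labels):
    """Assign x to the class whose nearest feature line lies closest to it."""
    best_label, best_dist = None, np.inf
    for c in set(labels):
        pts = [np.asarray(p) for p, l in zip(prototypes, labels) if l == c]
        for xi, xj in combinations(pts, 2):
            if np.allclose(xi, xj):            # skip degenerate pairs
                continue
            dist = feature_line_distance(np.asarray(x), xi, xj)
            if dist < best_dist:
                best_label, best_dist = c, dist
    return best_label
```

Because n prototypes of a class generate n(n-1)/2 feature lines that interpolate virtual samples between prototype pairs, this construction is commonly cited as the reason NFL-style methods hold up well when training data is scarce.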
Abstract (English) With the continuous improvement of computer hardware, image recognition technology has also kept improving. For a computer, deciding whether a picture shows a dog or a cat has become a simple task. However, achieving high-accuracy recognition requires many conditions: GPUs with strong computing power, thousands to tens of thousands of training samples, and time for training. With the growing popularity of artificial intelligence, different industries need machine learning and deep learning to reach their goals, but when AI must be applied outside academia or in less popular fields, the amount of available data is often insufficient. In addition, many machines owned by industry are not equipped with GPUs whose computing performance can assist training.
At present, the mainstream high-accuracy image recognition methods are still CNN-based architectures, which require strong GPU computing power and considerable training time to train successfully. PCANet combines a traditional feature-extraction method with a neural-network-like architecture, but there is still substantial room for improvement. This thesis uses an architecture similar to PCANet but replaces the PCA step with nearest feature line embedding (NFL), which maintains very good accuracy when the amount of data is small. The core of this thesis is to analyze and process images with a PCANet-like architecture, extract the necessary features with NFL, and classify the images with an SVM.
According to the experimental results, NFLENet obtains 5% to 10% higher recognition accuracy than PCANet when the amount of data is small, roughly 500 to 1,000 training samples, and the training time is greatly reduced because less data is used.
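The pipeline the abstract outlines (patch extraction, two stages of learned filters and convolution, binary hashing with block-wise histograms, and an SVM classifier) can be sketched as below. This is only a minimal illustration of the PCANet-style skeleton under stated assumptions: the filter-learning step uses PCA eigenvectors as a stand-in, since the thesis's NFL-based filter derivation is not reproduced in this record, and the function names, 7x7 patch size, 8 filters per stage, and 8x8 histogram blocks are illustrative choices, not the thesis's settings.

```python
import numpy as np
from scipy.signal import convolve2d
from sklearn.svm import LinearSVC

def extract_patches(img, k=7):
    """All k x k patches of a 2-D image, flattened and patch-mean removed."""
    h, w = img.shape
    patches = np.array([img[i:i + k, j:j + k].ravel()
                        for i in range(h - k + 1)
                        for j in range(w - k + 1)])
    return patches - patches.mean(axis=1, keepdims=True)

def learn_filters(maps, k=7, n_filters=8):
    """Stand-in filter learning: leading eigenvectors of the patch covariance
    (the PCANet choice). NFLENet would instead derive this projection from a
    nearest-feature-line scatter; that step is not shown here."""
    allp = np.vstack([extract_patches(m, k) for m in maps])
    _, vecs = np.linalg.eigh(allp.T @ allp / len(allp))
    return [vecs[:, -(i + 1)].reshape(k, k) for i in range(n_filters)]

def convolve_stage(maps, filters):
    """Convolve every input map with every filter (input-major ordering)."""
    return [convolve2d(m, f, mode="same") for m in maps for f in filters]

def hash_and_histogram(stage2_maps, n_filters=8, block=8):
    """PCANet-style output layer: binarize the second-stage maps, pack each
    group of n_filters maps into integer codes, then take block-wise histograms."""
    feats = []
    for g in range(0, len(stage2_maps), n_filters):
        code = sum((stage2_maps[g + l] > 0).astype(np.int64) << l
                   for l in range(n_filters))
        h, w = code.shape
        hist = [np.bincount(code[i:i + block, j:j + block].ravel(),
                            minlength=2 ** n_filters)
                for i in range(0, h - block + 1, block)
                for j in range(0, w - block + 1, block)]
        feats.append(np.concatenate(hist))
    return np.asarray(feats, dtype=np.float64)

# Usage sketch on hypothetical data (train_imgs: list of equal-sized 2-D arrays, train_y: labels):
# f1 = learn_filters(train_imgs)
# s1 = convolve_stage(train_imgs, f1)          # first-stage feature maps
# f2 = learn_filters(s1)
# s2 = convolve_stage(s1, f2)                  # second-stage feature maps
# X = hash_and_histogram(s2).reshape(len(train_imgs), -1)
# clf = LinearSVC().fit(X, train_y)            # linear SVM on the histogram features
```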
Keywords (Chinese) ★ Image object recognition
★ Feature extraction
★ Nearest feature line embedding
Keywords (English)
Thesis Outline
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Related Research
2.2 Principal Component Analysis
2.3 Convolution
2.4 PCANet
2.5 Linear Discriminant Analysis
2.6 LDANet & RandNet
2.7 Support Vector Machine
2.8 Binary Hashing Encoding
Chapter 3 Methodology
3.1 NFLENet Architecture
3.2 Input Layer
3.3 First Stage: First NFL Encoding and Convolution
3.3.1 Image Patch Extraction
3.3.2 Nearest Feature Line (NFL) Encoding
3.3.3 Convolution
3.4 Second Stage: Second NFL Encoding and Convolution
3.5 Output Layer: Feature Output
3.5.1 Binary Hashing Encoding & Block-wise Histogram
3.5.2 Support Vector Machine (SVM)
Chapter 4 Experimental Results
4.1 Experimental Environment and Datasets
4.1.1 MNIST Handwritten Digit Dataset
4.1.2 CIFAR-10 Object Recognition Dataset
4.1.3 Extended Yale B Face Dataset
4.1.4 PubFig Face Dataset
4.2 Experiment Description
4.3 Experimental Data
4.4 Experimental Conclusions
Chapter 5 Conclusions and Future Work
References
Advisors: 范國清 (Kuo-Chin Fan), 韓欽銓 (Chin-Chuan Han)    Date of Approval: 2019-12-26