Electronic Theses and Dissertations: Detailed Record 984203021




Name: 鍾穎慧 (Yong-Hui Chung)    Department: Information Management
Thesis Title: Object Recognition with Different Image Resolution and Different Feature Representation
(Chinese title: 不同影像尺寸與不同特徵表達對影像辨識之影響)
Related Theses
★ Building a sales forecasting model for commercial multifunction printers using data mining techniques
★ Applying data mining techniques to resource allocation forecasting: a case study of a computer OEM support unit
★ Applying data mining techniques to flight delay analysis in the airline industry: a case study of Company C
★ Safety control of new products in the global supply chain: a case study of Company C
★ Data mining in the semiconductor laser industry: a case study of Company A
★ Applying data mining techniques to predicting warehouse storage time of air export cargo: a case study of Company A
★ Optimizing YouBike redistribution operations with data mining classification techniques
★ The effect of feature selection on different data types
★ Data mining for B2B corporate websites: a case study of Company T
★ Customer investment analysis and recommendations for financial derivatives: integrating clustering and association rule techniques
★ Building a computer-aided liver ultrasound image classification model with convolutional neural networks
★ An identity recognition system based on convolutional neural networks
★ Comparative error analysis of energy-data imputation methods in an energy management system
★ Development of an employee sentiment analysis and management system
★ Data cleaning for the class imbalance problem: a machine learning perspective
★ Applying data mining techniques to passenger self-service check-in analysis: a case study of Airline C
Access: the author has approved immediate open access to this electronic thesis.

Abstract (Chinese, translated): With the rapid growth of the Internet, the amount of image data keeps increasing. Because manually labelling images is too time-consuming, automatic object recognition and image annotation have emerged as active research topics. Prior work has focused on annotating images in large image databases efficiently and accurately. However, as the number of images grows, annotating images at their original size slows the algorithms and consumes a large amount of storage; in addition, different feature representations may also affect annotation accuracy. This study therefore examines how these two factors affect image annotation accuracy.
This study uses three data sets (Corel, PASCAL VOC2008, and Corel 5000), resamples the images with bicubic interpolation, the most widely used image interpolation method, to 256x256, 128x128, 64x64, 32x32, and 16x16, and compares two feature representations: local features and the bag-of-words model.
The experimental results show that the choice of feature representation does affect annotation accuracy, whereas image resolution has little effect, and different feature representations are affected by image resolution to different degrees.
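
The following is a minimal sketch, not taken from the thesis, of the resampling step described above: each image is rescaled to the five target sizes with bicubic interpolation. It assumes Python with the Pillow library; the directory paths are placeholders.

# Illustrative sketch of the bicubic resampling step; not the thesis code.
from pathlib import Path
from PIL import Image

TARGET_SIZES = [256, 128, 64, 32, 16]   # resolutions named in the abstract

def resample_dataset(src_dir, dst_dir):
    # Rescale every JPEG in src_dir to each target size with bicubic interpolation.
    for img_path in Path(src_dir).glob("*.jpg"):
        img = Image.open(img_path).convert("RGB")
        for size in TARGET_SIZES:
            out_dir = Path(dst_dir) / f"{size}x{size}"
            out_dir.mkdir(parents=True, exist_ok=True)
            img.resize((size, size), Image.BICUBIC).save(out_dir / img_path.name)

# Example call (hypothetical paths): resample_dataset("corel/original", "corel/resampled")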
Abstract (English): With the growth of the Internet and the increasing number of web images, manual image annotation has become a difficult task and is far more time-consuming than automatic image annotation. Most prior research has proposed algorithms for matching keywords to images accurately. However, those methods annotate images at their original resolution, which can cost more time and storage. In addition, different feature representation approaches can lead to different annotation performance. We therefore annotated images at different resolutions and with different feature representation approaches and examined the effect of these two factors.
We chose Corel, PASCAL VOC2008, and Corel 5000 as our experimental data sets and used bicubic interpolation to rescale them to 256x256, 128x128, 64x64, 32x32, and 16x16 resolution. Furthermore, two feature representations were compared: local features and bag-of-words. In the annotation step, we used support vector machine (SVM) and k-nearest neighbor (KNN) classifiers.
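
As an illustration of the bag-of-words representation mentioned above, the sketch below builds a visual-word codebook from SIFT descriptors with k-means and turns each image into a normalized word histogram. The library choices (OpenCV, scikit-learn) and the codebook size are assumptions made for illustration, not details taken from the thesis.

# Hedged sketch of a bag-of-words image representation; parameters are illustrative.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def sift_descriptors(image_paths):
    # Extract 128-dimensional SIFT descriptors from each grayscale image.
    sift = cv2.SIFT_create()
    per_image = []
    for path in image_paths:
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        _, desc = sift.detectAndCompute(gray, None)
        per_image.append(desc if desc is not None else np.empty((0, 128)))
    return per_image

def bow_histograms(per_image_desc, codebook_size=200):
    # Cluster all descriptors into visual words, then count words per image.
    all_desc = np.vstack([d for d in per_image_desc if len(d)])
    codebook = KMeans(n_clusters=codebook_size, n_init=10).fit(all_desc)
    histograms = []
    for desc in per_image_desc:
        words = codebook.predict(desc) if len(desc) else np.array([], dtype=int)
        hist = np.bincount(words, minlength=codebook_size).astype(float)
        histograms.append(hist / max(hist.sum(), 1.0))   # frequency-normalized histogram
    return np.array(histograms), codebook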
Finally, the experimental results indicate that annotation accuracy did not decrease, while annotation time dropped sharply, as the image resolution was reduced. Comparing the two feature representations, local features outperformed bag-of-words, especially with the SVM classifier, whereas the bag-of-words representation was more stable than local features across resolutions.
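
The sketch below illustrates the annotation step with the two classifiers named in the abstract, SVM and k-nearest neighbors, using scikit-learn; the kernel, neighbor count, and train/test split are illustrative assumptions rather than the settings used in the thesis.

# Hedged sketch: train SVM and KNN classifiers on image feature vectors and report accuracy.
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def evaluate_classifiers(features, labels):
    # Hold out 30% of the images for testing; report accuracy for each classifier.
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.3, random_state=0)
    for name, clf in [("SVM", SVC(kernel="rbf")),
                      ("KNN", KNeighborsClassifier(n_neighbors=5))]:
        clf.fit(X_train, y_train)
        print(name, "accuracy:", accuracy_score(y_test, clf.predict(X_test)))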
Keywords
★ object recognition
★ feature representation
★ image annotation
★ image resolution
Table of Contents
Chapter 1: Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Thesis Organization
Chapter 2: Literature Review
2.1 Image Annotation
2.2 Feature Extraction and Representation
2.2.1 Global and Local Features
2.2.2 Bag-of-Words Model
2.3 Image Scaling Techniques
Chapter 3: Experimental Design
3.1 Data Sets
3.1.1 Corel Data Set
3.1.2 PASCAL VOC2008 Data Set
3.1.3 Corel 5000 Data Set
3.2 Experiment 1: The Effect of Different Image Sizes on Image Annotation Accuracy
3.2.1 Image Scaling
3.2.2 Image Segmentation
3.2.3 Feature Extraction and Representation
3.3 Experiment 2: The Effect of Different Feature Representations on Image Annotation Accuracy
3.3.1 Image Scaling
3.3.2 Local Feature Representation
3.3.3 Bag-of-Words Feature Representation
3.3.4 Annotation Classifiers
3.3.5 Evaluation Measures
Chapter 4: Experimental Results and Discussion
4.1 Experiment 1: The Effect of Different Image Sizes on Image Annotation Accuracy
4.1.1 Corel 190 Data Set
4.1.2 PASCAL VOC2008 Data Set
4.2 Experiment 2: The Effect of Different Feature Representations on Image Annotation Accuracy
4.2.1 Corel 5000 Data Set
4.2.2 PASCAL VOC2008 Data Set
Chapter 5: Conclusions and Future Work
5.1 Conclusions
5.2 Future Research Directions
References
Appendix 1
Appendix 2
Advisor: 蔡志豐 (Chih-Fong Tsai)    Date of Approval: 2011-07-26