The Study of Salient Object and BOF with SIFT for Image Retrieval

Name

Cheng-Wei Lin (林政威)

Department

Department of Information Management

Thesis Title

The Study of Salient Object and BOF with SIFT for Image Retrieval
(顯著物件與尺度不變特徵轉換特徵包比對之影像搜尋研究)

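This electronic thesis is authorized for immediate open access.
Full texts that have reached open access are licensed to users only for searching, reading, and printing for personal, non-profit academic research.
Please observe the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid breaking the law.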

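Abstract (translated from Chinese)

Effective retrieval of digital images has become an important research topic in the field of image retrieval. In the 1990s, content-based image retrieval relied mainly on extracting low-level image features, yet a semantic gap remains between low-level visual features and high-level semantic concepts. This study proposes an image retrieval system that combines a bag-of-features (BOF) model built on the scale-invariant feature transform (SIFT) with the concept of salient objects in images. The system takes an object image as the query, searches by the objects that images contain, and is implemented as a working image search system.
The study uses salient object detection to identify the salient object in each image and to reduce the influence of background noise on the object. After salient object detection, SIFT features are extracted from each processed image, and the K-means clustering algorithm clusters all image feature vectors to obtain each image's BOF vector. SIFT features are likewise extracted from the object image. Using the codebook computed from the dataset images, the number of the object image's SIFT features that fall into each visual word is counted, which gives the object image's BOF vector.
Ten categories totalling one thousand images were compiled from the MSRA-A image dataset for the experiments. Experiment 1 found that, for salient object detection, rectangular salient images perform better. Experiment 2 examined how codebook size affects retrieval accuracy and found that retrieval works best with 200 clusters. Experiment 3 examined whether object images can be used for image search and found that searching for target images by object concept does achieve object-based image search. The sensitivity analysis shows that supplying more varied object images through a transformation function yields more precise search results.
The results confirm that images can be searched by object concept, and that combining salient objects with BOF and SIFT improves retrieval accuracy compared with earlier approaches that do not incorporate salient object detection. Finally, an image search system was implemented with the improved query method and the improved retrieval accuracy.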
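
The bag-of-features step described in the abstract above can be illustrated with a minimal sketch. It is not the thesis's own code: the descriptors here are random 128-dimensional stand-ins (an assumption made so the example runs without image files), the helper names `kmeans`, `assign`, and `bof_vector` are hypothetical, and the codebook is kept tiny for speed. In practice the descriptors would come from a SIFT extractor applied to the salient-object images, for example OpenCV's `cv2.SIFT_create().detectAndCompute`.

```python
import numpy as np


def assign(descriptors, centers):
    """Index of the nearest center (visual word) for every descriptor."""
    # Squared Euclidean distances, shape (n_descriptors, k).
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    # Start from k distinct points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    return centers


def bof_vector(descriptors, codebook):
    """Count descriptors per visual word and L1-normalise the histogram."""
    words = assign(descriptors, codebook)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()


# Synthetic stand-ins for 128-D SIFT descriptors (assumption, see lead-in).
rng = np.random.default_rng(42)
dataset_descriptors = rng.random((500, 128))  # pooled from all dataset images
query_descriptors = rng.random((40, 128))     # from one object (query) image

# k = 8 only to keep the sketch fast; the abstract reports 200 worked best.
codebook = kmeans(dataset_descriptors, k=8)
query_bof = bof_vector(query_descriptors, codebook)
print(query_bof.shape, query_bof.sum())
```

The L1 normalisation is a design choice of this sketch, made so that images with different numbers of keypoints stay comparable; the abstract only states that features are counted per visual word.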

Abstract (English)

Effective search of digital images has become increasingly important in the field of image retrieval (IR). In the 1990s, content-based image retrieval indexed images by their low-level features, but a semantic gap exists between low-level features and high-level semantic concepts. This study proposes an image retrieval system based on bag-of-features (BOF) with the scale-invariant feature transform (SIFT), combined with salient object detection. The system searches by the objects contained in an image and is implemented as a working image retrieval system.
This research detects the salient object in each image through salient object detection, which reduces the influence of background noise. SIFT features are then extracted from each salient image in the image database and clustered with the K-means clustering algorithm to form the codebook. SIFT features are also extracted from the object image, and each one is assigned to its nearest cluster center in the codebook, so that the object image's SIFT features are quantized with this visual vocabulary. Finally, the object image is represented as a set of visual words.
In the experiments, the image database is a subset of the MSRA-A image dataset. It contains 1,000 images divided equally into 10 categories. The first experiment showed that rectangular salient images perform better than original salient images for salient object detection. The second experiment, on the influence of codebook size on retrieval performance, showed that the best size for this dataset is 200. The third experiment showed that the object concept is useful for finding similar images that contain the object. The sensitivity analysis showed that providing a variety of query images through transformations of the object image achieves better retrieval performance.
In conclusion, object images can improve the accuracy of image retrieval based on BOF with SIFT combined with salient objects. Finally, the study implements an image retrieval system with the changed query method and the improved retrieval precision.
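
As a rough illustration of how a query BOF vector might be matched against the database and scored, the sketch below ranks images by similarity and computes precision over the top results. The abstract does not state which similarity measure or precision definition the thesis uses, so cosine similarity and precision-at-k are assumptions here, as are the toy histograms and the function names `cosine_similarity` and `precision_at_k`.

```python
import numpy as np


def cosine_similarity(query, database):
    """Cosine similarity between one BOF vector and each row of a matrix."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    return db @ q


def precision_at_k(ranked_labels, target_label, k):
    """Fraction of the top-k retrieved images that belong to the target category."""
    top = ranked_labels[:k]
    return float(np.mean(top == target_label))


# Toy database: four BOF histograms over a 3-word codebook, with category labels.
database = np.array([
    [0.8, 0.1, 0.1],   # category 0
    [0.7, 0.2, 0.1],   # category 0
    [0.1, 0.8, 0.1],   # category 1
    [0.1, 0.1, 0.8],   # category 2
])
labels = np.array([0, 0, 1, 2])

query = np.array([0.9, 0.05, 0.05])   # BOF vector of a category-0 object image
scores = cosine_similarity(query, database)
order = np.argsort(-scores)           # most similar image first
print(order[:2], precision_at_k(labels[order], target_label=0, k=2))
```

With these toy values the two category-0 images rank first, so precision over the top two results is 1.0, while over all four images it drops to 0.5.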

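Keywords (Chinese)

★ Image retrieval (影像檢索)
★ Content-based image retrieval (基於內容之影像檢索)
★ Scale-invariant feature transform (尺度不變特徵轉換)
★ Bag of features (特徵包)
★ K-means clustering algorithm (K均數分群演算法)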

Keywords (English)

★ Image retrieval
★ Content-Based Image Retrieval
★ Scale Invariant Feature Transform
★ Bag of Features
★ K-means clustering algorithm

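Table of Contents

Abstract (Chinese)
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1-1. Research Background
1-2. Research Motivation
1-3. Research Objectives
1-4. Research Contributions
Chapter 2. Literature Review
2-1. Content-Based Image Retrieval
2-1-1. Bag-of-Features Model
2-1-2. Scale-Invariant Feature Transform
2-2. Visual Attention
2-2-1. Salient Object Detection
Chapter 3. System Development
3-1. System Architecture
3-1-1. Salient Object Detection
3-1-2. Bag-of-Features Conversion
3-1-3. Similarity Measurement
3-2. System Implementation and Functions
3-2-1. System Functions
Chapter 4. Experimental Design and Results
4-1. Experimental Environment
4-1-1. Dataset
4-1-2. Object Images
4-1-3. Performance Evaluation Method
4-2. Experimental Design
4-2-1. Salient Object Cropping Method
4-2-2. Codebook Size
4-2-3. Performance of Combining Salient Objects with the SIFT Bag of Features
4-2-4. Image Search Accuracy
4-3. Analysis of Experimental Results
4-3-1. Analysis of the Salient Object Cropping Method
4-3-2. Analysis of Codebook Size
4-3-3. Performance Analysis of Combining Salient Objects with the SIFT Bag of Features
4-3-4. Analysis of Image Search Accuracy
4-4. Sensitivity Analysis
4-4-1. Analysis Results
4-5. Summary
Chapter 5. Conclusions and Future Research Directions
5-1. Conclusions
5-2. Research Limitations and Future Directions
References
Appendix 1: Chinese-English Glossary
Appendix 2: System Source Code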

Advisor

Yih-Chearng Shiue (薛義誠)

Approval Date

2015-7-15
