Name 陳冠宇 (Kuan-Yu Chen)   Department Department of Information Management
Thesis title 基於關鍵點篩選於袋字模型之影像分類
(Keypoint Selection for Bag-of-Words Based Image Classification)
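Abstract (Chinese) As the amount of image data people can access grows rapidly, image retrieval has become an important research topic. Its recognition rate is still not high enough, however, because of the semantic gap between low-level features and high-level concepts. Automatic image annotation emerged in response, and its accuracy has been improved through continual refinement of feature extraction and classification methods. In recent years the BOW (Bag of Words) model, originally a text mining technique, has come into common use for image annotation. Many studies have improved on BOW; SPM (Spatial Pyramid Matching Bag Of Words), for example, adds spatial information to BOW. These methods rely on keypoints, the image features detected in the BOW pipeline, yet few studies address how the keypoints themselves are processed. The number of extracted keypoints is usually very large, which can consume considerable CPU time and may also degrade the trained model, lowering annotation quality. This study therefore uses a new unsupervised algorithm called IKS (Iterative Keypoint Selection) to select keypoints.
  The experiments use two datasets, Caltech101 and Caltech256, with IKS as the keypoint selection method. The study compares the image annotation performance of BOW and SPM with and without IKS selection, using SVM (Support Vector Machines) as the classifier for evaluation.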
Abstract (English) Image retrieval is the main technique for finding images in large image databases that are similar to a user's query. To let users issue keyword-based queries, automatic annotation of images with keywords has been studied extensively. In particular, BOW (Bag Of Words) and SPM (Spatial Pyramid Matching Bag Of Words) are two well-known methods for representing image content as feature descriptors. To extract BOW or SPM features, keypoints must first be detected in each image. However, the number of detected keypoints is usually very large, and some of them do little to describe the image content, for example keypoints on the background or keypoints that look alike across different classes. In addition, the computational cost of the vector quantization step depends heavily on the number of detected keypoints.
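  The vector quantization step mentioned above can be illustrated with a minimal sketch. It assumes a codebook of visual words already exists and that each keypoint is described by a numeric vector; the function names `nearest_word` and `bow_histogram` and the toy 2-D data are illustrative assumptions, not taken from the thesis.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def nearest_word(descriptor, codebook):
    """Return the index of the visual word closest to one descriptor."""
    return min(range(len(codebook)), key=lambda i: dist(descriptor, codebook[i]))


def bow_histogram(descriptors, codebook):
    """Quantize every descriptor and return a normalized word histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_word(d, codebook)] += 1
    total = sum(counts)
    # An image with no keypoints yields an all-zero histogram.
    return [c / total for c in counts] if total else counts


# Toy 2-D data (assumed); real keypoint descriptors are higher-dimensional.
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 9.5), (0.2, 0.1), (11.0, 10.0)]
histogram = bow_histogram(descriptors, codebook)
```

  Every descriptor is compared against every visual word, so the work grows with the number of keypoints times the codebook size, which is why reducing the keypoint count lowers the cost of this step.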
  Therefore, this thesis introduces a new algorithm called IKS (Iterative Keypoint Selection), which aims to select representative keypoints for generating the BOW and SPM features. IKS works by identifying a set of representative keypoints and then using a distance measure to select useful keypoints. Specifically, IKS has two variants, IKS1 and IKS2, which differ in how they identify the representative keypoints. IKS1 randomly selects a keypoint from an image as the representative keypoint, whereas IKS2 uses k-means to generate cluster centroids and takes the keypoints closest to those centroids as the representative keypoints.
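  The two ways of choosing representative keypoints can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the distance rule or the iteration, so `select_by_distance` and its `threshold` parameter are hypothetical, the iterative loop is omitted, and the centroids are assumed to come from a separate k-means run.

```python
import random
from math import dist  # Euclidean distance, available since Python 3.8


def representative_random(keypoints, rng):
    """IKS1-style choice: one keypoint drawn at random from the image."""
    return [rng.choice(keypoints)]


def representatives_near_centroids(keypoints, centroids):
    """IKS2-style choice: for each centroid, the keypoint closest to it."""
    return [min(keypoints, key=lambda k: dist(k, c)) for c in centroids]


def select_by_distance(keypoints, representatives, threshold):
    """Hypothetical rule (an assumption, not from the thesis):
    keep keypoints within `threshold` of their nearest representative."""
    return [
        k for k in keypoints
        if min(dist(k, r) for r in representatives) <= threshold
    ]


# Toy 2-D keypoints (assumed); the last one is a distant outlier.
keypoints = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0), (5.5, 5.0), (20.0, 20.0)]
centroids = [(0.2, 0.0), (5.2, 5.0)]  # assumed output of a prior k-means run

reps2 = representatives_near_centroids(keypoints, centroids)
kept2 = select_by_distance(keypoints, reps2, threshold=1.0)

reps1 = representative_random(keypoints, random.Random(0))
kept1 = select_by_distance(keypoints, reps1, threshold=1.0)
```

  With centroid-based representatives, the selection covers both dense groups of keypoints, while a single random representative covers only the neighborhood it happens to land in.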
  Experimental results on the Caltech101 and Caltech256 datasets show that keypoint selection with IKS1 or IKS2 allows the SVM classifier to achieve better classification accuracy than the baseline BOW and SPM without keypoint selection. IKS2 is more appropriate than IKS1 for image annotation, since it outperforms IKS1 on the larger dataset, Caltech256.
Keywords (Chinese) ★ keypoint selection
★ bag-of-words model
★ image annotation
★ image classification
Keywords (English) ★ keypoint selection
★ bag of words
★ image annotation
★ image classification
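Table of contents Abstract (Chinese)          i
Abstract                         ii
Acknowledgments                  iii
Contents                         iv
List of figures                  vi
List of tables                   viii
Chapter 1 Introduction                    1
 1-1 Research background                    1
 1-2 Research motivation                    2
 1-3 Research objectives                    4
 1-4 Research contributions                 5
 1-5 Thesis organization                    5
Chapter 2 Literature review                  6
 2-1 Image annotation                    6
 2-2 Bag-of-words model (Bag Of Words)             8
 2-3 SPM (Spatial pyramid matching Bag Of Words)     11
 2-4 Instance selection                   17
 2-5 Discussion of the literature           20
Chapter 3 Keypoint Selection          22
 3-1 Definition of Keypoint Selection               22
 3-2 Iterative Keypoint Selection (IKS)            24
  3-2-1 IKS1                    25
  3-2-2 IKS2                    28
Chapter 4 Experimental results                   34
 4-1 Experimental design                   34
  4-1-1 Datasets                   34
  4-1-2 Baseline                   34
   4-1-2-1 BOW features              34
   4-1-2-2 Keypoint Selection Method         36
  4-1-3 Classification method                 36
 4-2 Caltech101                   36
  4-2-1 Parameter settings                 37
   4-2-1-1 IKS1 parameters                37
   4-2-1-2 IKS2 parameters                38
  4-2-2 IB3 parameters                  39
  4-2-3 Number of visual words               40
  4-2-4 Comparison of results                  41
 4-3 Caltech256                   43
  4-3-1 Parameter settings                 43
   4-3-1-1 IKS1 parameters                44
   4-3-1-2 IKS2 parameters                45
  4-3-2 IB3 parameters                   47
  4-3-3 Number of visual words              47
  4-3-4 Comparison of results                 48
 4-4 Analysis and discussion                   51
Chapter 5 Conclusions and future research directions             60
 5-1 Summary                     60
 5-2 Future research directions                 60
References                      62
Appendix 1                       66
Appendix 2                       68
Appendix 3                       72
Appendix 4                       75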
Advisor 蔡志豐 (Chih-Fong Tsai)  Approval date 2012-7-2
