Master's Thesis 105423020 — Detailed Record




Name: Yi-Zhen Li (李懿真)   Department: Information Management
Thesis title: Developing a Method Integrating Visual Word Frequency and Textual Semantics for an Automatic Image Annotation System
(English title: Automatic Image Annotation Approach Using Visual Word Frequency and Semantic Information)
Related theses
★ A decision support system for building SMS rules to prevent credit card fraud
★ Comparing the effectiveness of different retrieval strategies
★ Factors influencing the knowledge-sharing process
★ Construction and evaluation of a retrieval agent system with sharing functions
★ A study of computer attitudes and learning self-efficacy among juvenile delinquents
★ Applying AHP analysis to software measurement issues
★ Optimizing an intrusion rule base
★ Improving the efficiency and quality of business information extraction
★ Using the Analytic Hierarchy Process to evaluate key factors in banks' adoption of Enterprise Application Integration (EAI) systems
★ Applying genetic algorithms to near-optimal layout of forced-convection devices in cluster computer rooms
★ The Development of a CASE Tool with Knowledge Management Functions
★ A fast search index tree based on the PAT tree
★ Building document concepts based on compound nouns
★ Using user-interest profiles to examine the importance of adjective position in review classification
★ Helping users filter web-document search results with semi-structured information and user feedback
★ Building a vector space model from feature-opinion pairs for user-review classification
Files: full text not publicly available (access permanently restricted)
Abstract (Chinese): In image search, general users of image search engines such as Google and Flickr rely mainly on Text-Based Image Retrieval (TBIR). A user enters keywords, and the query depends entirely on the descriptive text stored with each image in the database. In practice, however, image providers rarely add annotation tags describing image content, so images carry too little textual information and recall suffers. Research on automatic annotation developed to address this problem and reduce manual labeling work.
Today, in an era where artificial intelligence receives intense attention, endowing images with semantically meaningful information is a central focus of image research. This study therefore develops an automatic image annotation method that integrates visual words with textual semantics. It applies the popular Bag-of-Visual-Words model for image retrieval as the basis for feature extraction, and weights visual-word frequencies with TF-IDF to identify the visual words most important to each image. For the semantic part, a Word2Vec model computes the semantic concepts of words, and visual words are mapped to these concepts to select appropriate tag words. The study trains and experiments on outdoor street scenes from the multi-label LabelMe dataset, examines the feasibility of the proposed method, and measures the system's annotation performance with precision, recall, and the Fβ score.
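The TF-IDF weighting of visual-word frequencies described above can be sketched as follows. This is a minimal illustration, assuming each image has already been reduced to a bag of cluster indices (its visual words); the function name and data layout are illustrative, not the thesis's actual implementation:

```python
import math
from collections import Counter

def visual_word_tfidf(images):
    """Compute TF-IDF weights for visual words.

    `images` maps an image id to its list of visual-word ids (the
    cluster indices assigned to that image's local features).
    Returns {image_id: {visual_word: tfidf_weight}}.
    """
    n = len(images)
    # Document frequency: in how many images each visual word appears.
    df = Counter()
    for words in images.values():
        df.update(set(words))
    weights = {}
    for img, words in images.items():
        tf = Counter(words)
        total = len(words)
        weights[img] = {
            w: (c / total) * math.log(n / df[w])
            for w, c in tf.items()
        }
    return weights

# Toy example: three "images", each a bag of visual-word ids.
imgs = {
    "a": [0, 0, 1, 2],
    "b": [0, 1, 1, 1],
    "c": [2, 2, 3, 3],
}
w = visual_word_tfidf(imgs)
# Word 3 appears only in image "c", so its IDF boost makes it
# the most important visual word for that image.
```

Words that occur in every image get an IDF of zero, which matches the intuition that ubiquitous visual words carry little discriminative information.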
Abstract (English): In common image-search scenarios, the image search engines most people use, such as Google Images and Flickr, are built on Text-Based Image Retrieval (TBIR) techniques. Because they match the keywords a user provides, TBIR techniques rely heavily on the descriptive tags attached to images in the database. In practice, however, image uploaders seldom provide detailed tags or context descriptions, which makes it hard for TBIR to identify the correct images. Automatic Image Annotation aims to solve this problem by improving on manual tag construction.
As massive image collections become available in the digital era, effective image retrieval and management has become a popular research topic in the IT field. We propose an automatic image annotation approach that integrates visual words and semantic words. Using the popular Bag-of-Visual-Words method to extract image features, combined with TF-IDF to weight visual-word frequencies, we identify the most representative visual words for each image. Furthermore, we apply a Word2Vec model to conceptualize word meanings and generate image tags with appropriate semantics. In this study, we train and experiment on the multi-label outdoor image dataset LabelMe and discuss the practicability and effectiveness of this approach via precision, recall, and the F1-measure.
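The Word2Vec step maps tag words into a vector space and compares them by similarity. Below is a minimal sketch of cosine-similarity tag ranking, using toy hand-made vectors in place of trained Word2Vec embeddings; the function names and the mean-similarity scoring rule are illustrative assumptions, not the thesis's code:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def score_tags(candidate_tags, context_tags, embeddings):
    """Rank candidate tags by their mean cosine similarity to the
    tags already associated with an image's visual context."""
    scores = {}
    for tag in candidate_tags:
        sims = [cosine(embeddings[tag], embeddings[c]) for c in context_tags]
        scores[tag] = sum(sims) / len(sims)
    return sorted(scores, key=scores.get, reverse=True)

# Toy 3-d "embeddings" standing in for trained Word2Vec vectors.
emb = {
    "car":  [0.9, 0.1, 0.0],
    "road": [0.8, 0.2, 0.1],
    "tree": [0.1, 0.9, 0.2],
}
ranked = score_tags(["car", "tree"], ["road"], emb)
# "car" ranks above "tree" because its vector lies closer to "road".
```

In practice the embeddings would come from a Word2Vec model trained on a large corpus (the thesis cites a Wikipedia dump), so that semantically related tags such as "car" and "road" end up near each other in the vector space.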
Keywords (Chinese) ★ 自動圖像註解 (automatic image annotation)
★ 視覺詞 (visual words)
★ TF-IDF
★ Word2Vec
★ 多標籤圖像 (multi-label images)
Keywords (English) ★ Automatic Image Annotation
★ visual word
★ TF-IDF
★ Word2Vec
★ Multi-label images
Table of contents
Abstract (Chinese) i
Abstract (English) ii
Acknowledgements iii
Table of contents iv
List of figures vi
List of tables vii
1. Introduction 1
1-1 Research background 1
1-2 Research motivation 2
1-3 Research objectives 2
1-4 Research scope and limitations 2
1-4-1 Research scope 2
1-4-2 Research limitations 3
1-5 Thesis organization 3
2. Literature review 4
2-1 Image retrieval 4
2-1-1 Text-based image retrieval 4
2-1-2 Content-based image retrieval 4
2-2 Automatic image annotation 5
2-3 The Word2Vec model 7
2-4 Visual object representation with local features 8
2-5 Bag-of-visual-words model 11
3. Research method 13
3-1 System architecture 13
3-2 Image feature analysis 14
3-2-1 Feature extraction 14
3-2-2 Feature clustering 15
3-2-3 Computing visual-word TF-IDF 16
3-3 Tag semantic analysis 17
3-3-1 Building image collections 17
3-3-2 Tag-word preprocessing 19
3-3-3 Computing word scores 19
3-4 Testing phase 21
4. Experiments and results 22
4-1 Experimental environment 22
4-2 Experimental dataset 22
4-3 Annotation evaluation metrics 24
4-4 Experimental design and results 25
4-4-1 Precision and recall of the proposed method under different numbers of tags 27
4-4-2 Comparison with the baseline by average precision (AP) and average recall (AR) 28
4-4-3 Comparison with the baseline by Fβ (β=1) 30
4-4-4 Comparison with the baseline by Fβ (β=0.5) 31
4-4-5 Comparison with the baseline by Fβ (β=1.5) 32
4-5 Discussion of experimental results 33
4-6 Annotation results 34
5. Conclusions and future work 35
5-1 Conclusions and contributions 35
5-2 Future research directions 36
References 37
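Sections 4-4-3 through 4-4-5 evaluate the method against the baseline with the Fβ score at β = 1, 0.5, and 1.5. A minimal sketch of per-image precision, recall, and Fβ for predicted tag sets; the helper names are illustrative, not taken from the thesis:

```python
def f_beta(precision, recall, beta):
    """F-beta score: beta < 1 favours precision, beta > 1 favours recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def annotation_scores(predicted, truth):
    """Per-image precision/recall of a predicted tag set vs. ground truth."""
    tp = len(set(predicted) & set(truth))
    p = tp / len(predicted) if predicted else 0.0
    r = tp / len(truth) if truth else 0.0
    return p, r

# Two of three predicted tags are correct; two of four true tags found.
p, r = annotation_scores(["sky", "tree", "car"],
                         ["sky", "tree", "building", "road"])
# p = 2/3, r = 1/2; with p > r, the beta = 0.5 score (precision-weighted)
# exceeds the beta = 1.5 score (recall-weighted).
```

Sweeping β around 1 shows whether a method's advantage comes from precise tags or from broader coverage, which is presumably why the experiments report all three settings.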
References
[1]. Magesh, N., and Thangaraj, P. 2011. "Semantic Image Retrieval Based on Ontology and Sparql Query," International Conference on Advanced Computer Technology (ICACT).
[2]. Shen, H. T., Ooi, B. C., and Tan, K.-L. 2000. "Giving Meanings to Www Images," Proceedings of the eighth ACM international conference on Multimedia: ACM, pp. 39-47.
[3]. Srihari, R. K., Zhang, Z., and Rao, A. 2000. "Intelligent Indexing and Semantic Retrieval of Multimodal Documents," Information Retrieval (2:2-3), pp. 245-275.
[4]. Lai, H., Yan, P., Shu, X., Wei, Y., and Yan, S. 2016. "Instance-Aware Hashing for Multi-Label Image Retrieval," IEEE Transactions on Image Processing (25:6), pp. 2469-2479.
[5]. Li, J., and Wang, J. Z. 2008. "Real-Time Computerized Annotation of Pictures," IEEE transactions on pattern analysis and machine intelligence (30:6), pp. 985-1002.
[6]. Datta, R., Joshi, D., Li, J., and Wang, J. Z. 2008. "Image Retrieval: Ideas, Influences, and Trends of the New Age," ACM Computing Surveys (Csur) (40:2), p. 5.
[7]. 廖瑋星. 2010. "Image Semantic Expansion Based on Image Content Extraction and Its Application to Video Advertisement Insertion," Department of Computer Science and Information Engineering, National Taiwan University, Taipei, p. 59.
[8]. Feng, L., and Bhanu, B. 2016. "Semantic Concept Co-Occurrence Patterns for Image Annotation and Retrieval," IEEE Transactions on Pattern Analysis and Machine Intelligence (38:4), pp. 785-799.
[9]. Karpathy, A., and Fei-Fei, L. 2017. "Deep Visual-Semantic Alignments for Generating Image Descriptions," IEEE Transactions on Pattern Analysis and Machine Intelligence (39:4), pp. 664-676.
[10]. Farhadi, A., Hejrati, M., Sadeghi, M. A., Young, P., Rashtchian, C., Hockenmaier, J., and Forsyth, D. 2010. "Every Picture Tells a Story: Generating Sentences from Images," European conference on computer vision: Springer, pp. 15-29.
[11]. Gurjar, S. P. S., Gupta, S., and Srivastava, R. "Automatic Image Annotation Model Using LSTM Approach."
[12]. Cheng, Q., Zhang, Q., Fu, P., Tu, C., and Li, S. 2018. "A Survey and Analysis on Automatic Image Annotation," Pattern Recognition (79), pp. 242-259.
[13]. Duygulu, P., Barnard, K., de Freitas, J. F., and Forsyth, D. A. 2002. "Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary," European conference on computer vision: Springer, pp. 97-112.
[14]. Makadia, A., Pavlovic, V., and Kumar, S. 2008. "A New Baseline for Image Annotation," European conference on computer vision: Springer, pp. 316-329.
[15]. Ciocca, G., Cusano, C., Santini, S., and Schettini, R. 2011. "Halfway through the Semantic Gap: Prosemantic Features for Image Retrieval," Information Sciences (181:22), pp. 4943-4958.
[16]. Chang, E., Goh, K., Sychay, G., and Wu, G. 2003. "Cbsa: Content-Based Soft Annotation for Multimodal Image Retrieval Using Bayes Point Machines," IEEE Transactions on Circuits and Systems for Video Technology (13:1), pp. 26-38.
[17]. Grangier, D., and Bengio, S. 2008. "A Discriminative Kernel-Based Model to Rank Images from Text Queries," IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) (10:LIDIAP-ARTICLE-2008-010).
[18]. Mikolov, T., Chen, K., Corrado, G., and Dean, J. 2013. "Efficient Estimation of Word Representations in Vector Space," arXiv preprint arXiv:1301.3781.
[19]. Handler, A. 2014. "An Empirical Study of Semantic Similarity in WordNet and Word2Vec."
[20]. Wikipedia. 2018. "Wikipedia: Database Download." Retrieved June 20, 2018, from https://dumps.wikimedia.org/enwiki/
[21]. 王?杰. 2010. "Visual Object Representation Based on Salient Local Features" (基于显著局部特征的视觉物体表示方法). Beijing Institute of Technology.
[22]. Yuan, X., Yu, J., Qin, Z., and Wan, T. 2011. "A Sift-Lbp Image Retrieval Model Based on Bag of Features," IEEE international conference on image processing.
[23]. Lowe, D. G. 2004. "Distinctive Image Features from Scale-Invariant Keypoints," International journal of computer vision (60:2), pp. 91-110.
[24]. 林政威. 2015. "Image Search via Salient Objects and Matching of Bags of SIFT Features," Department of Information Management, National Central University, pp. 1-63.
[25]. Lv, H., Huang, X., Yang, L., Liu, T., and Wang, P. 2013. "A K-Means Clustering Algorithm Based on the Distribution of Sift," 2013 IEEE Third International Conference on Information Science and Technology (ICIST), pp. 1301-1304.
[26]. Yang, Z., and Kurita, T. 2013. "Improvements to the Descriptor of Sift by Bof Approaches," 2013 2nd IAPR Asian Conference on Pattern Recognition, pp. 95-99.
[27]. MacQueen, J. 1967. "Some Methods for Classification and Analysis of Multivariate Observations," Proceedings of the fifth Berkeley symposium on mathematical statistics and probability: Oakland, CA, USA, pp. 281-297.
[28]. Salton, G., and McGill, M. J. 1986. Introduction to Modern Information Retrieval. McGraw-Hill, Inc.
[29]. Moulin, C., Barat, C., and Ducottet, C. 2010. "Fusion of Tf. Idf Weighted Bag of Visual Features for Image Classification," Content-Based Multimedia Indexing (CBMI), 2010 International Workshop on: IEEE, pp. 1-6.
[30]. Russell, B. C., Torralba, A., Murphy, K. P., and Freeman, W. T. 2008. "Labelme: A Database and Web-Based Tool for Image Annotation," International journal of computer vision (77:1-3), pp. 157-173.
[31]. 林栗岑. 2017. "Applying Semantic Clustering of Words on Multiple Documents Summarization Method." National Central University.
[32]. 楊瑞敏. 2010. "A Multi-Document Summarization System Based on the Mutual Reinforcement Principle," Institute of Multimedia Engineering, National Chiao Tung University, Hsinchu, p. 50.
[33]. Frey, B. J., and Dueck, D. 2007. "Clustering by Passing Messages between Data Points," science (315:5814), pp. 972-976.
[34]. Google. "Google Cloud Platform." Retrieved June 20, 2018, from https://cloud.google.com/
[35]. Microsoft. "Microsoft Azure: Computer Vision." Retrieved June 20, 2018, from https://azure.microsoft.com/zh-tw/services/cognitive-services/computer-vision/
Advisor: Shih-Chieh Chou (周世傑)   Date of approval: 2018-07-30

For questions about this thesis, contact the Promotion Services Division, National Central University Library, tel. (03)422-7151 ext. 57407.