A Video Summarization System Based on the Self-Organizing Feature Map Network


Name

Chien-Hang Chen (陳劍航)

Department

Department of Computer Science and Information Engineering

Thesis title

A SOM-based Approach to Video Summarization System
(以自我組織特徵映射圖網路為基礎之影像摘要系統)



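This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China: do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid breaking the law.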

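Abstract (Chinese)

As computers become faster, storage devices larger, networking technology more advanced, and audio and video compression formats more numerous, video is becoming ever more common in daily life. How to manage video databases effectively has therefore become an interesting topic worth investigating. Traditional management methods for text are not suited to video data, and an effective video database needs two functions: (1) effective video summarization, which helps users grasp video content quickly; and (2) effective video retrieval, which helps users quickly find the content they want in a large video database. A video summarization system focuses on extracting the semantic structure of a video, so that the result is closer to how people perceive the content.
This thesis focuses on video summarization. We propose a scene-indexed video summarization system that summarizes a video by its semantically distinct scenes, helping users understand the content quickly and making search more convenient.
A typical video summarization system comprises the following steps: shot detection, key-frame extraction, shot grouping, and finally scene detection. Each step depends on the one before it, so a poor result at one step can degrade the next.
The distinguishing feature of this thesis is its use of the self-organizing feature map network (SOM). Because the SOM preserves the topology of the data features on its map, shots with similar features are mapped to nearby regions. A region growing algorithm then merges similar shots into groups, and a scene detection algorithm analyzes the groups and merges those with the same semantics into scenes. Finally, users can explore the video through the resulting scene map or through a hierarchical tree representation.
In the experiments, we tested videos of various types and compared the system's results with those produced by different human testers.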
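
The topology-preserving mapping described above can be illustrated with a minimal SOM training sketch. This is not the thesis's implementation: the function names (`train_som`, `best_matching_unit`, `sq_dist`), the random initialization, the linearly decaying learning rate and neighborhood width, and the three-value shot features in the usage lines are all assumptions made for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def best_matching_unit(weights, x):
    """Return (row, col) of the map node whose weight vector is closest to x."""
    best = None
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            d = sq_dist(w, x)
            if best is None or d < best[0]:
                best = (d, r, c)
    return best[1], best[2]


def train_som(samples, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM and return weights[row][col] (hypothetical helper)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    # Assumption: nodes start as random vectors in [0, 1)^dim.
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood shrinks over time
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    # Gaussian neighborhood centered on the winning node.
                    h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for k in range(dim):
                        w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Usage with made-up shot features (e.g. a coarse three-bin color summary per shot).
shot_features = [[0.9, 0.1, 0.1], [0.8, 0.2, 0.1], [0.1, 0.1, 0.9], [0.2, 0.1, 0.8]]
som = train_som(shot_features, rows=4, cols=4)
positions = [best_matching_unit(som, f) for f in shot_features]
```

After training, each shot is represented by the grid position of its best matching node, and it is these positions that a later grouping step would operate on.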

Abstract (English)

Owing to rapid advances in electronic hardware and networking technologies and the falling cost of storage, video data are becoming available at an ever-increasing rate. Traditional database management techniques for text documents cannot deal effectively with video data; methods for automatically analyzing video data have therefore become an attractive and challenging research topic. An efficient video database management system should have two functions: 1) video summarization, which makes the task of browsing video content easy, and 2) video retrieval, which retrieves videos from a huge video database based on user queries. This thesis focuses on the development of a video summarization technique. The traditional way to browse video data is to use the “fast forward” and “rewind” function keys to locate the region of interest manually, which is very time-consuming. The goal of the video summarization technique proposed in this thesis is to provide an effective table of contents that captures the semantic structure of a video document. Several different approaches to video summarization have been proposed, each with its own advantages and limitations.
The proposed video summarization technique involves four steps: 1) shot detection, 2) key-frame extraction, 3) shot grouping, and 4) scene detection. Its most appealing property is the use of the self-organizing feature map (SOM). Because the SOM is topology-preserving, shots with similar features are mapped to nearby locations on the map. A region growing technique is then employed to merge similar shots into groups. After the group map has been constructed, a scene detection technique merges groups with a similar semantic concept into a scene. The constructed scene map can either be used directly as the table of contents of a video document or be transformed into a hierarchical tree that represents its content. Via the scene map or the hierarchical tree, a user can browse the content effectively. The performance of the proposed technique is demonstrated by experiments on several different types of video documents.
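
The region growing step mentioned in the abstract can be sketched as a flood fill over the map grid. The abstract does not give the thesis's actual merging criterion, so the rule used here (join a 4-connected neighbor when the Euclidean distance between node weight vectors is at most `threshold`) and the names `region_grow` and `euclidean` are assumptions.

```python
import math
from collections import deque


def euclidean(a, b):
    """Euclidean distance between two weight vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def region_grow(weights, threshold):
    """Label map nodes so that similar 4-connected neighbors share a group id.

    weights[r][c] is the weight vector of node (r, c). The similarity rule is
    an illustrative assumption, not the criterion used in the thesis.
    """
    rows, cols = len(weights), len(weights[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for sr in range(rows):
        for sc in range(cols):
            if labels[sr][sc] != -1:
                continue  # node already belongs to a group
            labels[sr][sc] = next_label
            queue = deque([(sr, sc)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if labels[nr][nc] != -1:
                        continue
                    # Grow the region only across sufficiently similar nodes.
                    if euclidean(weights[r][c], weights[nr][nc]) <= threshold:
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
            next_label += 1
    return labels


# Toy 2x3 map: the two left columns are similar, the right column is not.
toy_map = [[[0.0], [0.1], [0.9]],
           [[0.0], [0.1], [0.9]]]
groups = region_grow(toy_map, threshold=0.2)
```

Shots whose best matching nodes receive the same label would then form one group, which is the input a scene detection step would merge further.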

Keywords (Chinese)

★ Video summarization
★ Video browsing
★ Self-organizing feature map network
★ Shot detection
★ Semantic analysis

Keywords (English)

★ video summarization
★ video browsing
★ SOM
★ shot detection
★ semantic analysis

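Table of contents

Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Motivation
1.2 Related Work
1.3 Thesis Organization
Chapter 2 Review of Related Techniques
2.1 Shot Transition Detection
2.1.1 Introduction to Shots
2.1.2 Detection Methods
2.2 Key-Frame Selection
2.3 Shot Grouping
2.4 Scene Detection
Chapter 3 Methods and Procedures
3.1 System Architecture
3.2 Shot Detection
3.3 Feature Extraction
3.4 SOM-Based Shot Grouping
3.4.1 Definition of the Shot Distance Function
3.4.2 Introduction to the Self-Organizing Feature Map (SOM) Algorithm
3.4.3 SOM Initialization
3.4.4 Post-Training Processing
3.4.5 Introduction to Region Growing
3.4.6 Merging Shots by Region Growing
3.4.7 Refining the Shot Grouping Results
3.5 Scene Detection
Chapter 4 Experimental Results and Analysis
4.1 Evaluation Method and Test Data
4.2 Experimental Results and Analysis
4.3 Discussion of Experimental Results
Chapter 5 Conclusions and Future Work
5.1 Conclusions
5.2 Future Work
References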

Advisor

Mu-Chun Su (蘇木春)

Approval date

2006-7-22
