Master's/Doctoral Thesis 90542010: Detailed Record

Author: Chih-Wen Su (蘇志文)    Department: Computer Science and Information Engineering
Thesis Title: Content-based Video Retrieval Techniques for MPEG Video
(以視訊內容為基礎應用於MPEG影片之視訊檢索技術)
Related Theses
★ Real-time online identity verification using visemes and voice biometrics
★ An image-based SMD carrier-tape alignment system
★ Detection of content forgery and recovery of deleted data on handheld mobile devices
★ License plate verification based on the SIFT algorithm
★ Local pattern features based on dynamic linear decision functions for face recognition
★ A GPU-based SAR database simulator: a parallel architecture for SAR echo signal and image databases
★ Personal identity verification using palmprints
★ Video indexing using color statistics and camera motion
★ Form document classification using field clustering features and four-direction adjacency trees
★ Stroke features for offline Chinese character recognition
★ Motion vector estimation using adaptive block matching combined with multi-image information
★ Color image analysis with applications to color-quantized image retrieval and face detection
★ Extraction and recognition of logos on Chinese and English business cards
★ Chinese signature verification using virtual-stroke features
★ Face detection, pose classification, and recognition based on triangle geometry and color features
★ A complementary skin-color-based face detection strategy
  1. The author has agreed to make the electronic full text of this thesis openly accessible immediately.
  2. The open-access electronic full text is licensed to users only for personal, non-profit retrieval, reading, and printing for academic research purposes.
  3. Please comply with the Copyright Act of the Republic of China (Taiwan); do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese) With the arrival of the digital age, audio, video, and other kinds of information can be stored more efficiently and conveniently than ever, but this has also led to an overwhelming accumulation of audiovisual data. Manually annotating large volumes of video demands enormous effort, and it is difficult to give every segment of a video an objective, detailed textual description, so traditional text-based queries cannot satisfy the need to search video by content. To locate specific segments in a large collection of digital videos by their visual content, a computer must first perform fast, automatic shot change detection, extract visual features from each detected shot, attach objective annotations to those features automatically or semi-automatically, and organize the results in a database. When a user wants to query a particular kind of video content, the computer can then match the query against this pre-built database quickly and effectively, achieving true content-based video querying and browsing.
In view of this, the thesis proposes two content-based video retrieval techniques for MPEG video. First, we study the dissolve, the most frequently used gradual transition and the one that existing techniques find hardest to detect correctly. Using the luminance change at each pixel position, we compute the percentage of pixels whose intensities vary linearly along the time axis and compare it with a threshold estimated from the cumulative binomial distribution, so that dissolve-type gradual transitions can be identified clearly. By resampling the frame sequence, the detection accuracy is made independent of the duration of the dissolve. The same principle can also be applied to detect fade-in and fade-out transitions. Both the theoretical analysis and the experimental results confirm that our method is not only fast and effective but also achieves a lower error rate than comparable approaches, substantially reducing the false detections caused by object and camera motion.
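As an illustration only, the following Python sketch shows one way the per-pixel linearity test and the binomial threshold described above could be realized. It is not the thesis implementation: the second-difference linearity test, the tolerance tol, the chance probability p_chance, and the resample_to_fixed_length helper in the usage comment are all assumptions.

```python
import numpy as np
from scipy.stats import binom

def dissolve_score(frames, tol=2.0):
    # frames: (T, H, W) grayscale window, resampled to a fixed length T.
    # During an ideal dissolve each pixel's intensity varies linearly in
    # time, so its second-order temporal difference stays near zero; the
    # score is the fraction of pixels that pass this test.
    second_diff = np.diff(frames.astype(np.float32), n=2, axis=0)  # (T-2, H, W)
    linear = np.all(np.abs(second_diff) <= tol, axis=0)            # (H, W) bool
    return float(linear.mean())

def binomial_threshold(num_pixels, p_chance=0.5, alpha=1e-3):
    # Threshold on the fraction of "linear" pixels, taken from the cumulative
    # binomial distribution: the smallest fraction that a non-dissolve window,
    # in which each pixel passes the test independently with probability
    # p_chance, would exceed with probability less than alpha.
    k = binom.ppf(1.0 - alpha, num_pixels, p_chance)
    return float(k) / num_pixels

# Hypothetical usage on a candidate window of grayscale frames:
# window = resample_to_fixed_length(gray_frames[t0:t1], length=16)
# if dissolve_score(window) > binomial_threshold(window[0].size):
#     print("dissolve-type transition detected")
```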
Second, targeting MPEG video, the most widely used format, we analyze the motion vectors embedded in the bitstream to automatically generate trajectories of moving objects. This gives the method the following advantages: (1) it reuses the motion vectors already present in the stream, so annotation is fast; (2) it also works when a video contains multiple moving objects; (3) it is not restricted to videos shot with a stationary camera. In addition, we develop a fast trajectory matching strategy: with a coordinate representation different from previous work, most dissimilar trajectories can be rejected after comparing only a few control points, which greatly speeds up retrieval.
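The sketch below illustrates, under simplifying assumptions, how per-macroblock motion vectors might be linked across frames into motion flows and how a few control points can reject dissimilar trajectories early. The motion-vector layout, the thresholds, and the omission of real MPEG parsing and camera-motion compensation are all assumptions made for illustration; this is not the thesis implementation.

```python
import numpy as np

BLOCK = 16  # macroblock size in pixels (assumed)

def link_motion_flows(mv_frames, min_length=5):
    # mv_frames: list of dicts {(bx, by): (dx, dy)} giving one motion vector
    # per macroblock for each frame, assumed already compensated for camera
    # motion. A flow is grown by following each vector into the macroblock it
    # points at in the next frame. (Duplicate suppression across starting
    # frames is omitted for brevity.)
    flows = []
    for start in range(len(mv_frames)):
        for (bx, by) in mv_frames[start]:
            x, y = bx * BLOCK, by * BLOCK
            traj = [(x, y)]
            for t in range(start, len(mv_frames)):
                key = (int(x // BLOCK), int(y // BLOCK))
                if key not in mv_frames[t]:
                    break
                dx, dy = mv_frames[t][key]
                x, y = x + dx, y + dy
                traj.append((x, y))
            if len(traj) >= min_length:
                flows.append(np.asarray(traj, dtype=np.float32))
    return flows

def coarse_match(query, candidate, n_control=4, reject_dist=32.0):
    # Early rejection: compare only a few evenly spaced control points after
    # translating both trajectories to a common origin; most dissimilar
    # trajectories are discarded before any finer comparison is attempted.
    def controls(traj):
        idx = np.linspace(0, len(traj) - 1, n_control).astype(int)
        return traj[idx] - traj[0]
    return float(np.abs(controls(query) - controls(candidate)).max()) <= reject_dist
```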
Abstract (English) Gradual shot change detection is one of the most important research issues in the field of video indexing and retrieval. Among the numerous types of gradual transitions, the dissolve is considered the most common, but it is also the most difficult to detect. In most existing dissolve detection algorithms, false and missed detections caused by motion are a serious problem. In this thesis, we present a novel dissolve-type transition detection algorithm that can correctly distinguish dissolves from disturbances caused by motion. We carefully model a dissolve based on its nature and then use the model to filter out confusion caused by the effect of motion.
Furthermore, we propose the use of motion vectors embedded in MPEG bitstreams to generate so-called "motion flows", which are then used for fast video retrieval. By using the motion vectors directly, we do not need to consider the shape of a moving object or its corresponding trajectory. Instead, we simply "link" the local motion vectors across consecutive video frames to form motion flows, which are then annotated and stored in a video database. In the retrieval phase, we propose a new matching strategy to execute the retrieval task, and motions that do not belong to the mainstream motion flows are filtered out by the proposed algorithm. The retrieval process can be triggered by query-by-sketch (QBS) or query-by-example (QBE). The experimental results show that our method is both efficient and accurate in the video retrieval process.
Keywords ★ shot change detection
★ video retrieval
Table of Contents
1. Introduction 1
1.1 Motivation 2
1.2 Overview of CBIR 2
1.3 Overview of CBVR 3
1.3.1 Shot Change Detection 3
1.3.2 Features for CBVR 5
1.3.3 QBE Versus QBS 5
1.4 Organization of the Thesis 6
2. Background 7
2.1 MPEG Standards 8
2.1.1 Intra-Frame Coding 9
2.1.2 Inter-Frame Coding 12
2.2 Content-based Video Retrieval Techniques 14
2.2.1 Strategies of Shot Boundary Detection 15
2.2.2 Categories of Visual Features for CBVR 19
2.2.3 State-of-the-art 22
2.3 Concluding Remarks 27
3. A Motion-Tolerant Dissolve Detection Algorithm 28
3.1 Introduction 29
3.2 Modeling a Dissolve Transition 36
3.3 Threshold Selection 40
3.4 Discussion of False and Misdetection of Dissolve 46
3.4.1 Misdetection Caused by a Long Dissolve Duration 46
3.4.2 Color Shading 48
3.4.3 Illumination Problem 51
3.5 Experimental Results 51
3.6 Concluding Remarks 55
4. Motion Flow-based Video Retrieval 57
4.1 Introduction 58
4.2 Constructing Motion Flows from MPEG Bitstreams 62
4.2.1 Shot Change Detection 62
4.2.2 Camera Motion Estimation 63
4.2.3 Generating Motion Flow 67
4.3 Coarse-to-fine Trajectory Comparison 75
4.4 Experimental Results 81
4.5 Concluding Remarks 89
5. Conclusion and Future Work 91
References 93
Advisors: Hong-Yuan Mark Liao (廖弘源), Kuo-Chin Fan (范國清)
Date of Approval: 2006-07-17