A TV News Analysis Scheme Based on Text and Anchorperson Identification

Name

卓晉億 (Chin-yi Chou)

Department

Department of Computer Science and Information Engineering

Thesis title

A TV News Analysis Scheme Based on Text and Anchorperson Identification
(original title: 基於文字與主播偵測之新聞視訊分析系統)

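Related theses

★ Implementation of a Qt-Based Cross-Platform Wireless Heart Rate Analysis System
★ An Additional Message Transmission Mechanism for VoIP
★ Detection of Transition Effects Associated with Sports Game Highlights
★ Video/Image Content Authentication Based on Vector Quantization
★ A Baseball Game Highlight Extraction System Based on Transition Effect Detection and Content Analysis
★ Image and Video Content Authentication Based on Visual Feature Extraction
★ Detecting and Tracking Foreground Objects in Moving Surveillance Video Using Dynamic Background Compensation
★ Adaptive Digital Watermarking for H.264/AVC Video Content Authentication
★ A Baseball Game Highlight Extraction and Classification System
★ A Real-Time Multi-Camera Tracking System Using H.264/AVC Features
★ A Highway Preceding-Vehicle Detection Mechanism Using the Implicit Shape Model
★ A Video Copy Detection Mechanism Based on Temporal and Spatial Feature Extraction
★ Vehicular Video Coding Combining Digital Watermarking and Region-of-Interest Bit-Rate Control
★ An H.264/AVC Video Encryption/Decryption and Digital Watermarking Mechanism for Digital Rights Management
★ An H.264/AVC Video Content Authentication Mechanism Based on Digital Watermarking
★ An Adaptive Vehicular Surveillance Video Analysis System Using the Implicit Shape Model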

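This electronic thesis is authorized for immediate open access.
The open-access electronic full text is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.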

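Abstract (Chinese)

As digital technology matures, digitization and steadily improving compression techniques allow large volumes of audiovisual content to be widely distributed and permanently preserved. Users today can obtain large amounts of multimedia content through many channels, but searching such large collections by hand, or annotating them manually for classification, is very time-consuming. Techniques and tools that help users search and extract multimedia information efficiently have therefore become an important research topic.
This study proposes tools that assist content extraction and classification for news video. Text is one of the most important features of news video content: a few words can annotate a news story precisely, so recognizing on-screen text effectively helps in understanding the content. On Taiwan's news channels, however, on-screen text includes news captions, weather forecasts, stock market quotes, and scrolling tickers. The content is complex, and typefaces, fonts, and sizes are inconsistent. Current text recognition software can only recognize a small number of fonts it has been trained on and does not work on the text of most Taiwanese news channels. Extracting the regions useful for analysis from complex news frames thus becomes the problem to be solved. In addition, commercials interspersed in news broadcasts interfere with content analysis, so they must be removed effectively. This study detects and extracts representative text regions, applies related processing, and proposes solutions to the problems above.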

Abstract (English)

With the proliferation of multimedia data, demand for effective and efficient video retrieval is growing. Among the various kinds of digital video, TV news plays an important role in broadcasting and serves as a major source of daily information. In Taiwan, there are several TV news stations, and the same news videos are repeated again and again; watching them can be a waste of time. Considering that digital recording facilities are now widely available, we propose a classification scheme that clusters recorded TV news video segments so that viewers can choose to watch related archived news and even retrieve useful information from it.
In the proposed scheme, we use the text in TV news to cluster videos. Text analysis in Taiwan's TV news needs further processing because the text areas may contain various kinds of information, including captions, weather reports, and stock market indices. It is challenging to locate the area we are really interested in. Furthermore, video OCR is not mature enough and does not work well on Taiwan's TV news broadcasts because each channel uses its own special fonts. We apply low-level feature extraction and an SVM to locate the possible region of interest, which should help to differentiate news segments from commercials. The anchorperson scene is then located to divide a news story into two parts: one in which the anchorperson describes the news, and one showing the news content itself. Next, we extract the caption in the second part, where the text is more stable and representative. After refining the extracted text areas, a cross-correlation process finds similar patterns in the captions of video segments to relate them to each other. Experimental results are shown to demonstrate the feasibility of this approach.
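
The caption-matching step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each caption has already been extracted and binarized into an equal-size 2-D grid of 0/1 pixels, and it uses zero-mean normalized cross-correlation with a greedy grouping rule. The function names, the grouping strategy, and the `threshold` value are hypothetical choices for illustration and are not taken from the thesis.

```python
from math import sqrt


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-size 2-D grids.

    Returns a value in [-1, 1]; values near 1 mean the patterns match.
    """
    flat_a = [float(v) for row in a for v in row]
    flat_b = [float(v) for row in b for v in row]
    if len(flat_a) != len(flat_b) or not flat_a:
        raise ValueError("caption patches must be non-empty and the same size")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    den = sqrt(sum((x - mean_a) ** 2 for x in flat_a)
               * sum((y - mean_b) ** 2 for y in flat_b))
    # A constant patch has no pattern to correlate, so report no similarity.
    return num / den if den > 0 else 0.0


def group_by_caption(captions, threshold=0.8):
    """Greedily group segment indices whose captions correlate strongly.

    Each caption is compared with the first member of every existing group.
    `threshold` is an assumed value, not one reported in the thesis.
    """
    groups = []
    for idx, caption in enumerate(captions):
        for group in groups:
            reference = captions[group[0]]
            if normalized_cross_correlation(reference, caption) >= threshold:
                group.append(idx)
                break
        else:
            groups.append([idx])
    return groups


if __name__ == "__main__":
    caption_a = [[0, 1, 1, 0], [0, 1, 1, 0]]
    caption_b = [[1, 0, 0, 1], [1, 0, 0, 1]]
    print(group_by_caption([caption_a, caption_b, caption_a]))
```

Correlating pixel patterns rather than recognized characters matches the abstract's point that video OCR is unreliable on these channels. Real captions would also need alignment and size normalization before such a comparison, which this sketch leaves out.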

Keywords (Chinese)

★ commercial
★ text detection
★ SVM
★ TV news
★ digital TV

Keywords (English)

★ commercial
★ text detection
★ SVM
★ TV news
★ Digital videos

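Table of contents

Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 News Video Analysis Methods
1.3 Thesis Organization
Chapter 2 Background
2.1 Face Detection
2.1.1 Integral Images
2.1.2 Haar-Like Features
2.1.3 AdaBoost
2.1.4 Cascade
2.2 Sobel
2.3 Otsu Algorithm
2.4 SVM (Support Vector Machines)
2.5 Morphological Dilation and Erosion
2.6 Connected Component
2.7 Wiener Filter
2.8 Cross Correlation
Chapter 3 Related Work
3.1 Video Text Detection
3.2 Commercial Detection
3.3 News Classification
Chapter 4 Proposed Method
4.1 System Overview
4.2 Anchorperson Identification
4.2.1 Skin Color Detection
4.2.2 Anchorperson Identification Training
4.2.3 Anchorperson Identification
4.3 Caption Detection
4.3.1 SVM Feature Extraction
4.3.2 SVM Training Procedure
4.3.3 SVM Classification Procedure
4.3.4 Caption Text Analysis
4.3.5 Caption Position Detection
4.4 News Segment Detection
4.4.1 News Analysis
4.4.2 News Segmentation
4.5 News Classification
4.5.1 News Caption Extraction
4.5.2 News Classification
Chapter 5 Experimental Results and Discussion
5.1 SVM
5.1.1 Edge Energy Intensity
5.1.2 Caption Region Detection
5.2 News Segment Detection
5.3 News Classification
Chapter 6 Conclusion and Future Work
References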

Advisor

蘇柏齊 (Po-chyi Su)

Approval date

2009-7-24
