Detecting and Reconstructing Marquee Texts in Natural Scenes Based on Hand-Held Devices


Name

廖文晧 (Wen-Hao Liao)

Department

Department of Computer Science and Information Engineering

Thesis title

自然場景中手持裝置的跑馬燈偵測及文句重構
(Detecting and Reconstructing Marquee Texts in Natural Scenes Based on Hand-Held Devices)


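The author has authorized immediate open access to this electronic thesis.
The open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.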

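Abstract (Chinese)

With today's technology, many information media no longer rely on traditional printed text alone to deliver messages; they use more convenient devices such as video walls and marquees (scrolling LED signs). Faced with this much information, people inevitably miss some of it, and part of what they miss may be important and urgent. Helping people fully recover missed information in an era when information moves too fast has therefore become an important problem.
Marquees are widespread because they present large amounts of text quickly on a limited display area at relatively low cost. Because the display area is limited, however, a marquee usually cannot show a complete message at once and must keep updating what it shows, so viewers have a real chance of missing text. Given how common smartphones now are, this thesis proposes a method based on hand-held devices that captures the text shown on a marquee so that people do not miss important messages. In the detection stage, foreground detection methods locate the marquee in the video. In the reconstruction stage, the filtering mechanism proposed in this thesis picks out the distinct text shown in the video and reassembles it into the complete message.
In the experiments, the detection stage is evaluated in two ways: the precision of the detected marquee position, and the accuracy with which the marquee is detected in the video. The reconstruction stage is evaluated by the number of correctly extracted characters. The results show good accuracy under camera shake, both in the daytime and at night.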
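As a rough illustration of the detection stage described in the abstract, the sketch below merges per-frame foreground masks and takes the bounding box of the result as the marquee region. The figure list mentions a union of thirty foreground frames, but this record does not give the actual procedure, so the function names `union_masks` and `bounding_box`, the boolean-mask input, and the use of a plain bounding box are assumptions made for illustration only. The foreground masks themselves (for example from a Gaussian mixture background model) are assumed to be computed elsewhere.

```python
from typing import List, Optional, Tuple

# One boolean foreground mask per frame (hypothetical input format).
Mask = List[List[bool]]


def union_masks(masks: List[Mask]) -> Mask:
    """Pixel-wise OR of equally sized foreground masks."""
    if not masks:
        raise ValueError("need at least one mask")
    rows, cols = len(masks[0]), len(masks[0][0])
    merged = [[False] * cols for _ in range(rows)]
    for mask in masks:
        for r in range(rows):
            for c in range(cols):
                if mask[r][c]:
                    merged[r][c] = True
    return merged


def bounding_box(mask: Mask) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right), inclusive, or None for an empty mask."""
    hits = [(r, c) for r, row in enumerate(mask) for c, on in enumerate(row) if on]
    if not hits:
        return None
    row_ids = [r for r, _ in hits]
    col_ids = [c for _, c in hits]
    return min(row_ids), min(col_ids), max(row_ids), max(col_ids)
```

Merging several frames before boxing is useful for scrolling text because any single frame lights only part of the display, while the union tends to cover the whole panel.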

Abstract (English)

Traditional text-based media can no longer meet the demand for information broadcasting. Instead, as display technology has matured, people convey important information via TV walls or marquees (i.e., scrolling text). Faced with the large amount of information conveyed by scrolling text, people may miss important information because of its scrolling nature. Helping people catch this information has therefore become an important issue.
In this thesis, a novel system is proposed to extract text displayed on an electronic marquee, in particular when the input videos are captured with hand-held devices. The proposed system consists of two stages: a detection stage and a reconstruction stage. In the detection stage, the position of the marquee in each frame is located using a Gaussian Mixture Model and optical flow information. In the reconstruction stage, an LDP-based text-filtering method is designed to retrieve the complete text information.
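For readers unfamiliar with Local Directional Patterns (LDP), the sketch below computes the commonly published form of the descriptor: eight Kirsch edge responses per 3x3 neighbourhood, with the k strongest absolute responses set to 1 in an 8-bit code. This is not the thesis's text-filtering method, which this record does not describe; the function names, the choice k = 3, and the tie-breaking rule (lower direction index wins) are assumptions.

```python
from typing import List, Sequence

# Offsets (row, col) of the 8 neighbours, counter-clockwise starting from east.
RING = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def kirsch_responses(patch: Sequence[Sequence[int]]) -> List[int]:
    """Eight Kirsch edge responses of a 3x3 patch.

    Mask i puts weight 5 on ring positions i-1, i, i+1 and -3 on the other five.
    """
    ring = [patch[1 + dr][1 + dc] for dr, dc in RING]
    total = sum(ring)
    responses = []
    for i in range(8):
        strong = ring[i - 1] + ring[i] + ring[(i + 1) % 8]
        responses.append(5 * strong - 3 * (total - strong))
    return responses


def ldp_code(patch: Sequence[Sequence[int]], k: int = 3) -> int:
    """8-bit LDP code: bits of the k largest absolute responses are set."""
    responses = kirsch_responses(patch)
    # Assumed tie-break: on equal magnitude, the lower direction index wins.
    order = sorted(range(8), key=lambda i: (-abs(responses[i]), i))
    return sum(1 << i for i in order[:k])


def ldp_histogram(image: Sequence[Sequence[int]], k: int = 3) -> List[int]:
    """256-bin histogram of LDP codes over all interior pixels of a gray image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[ldp_code(patch, k)] += 1
    return hist
```

Two text blocks could then be compared through their LDP histograms, although how the thesis compares blocks across frames is not stated here.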
Experiments were conducted to verify the validity of the proposed system. Two experiments demonstrate the accuracy of the detected marquee region. A third experiment demonstrates the performance of the reconstruction stage by counting how many words are correctly retrieved. The experimental results show that the proposed system works well in capturing scrolling text both in the daytime and at night.
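The reconstruction accuracy above is described only as a count of correctly retrieved text (characters, according to the Chinese abstract). One plausible way to turn that into a number, sketched below, is to align the reconstructed string with the ground-truth string and divide the matched character count by the ground-truth length. The function name `char_accuracy` and the alignment via Python's `difflib.SequenceMatcher` are assumptions; the record does not give the formula the thesis uses.

```python
from difflib import SequenceMatcher


def char_accuracy(reconstructed: str, ground_truth: str) -> float:
    """Fraction of ground-truth characters matched, in order, by the reconstruction."""
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    # autojunk=False keeps frequent characters from being ignored in long strings.
    matcher = SequenceMatcher(None, ground_truth, reconstructed, autojunk=False)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return correct / len(ground_truth)
```

An order-preserving alignment is used rather than a position-by-position comparison so that one dropped character does not mark every later character as wrong.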

Keywords (Chinese)

★ 跑馬燈 (marquee)

Keywords (English)

★ Marquee

Table of Contents

Contents
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Motivation and Objectives
1-2 Literature Review
1-3 System Flow
1-4 Thesis Organization
Chapter 2 Related Work
2-1 Video Stabilization
2-2 Text Region Detection
Chapter 3 System Architecture and Methods
3-1 Detection Stage
3-1-1 Univariate Gaussian Model
3-1-2 Gaussian Mixture Model
3-1-3 Histogram Equalization
3-1-4 Harris Corner
3-1-5 Optical Flow
3-1-6 Gamma Correction
3-1-7 HSV Color Space
3-2 Reconstruction Stage
3-2-1 Geometric Projection
3-2-2 Local Directional Patterns (LDP)
3-2-3 Template Matching
3-2-4 Histogram Matching
Chapter 4 Experimental Results and Analysis
4-1 Detection Stage
4-2 Reconstruction Stage
4-3 Failure Cases
4-4 OCR Results and Discussion
Chapter 5 Conclusions and Future Work
5-1 Conclusions
5-2 Future Work
References
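List of Figures
Figure 2-1 Smoothing the motion in the X and Y directions to obtain a smoother camera path [1]
Figure 2-2 Feeding the smoothed path computed in Figure 2-1 back to the crop window [1]
Figure 2-3 Restoring the image regions cropped away by stabilization [3]
Figure 2-4 Using the motion vectors of pixels at the border of the stabilized image to work out how to restore the cropped regions [3]
Figure 2-5 Edge density at the bottom of the screen is higher than elsewhere [4]
Figure 2-6 Histogram built from the orientations of text stroke pixels [6]
Figure 2-7 Locating candidate text blocks [6]
Figure 2-8 Locating text blocks from the gray levels of pixels inside a template, combined with AdaBoost [6]
Figure 2-9 Results after filtering with global and local thresholds and after restoration with masks [8]
Figure 2-10 Masks used to restore low-contrast edge pixels [8]
Figure 2-11 Computing connected-component width from gradients [14]
Figure 2-12 Filling the pixels inside closed edges [14]
Figure 2-13 Text blocks remaining after overly wide connected components such as leaves are removed [14]

Figure 3-1 Detailed system flowchart
Figure 3-2 Illustration of the single Gaussian model
Figure 3-3 Illustration of the Gaussian mixture model
Figure 3-4 Foreground detection for a single frame and for the union of thirty frames
Figure 3-5 Illustration of extracting the marquee position
Figure 3-6 Illustration of histogram equalization
Figure 3-7 Illustration of histogram equalization
Figure 3-8 Illustration of a dim environment
Figure 3-9 Enhancing contrast in a dim environment with histogram equalization
Figure 3-10 Illustration of Harris corners
Figure 3-11 Illustration of the Lucas-Kanade multi-scale pyramid
Figure 3-12 Gray-level mapping under gamma correction
Figure 3-13 Binarized image and HSV image with gamma correction
Figure 3-14 Binarized image and HSV image without gamma correction
Figure 3-15 Relationships among the components of the HSV space
Figure 3-16 Illustration of the geometric projection method
Figure 3-17 Flowchart of the geometric projection method
Figure 3-18 Illustration of labeling in the geometric projection method
Figure 3-19 Binarized blocks after labeling
Figure 3-20 Grayscale blocks after labeling
Figure 3-21 LBP example
Figure 3-22 LBP mask example
Figure 3-23 LDP example
Figure 3-24 LDP example
Figure 3-25 LDP example
Figure 3-26 Illustration of template matching
Figure 3-27 Illustration of template matching
Figure 3-28 Illustration of applying LDP
Figure 3-29 Illustration of applying LDP
Figure 3-30 Illustration of histogram matching
Figure 3-31 Illustration of histogram matching
Figure 3-32 Histogram matching result
Figure 3-33 Histogram matching input
Figure 3-34 Histogram matching input
Figure 3-35 Histogram matching input

Figure 4-1 Detected block and marquee block
Figure 4-2 Illustration of a misdetected frame
Figure 4-3 Experimental results of the reconstruction stage
Figure 4-4 Experimental results of the reconstruction stage
Figure 4-5 Experimental results of the reconstruction stage
Figure 4-6 Failure case
Figure 4-7 Binarized image of the failure case
Figure 4-8 Failure case
Figure 4-9 Binarized image of the failure case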

References

[1] M. Grundmann, V. Kwatra, I. Essa, “Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths,” Computer Vision and Pattern Recognition (CVPR), 2011, pp. 225-232.
[2] 徐聖哲, 數位影像穩定技術及其應用 [Digital image stabilization techniques and their applications], Ph.D. dissertation, National Chiao Tung University, 2010.
[3] Y. Matsushita, E. Ofek, X. Tang, H.Y. Shum, “Full-frame Video Stabilization,” Computer Vision and Pattern Recognition (CVPR), 2005, pp. 50-57.
[4] C. Liu, C. Wang, R. Dai, “Text Detection in Images Based on Unsupervised Classification,” International Conference on Document Analysis and Recognition (ICDAR), 2005, pp. 610-614.
[5] B.M. Saturnino, L.A. Sergio, G.J. Pedro, G.M. Hilario, L.F. Francisco, “Road-Sign Detection and Recognition Based on Support Vector Machines,” Intelligent Transportation Systems, 8(2), 2007, pp. 264-278.
[6] C. Yi, Y. Tian, A. Arditi, “Portable Camera-Based Assistive Text and Product,” Mechatronics, 19(3), 2014, pp. 808-817.
[7] M. Haque, M. Murshed, M. Paul, “A Hybrid Object Detection Technique from Dynamic Background Using Gaussian Mixture Models,” Multimedia Signal Processing, 2008, pp. 915-920.
[8] M.R. Lyu, J. Song, M. Cai, “A Comprehensive Method for Multilingual Video Text Detection, Localization, and Extraction,” Circuits and Systems for Video Technology, 15(2), 2005, pp. 243-255.
[9] 陳厚安, 自然場景跑馬燈偵測與完整文具重構 [Marquee detection and complete text reconstruction in natural scenes], Master's thesis, National Central University, 2011.
[10] A.K. Jain, B. Yu, “Automatic Text Location in Images and Video Frames,” Pattern Recognition, 31(12), 1998, pp. 2055-2076.
[11] Q. Ye, Q. Huang, W. Gao, D. Zhao, “Fast and robust text detection in images and video frames,” Image and Vision Computing, 23(6), 2005, pp. 565-576.
[12] K. Jung, K.I. Kim, A.K. Jain, “Text information extraction in images and videos: A survey,” Pattern Recognition, 37(5), 2004, pp. 977-997.
[13] P. Shivakumara, T.Q. Phan, C.L. Tan, “A Robust Wavelet Transform Based Technique for Video Text Detection,” International Conference on Document Analysis and Recognition (ICDAR), 2009, pp. 1285-1289.
[14] B. Epshtein, E. Ofek, Y. Wexler, “Detecting Text in Natural Scenes with Stroke Width Transform,” Computer Vision and Pattern Recognition (CVPR), 2010, pp. 2963-2970.
[15] J. Zhang, R. Kasturi, “Text Detection Using Edge Gradient and Graph Spectrum,” International Conference on Pattern Recognition (ICPR), 2010, pp. 3979-3982.
[16] C. Jung, Q. Liu, J. Kim, “A stroke filter and its application to text localization,” Pattern Recognition Letters, 30(2), 2009, pp. 114-122.
[17] C.M. Wang, K.C. Fan, C.T. Wang, “Estimating Optical Flow by Integrating Multi-Frame Information,” Journal of Information Science and Engineering, 24(6), 2008, pp. 1719-1731.

Advisor

范國清 (Kuo-Chin Fan)

Date of approval

2014-07-29
