Graduate Thesis 995201051: Detailed Record




Name: 范辰碩 (Chen-shuo Fan)    Graduate Department: Department of Electrical Engineering
Thesis Title: Monocular Vision Based Depth Map Extraction Method for 2D to 3D Video Conversion
(適用於二維至三維影像轉換之單眼視覺深度萃取方法)
Related Theses
★ Low-Memory Hardware Design for Real-Time SIFT Feature Extraction
★ An Access Control System with Real-Time Face Detection and Face Recognition
★ An Autonomous Vehicle with Real-Time Automatic Following
★ Lossless Compression Algorithm and Implementation for Multi-Lead ECG Signals
★ Offline Customizable Voice and Speaker Wake-Word System with Embedded Implementation
★ Wafer Map Defect Classification and Embedded System Implementation
★ Speech Densely Connected Convolutional Networks for Small-Footprint Keyword Spotting
★ G2LGAN: Data Augmentation on Imbalanced Datasets for Wafer Map Defect Classification
★ Algorithm Design Techniques for Compensating the Finite Precision of Multiplierless Digital Filters
★ Design and Implementation of a Programmable Viterbi Decoder
★ Low-Cost Vector Rotator IP Design Based on Extended Elementary-Angle CORDIC
★ Analysis and Architecture Design of a JPEG2000 Still Image Coding System
★ A Low-Power Turbo Code Decoder for Communication Systems
★ Platform-Based Design for Multimedia Communication
★ Design and Implementation of a Digital Watermarking System for MPEG Encoders
★ Algorithm Development for Video Error Concealment with Data Reuse Considerations
  1. This electronic thesis is approved for immediate open access.
  2. The open-access electronic full text is licensed to users only for personal, non-profit retrieval, reading, and printing for the purpose of academic research.
  3. Please observe the relevant provisions of the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast it without authorization.

Abstract (Chinese) This thesis presents two semi-automatic methods for extracting the depth information needed when converting 2D video into 3D video. Demand for stereoscopic 3D content has grown recently, yet 3D content itself remains scarce. To enjoy vivid stereoscopic effects conveniently, low-cost, high-performance back-end conversion methods must be developed that can turn 2D video into 3D stereoscopic video quickly.
For 2D input video with a static background, we propose a technique that combines foreground segmentation with the vanishing-line monocular depth cue. Given the background and foreground separated by the foreground segmentation algorithm, the user applies acquired visual experience during an initialization step to tell the computer how far parts of the background lie from the viewer; the foreground then derives appropriate depth values from the computed background depth. Comparisons of the implementation results show that, with stereoscopic quality comparable to other references, this algorithm reaches a processing speed of 0.17 s/frame on CIF-resolution video.
We further extend this concept to a second conversion method for 2D input video with a dynamic background. Motion vector estimation and motion compensation are used to compute the relative motion speed of the background, which then separates background from foreground and replaces the foreground segmentation step. Experimental results show that although this method does not yet match the conversion quality achieved on static backgrounds, it applies to a wider range of content and reaches 0.15 s/frame on CIF-resolution video.
Abstract (English) This thesis presents two semi-automatic depth map extraction methods for stereoscopic video conversion. Given the growing demand for 3D visualization and the shortage of 3D video content, low-cost and highly efficient post-processing methods must be developed to convert 2D video to 3D quickly if everyone is to enjoy vivid 3D video.
For video sequences with a static background, we propose a method that combines foreground segmentation with the vanishing-line technique, a monocular depth cue. Using the foreground and background separated by the foreground segmentation algorithm, the viewer applies acquired visual experience to assign depth information to the background in an initialization step; the foreground then obtains relative depth from the background depth map. Our experimental results show that the algorithm runs at 0.17 s/frame on CIF-size video with 3D visual quality close to that of other reference methods.
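To make the vanishing-line initialization concrete, here is a minimal C sketch, not the thesis's exact algorithm: it builds a gradient-plane background depth map from a user-supplied horizontal vanishing line and then fuses one segmented foreground object by giving it the background depth at its lowest row (compare Sections 3.3.1 and 3.3.3 in the table of contents). The function names, the linear depth ramp, and the ground-contact heuristic are illustrative assumptions.

```c
#include <stdint.h>

#define WIDTH  352  /* CIF resolution, as used in the experiments */
#define HEIGHT 288

/*
 * Build a background depth map from a horizontal vanishing line.
 * Rows at the bottom of the frame (nearest the viewer) get depth 255;
 * depth falls linearly to 0 at and above the vanishing line, which the
 * user supplies during initialization.
 */
static void gradient_plane_depth(uint8_t depth[HEIGHT * WIDTH], int vanish_row)
{
    if (vanish_row >= HEIGHT - 1)
        vanish_row = HEIGHT - 2;          /* avoid a degenerate ramp */
    for (int y = 0; y < HEIGHT; y++) {
        int d = 0;
        if (y > vanish_row)               /* below the vanishing line */
            d = 255 * (y - vanish_row) / (HEIGHT - 1 - vanish_row);
        for (int x = 0; x < WIDTH; x++)
            depth[y * WIDTH + x] = (uint8_t)d;
    }
}

/*
 * Fuse a segmented foreground object (mask != 0) into the depth map.
 * A common heuristic, assumed here, gives the whole object the
 * background depth of its lowest row, where it touches the ground.
 */
static void fuse_foreground(uint8_t depth[HEIGHT * WIDTH],
                            const uint8_t mask[HEIGHT * WIDTH])
{
    int bottom = 0;
    for (int i = 0; i < HEIGHT * WIDTH; i++)
        if (mask[i]) bottom = i / WIDTH;  /* last (lowest) foreground row */

    uint8_t d = depth[bottom * WIDTH];    /* gradient plane is constant per row */
    for (int i = 0; i < HEIGHT * WIDTH; i++)
        if (mask[i]) depth[i] = d;
}
```

In this simplified form, one pass of gradient_plane_depth followed by one call of fuse_foreground per labeled object yields a fused depth map ready for depth-image-based rendering.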
Moreover, following the same concept, we propose another conversion method for video sequences with a dynamic background. Foreground segmentation is replaced by relative velocity estimation based on motion estimation and motion compensation. Although this method cannot yet match the quality of the foreground segmentation method, it has wider applicability and runs at 0.15 s/frame on CIF-size video.
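The motion-estimation step can be sketched with a classic full-search block matcher in the spirit of [27]: for each block of the current frame it finds the displacement in the previous frame that minimizes the sum of absolute differences (SAD). Block size, search range, and all names below are illustrative assumptions; the thesis's relative-velocity computation and moving-object extraction are not reproduced here.

```c
#include <stdint.h>
#include <limits.h>

#define WIDTH  352  /* CIF */
#define HEIGHT 288
#define BLOCK  16   /* block size, assumed */
#define RANGE  8    /* search window of +/- RANGE pixels, assumed */

/* SAD between the block at (bx,by) in the current frame and the block
 * displaced by (dx,dy) in the reference frame. */
static int block_sad(const uint8_t *cur, const uint8_t *ref,
                     int bx, int by, int dx, int dy)
{
    int sad = 0;
    for (int y = 0; y < BLOCK; y++)
        for (int x = 0; x < BLOCK; x++) {
            int diff = cur[(by + y) * WIDTH + (bx + x)]
                     - ref[(by + dy + y) * WIDTH + (bx + dx + x)];
            sad += diff < 0 ? -diff : diff;
        }
    return sad;
}

/* Full-search motion estimation for one block: writes the SAD-minimizing
 * motion vector to (*mvx, *mvy). */
static void motion_vector(const uint8_t *cur, const uint8_t *ref,
                          int bx, int by, int *mvx, int *mvy)
{
    int best = INT_MAX;
    *mvx = *mvy = 0;
    for (int dy = -RANGE; dy <= RANGE; dy++)
        for (int dx = -RANGE; dx <= RANGE; dx++) {
            /* skip candidates that fall outside the reference frame */
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + BLOCK > WIDTH || by + dy + BLOCK > HEIGHT)
                continue;
            int sad = block_sad(cur, ref, bx, by, dx, dy);
            if (sad < best) { best = sad; *mvx = dx; *mvy = dy; }
        }
}
```

Averaging the vectors of blocks that follow the dominant motion would estimate the background's relative velocity; blocks whose vectors deviate strongly from it can then be treated as moving foreground, playing the role that foreground segmentation plays in Method-1.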
Keywords (Chinese) ★ 2D to 3D
★ depth map
Keywords (English)
Table of Contents
Abstract (Chinese)
Abstract (English)
Chapter 1: Introduction
  1.1 Development of 2D to 3D Conversion
  1.2 Motivation
  1.3 Thesis Organization
Chapter 2: Related Work
  2.1 Overview of 2D to 3D System
  2.2 Monocular Depth Cues
    2.2.1 Familiar Size
    2.2.2 Relative Size
    2.2.3 Brightness
    2.2.4 Occlusion
    2.2.5 Shading and Shadows
    2.2.6 Atmospheric Perspective
    2.2.7 Linear Perspective
    2.2.8 Relative Height
    2.2.9 Texture Gradient
    2.2.10 Contour
    2.2.11 Accommodation
    2.2.12 Blur
  2.3 Depth Estimation Schemes
  2.4 Depth Image Based Rendering
Chapter 3: Proposed Approach of Method-1
  3.1 Overview of Method-1
  3.2 Foreground Segmentation
    3.2.1 Static Background Modeling
    3.2.2 Area Filter
    3.2.3 Object Labeling
  3.3 Depth Extraction and Depth Fusion Process
    3.3.1 Vanishing Line Extraction and Gradient Plane Generation
    3.3.2 Background Depth Extraction
    3.3.3 Foreground Depth Fusion
Chapter 4: Proposed Approach of Method-2
  4.1 Overview of Method-2
  4.2 Dynamic Background Subtraction
    4.2.1 Motion Estimation and Application
    4.2.2 Moving Object Extraction
Chapter 5: Experiment Results and Implementation
  5.1 Experiment Results
    5.1.1 Method-1 Experiment Results
    5.1.2 Method-2 Experiment Results
    5.1.3 Comparison of Experiment Results
  5.2 Implementation on SMIMS VeriSoC-CA8
  5.3 Implementation on ITRI PAC Duo
Chapter 6: Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work
References
[1] P. V. Harman, "Home-based 3D entertainment—An overview," in Proc. IEEE Int. Conf. Image Processing, Vancouver, Canada, 2000, pp. 1–4.
[2] K. Han and K. Hong, "Geometric and texture cue based depth-map estimation for 2D to 3D image conversion," in Proc. IEEE Int. Conf. Consumer Electronics (ICCE), Jan. 2011, pp. 651–652.
[3] M. Guttmann, L. Wolf, and D. Cohen-Or, "Semi-automatic stereo extraction from video footage," in Proc. IEEE 12th Int. Conf. Computer Vision (ICCV), Sept.–Oct. 2009, pp. 136–142.
[4] C. Tan, T. Hong, T. Chang, et al., "Color model-based real-time learning for road following," in Proc. IEEE Intelligent Transportation Systems Conf., 2006, pp. 939–944.
[5] G. Zhang, N. Zheng, C. Cui, et al., "An efficient road detection method in noisy urban environment," in Proc. IEEE Intelligent Vehicles Symp., Xi'an, China, 2009, pp. 556–561.
[6] M. Blas, M. Agrawal, A. Sundaresan, et al., "Fast color/texture segmentation for outdoor robots," in Proc. IEEE/RSJ Int. Conf. Intelligent Robots and Systems, 2008, pp. 4078–4085.
[7] A. Saxena, J. Schulte, and A. Y. Ng, "Depth estimation using monocular and stereo cues," in Proc. Int. Joint Conf. Artificial Intelligence (IJCAI), 2007.
[8] V. Cantoni, L. Lombardi, M. Porta, and N. Sicard, "Vanishing point detection: Representation analysis and new approaches," in Proc. Int. Conf. Image Analysis and Processing (ICIAP), Sept. 2001, pp. 90–94.
[9] http://www.cns.nyu.edu/~david/courses/perception/lecturenotes/depth/depth-size.html
[10] http://the7deadlysenses.blogspot.tw/2011/01/monocular-cues-lets-explore-some-in.html
[11] http://johndollin.blogspot.tw/2010/08/why-do-objects-still-appear-in-3d-with.html
[12] Q. Wei, "Converting 2D to 3D: A survey," Project Report, Delft University of Technology, The Netherlands, Dec. 2005.
[13] S. Battiato, S. Curti, M. La Cascia, M. Tortora, and E. Scordato, "Depth map generation by image classification," in Proc. SPIE 5302, Three-Dimensional Image Capture and Applications VI, Apr. 2004, p. 95.
[14] J. R. Smith and C.-S. Li, "Decoding image semantics using composite region templates," in Proc. IEEE CVPR Workshop on Content-Based Access of Image and Video Libraries, 1998.
[15] J. R. Smith and C.-S. Li, "Image classification and querying using composite region templates," Computer Vision and Image Understanding, 1999.
[16] J. R. Smith and S.-F. Chang, "Multi-stage classification of images from features and related text," in Proc. 4th DELOS Workshop, Pisa, Italy, Aug. 1997.
[17] V. Cantoni, L. Lombardi, M. Porta, and N. Sicard, "Vanishing point detection: Representation analysis and new approaches," in Proc. Int. Conf. Image Analysis and Processing (ICIAP), Sept. 2001, pp. 90–94.
[18] F. Yu, J. Liu, Y. Ren, J. Sun, Y. Gao, and W. Liu, "Depth generation method for 2D to 3D conversion," in Proc. 3DTV Conf.: The True Vision—Capture, Transmission and Display of 3D Video (3DTV-CON), May 2011, pp. 1–4.
[19] V. Cantoni, L. Lombardi, M. Porta, and N. Sicard, "Vanishing point detection: Representation analysis and new approaches," in Proc. Int. Conf. Image Analysis and Processing (ICIAP), Sept. 2001, pp. 90–94.
[20] T.-H. Tsai, C.-Y. Lin, and S.-Y. Li, "Algorithm and architecture design of human-machine interaction in foreground object detection with dynamic scene," IEEE Trans. Circuits and Systems for Video Technology, vol. PP, no. 99.
[21] D.-S. Lee, "Effective Gaussian mixture learning for video background subtraction," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 5, May 2005.
[22] S.-W. Yang, M.-H. Sheu, J.-J. Lin, C.-C. Hu, T.-H. Chen, and S.-Y. Tseng, "Parallel 3-pixel labeling method and its hardware architecture design," in Proc. IEEE IAS, vol. 1, 2009, pp. 185–188.
[23] H. Flatt, S. Blume, S. Hesselbarth, T. Schünemann, and P. Pirsch, "A parallel hardware architecture for connected component labeling based on fast label merging," in Proc. IEEE ASAP, May 2008, pp. 144–149.
[24] C.-Y. Lin, S.-Y. Li, and T.-H. Tsai, "A scalable parallel hardware architecture for connected component labeling," in Proc. IEEE ICIP, Sept. 2010.
[25] S. Battiato, A. Capra, S. Curti, and M. La Cascia, "3D stereoscopic image pairs by depth-map generation," in Proc. 2nd Int. Symp. 3D Data Processing, Visualization and Transmission (3DPVT), 2004, pp. 124–131.
[26] D. Zhao, Y. Ren, J. Sun, W. Liu, and J. Liu, "Depth map extraction based on geometry," in Proc. IEEE SoutheastCon, Mar. 2012, pp. 1–5.
[27] Y.-C. Lin and S.-C. Tai, "Fast full-search block-matching algorithm for motion-compensated video compression," in Proc. 13th Int. Conf. Pattern Recognition (ICPR), Aug. 1996, vol. 3, pp. 914–918.
[28] S. L. Kilthau, M. S. Drew, and T. Möller, "Full search content independent block matching based on the fast Fourier transform," in Proc. IEEE Int. Conf. Image Processing (ICIP), 2002, vol. 1, pp. I-669–I-672.
Advisor: 蔡宗漢    Date of Approval: 2013-01-10
