Thesis 101521106: Detailed Record




Name: Tai-wei Huang (黃泰維)    Department: Department of Electrical Engineering
Thesis Title: A Novel Method for 2D-to-3D Video Conversion Based on Superpixels and Edge Information
(Chinese title: 適用於二維至三維影像轉換之基於超像素與邊緣資訊深度萃取方法)
Related Theses
★ Low-Memory Hardware Design for Real-Time SIFT Feature Extraction
★ A Real-Time Access Control System with Face Detection and Face Recognition
★ An Autonomous Vehicle with Real-Time Automatic Following Capability
★ A Lossless Compression Algorithm and Its Implementation for Multi-Lead ECG Signals
★ An Offline Customizable Voice and Speaker Wake-Word System with Embedded Implementation
★ Wafer Map Defect Classification and Its Embedded System Implementation
★ Speech Densely Connected Convolutional Networks for Small-Footprint Keyword Spotting
★ G2LGAN: Data Augmentation for Imbalanced Datasets Applied to Wafer Map Defect Classification
★ Algorithm Design Techniques for Compensating the Finite Precision of Multiplierless Digital Filters
★ Design and Implementation of a Programmable Viterbi Decoder
★ A Low-Cost Vector Rotator IP Design Based on Extended Elementary-Angle CORDIC
★ Analysis and Architecture Design of a JPEG2000 Still-Image Coding System
★ A Low-Power Turbo Code Decoder for Communication Systems
★ Platform-Based Design for Multimedia Communications
★ Design and Implementation of a Digital Watermarking System for MPEG Encoders
★ Algorithm Development for Video Error Concealment with Data Reuse Considerations
Access Rights
1. The author has agreed to make this electronic thesis openly available immediately.
2. The open-access full text is licensed only for personal, non-profit academic research: searching, reading, and printing.
3. Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.

Abstract (Chinese)
This thesis proposes a superpixel-based 2D-to-3D conversion method that extracts depth automatically. Demand for stereoscopic 3D video has been rising while 3D content remains scarce, so enjoying realistic stereoscopic effects conveniently requires low-cost, high-efficiency methods that quickly convert existing 2D video into 3D.
First, we use a Gaussian model for foreground detection to separate the foreground from the background. Next, we apply a superpixel algorithm to extract edge information, clustering pixels that are similar in color and adjacent in position. The pixels grouped by each superpixel receive an initial depth value: we prepare six candidate initial depth maps and use the Hough transform to estimate the slope of the vanishing lines, which indicates the depth map to use. After the initial depth assignment, we run Sobel edge detection with two thresholds as a second edge-detection pass; one result has more noise but more complete edges, while the other has less noise but sparser edges. A thinning algorithm then reduces the edge width to a single pixel. Comparing the two results, we reassign the depth values and merge in the foreground information so that each foreground object receives a uniform depth. To make the depth map more accurate, we scan the whole image along four directions to correct the depth values, which yields the final depth map. Finally, depth image based rendering (DIBR) synthesizes the left-view and right-view images, completing the 3D video.
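To make the front end of this pipeline concrete, the sketch below strings together the first stages (Gaussian-mixture foreground detection, an area filter, and SLIC superpixel clustering) from off-the-shelf components. It is a minimal illustration rather than the thesis's implementation: OpenCV's MOG2 background subtractor stands in for the Gaussian model described above, scikit-image provides SLIC, and the parameter values (min_area, n_segments, compactness) are illustrative.

    # Minimal sketch: foreground detection + superpixel clustering.
    # Assumes OpenCV (cv2) and scikit-image; parameters are illustrative.
    import cv2
    import numpy as np
    from skimage.segmentation import slic

    def foreground_and_superpixels(frames, min_area=200, n_segments=400):
        """For each frame, separate foreground from background with a
        Gaussian mixture background model, then cluster pixels that are
        similar in color and adjacent in position into superpixels."""
        bg_model = cv2.createBackgroundSubtractorMOG2(detectShadows=False)
        for frame in frames:
            # Gaussian-mixture foreground detection (stand-in for the
            # thesis's background modeling step).
            fg_mask = bg_model.apply(frame)
            # Area filter: drop small connected components as noise.
            n, labels, stats, _ = cv2.connectedComponentsWithStats(fg_mask)
            for i in range(1, n):
                if stats[i, cv2.CC_STAT_AREA] < min_area:
                    fg_mask[labels == i] = 0
            # SLIC superpixels: boundaries between segments carry the
            # edge information used for initial depth assignment.
            rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
            segments = slic(rgb, n_segments=n_segments, compactness=10)
            yield fg_mask, segments

Each superpixel can then be given one initial depth value drawn from whichever of the six hypothesis depth maps the Hough-transform vanishing-line slope selects.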
Abstract (English)
This thesis proposes a novel method for 2D-to-3D video conversion that uses boundary information to generate the depth map automatically. First, we use a Gaussian model to detect foreground objects and separate the foreground from the background. Next, we use the superpixel algorithm to extract edge information, and initial depth values are assigned to the pixels clustered by the superpixels. Based on this initial assignment, we detect edges with Sobel edge detection using two thresholds to strengthen the edge information, and we apply a thinning algorithm to the edge-detection results to identify the boundary pixels. By comparing these results and reassigning the depth values, the foreground depth is refined. To make the depth map more accurate, we scan the entire image along four paths to correct the depth values, which produces the final depth map. Finally, depth image based rendering (DIBR) synthesizes the left-view and right-view images, completing the 2D-to-3D conversion. Combining the depth map with the original 2D video produces a vivid 3D video.
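The refinement and rendering stages admit a similarly hedged sketch. The code below is an approximation, not the thesis's implementation: cv2.Sobel supplies the two-threshold edge pass, scikit-image's thin() stands in for the Zhang and Suen thinning method, and the DIBR step is deliberately simplified to a horizontal pixel shift proportional to depth, omitting the hole filling a complete DIBR pipeline performs. The thresholds and disparity scale are illustrative.

    # Minimal sketch: two-threshold Sobel edges, thinning, and a
    # simplified DIBR view synthesis. Assumes OpenCV and scikit-image.
    import cv2
    import numpy as np
    from skimage.morphology import thin

    def two_threshold_edges(gray, t_low=40, t_high=120):
        """Binarize the Sobel gradient magnitude with two thresholds:
        the low one keeps more complete (but noisier) edges, the high
        one keeps cleaner (but sparser) edges. Both maps are thinned
        to one-pixel-wide boundaries for comparison."""
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
        mag = cv2.magnitude(gx, gy)
        noisy = thin(mag > t_low)    # complete edges, more noise
        clean = thin(mag > t_high)   # fewer edges, less noise
        return noisy, clean

    def dibr_views(image, depth, max_disparity=16):
        """Synthesize left/right views by shifting each pixel
        horizontally in proportion to its depth (assuming 255 = nearest).
        Holes left by the shift are not filled in this sketch."""
        h, w = depth.shape
        shift = (depth.astype(np.float32) / 255.0 * max_disparity / 2).astype(int)
        left = np.zeros_like(image)
        right = np.zeros_like(image)
        cols = np.arange(w)
        for y in range(h):
            xl = np.clip(cols + shift[y], 0, w - 1)
            xr = np.clip(cols - shift[y], 0, w - 1)
            left[y, xl] = image[y, cols]
            right[y, xr] = image[y, cols]
        return left, right

Viewing the two synthesized images as a stereo pair (for example, side by side or as an anaglyph) then gives the stereoscopic result.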
Keywords (Chinese)
★ stereoscopic video (立體影像)
★ depth map (深度圖)
Keywords (English)
★ 2D-to-3D
★ depth map
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Chapter 1  Introduction
1.1 Development of 2D to 3D Conversion
1.2 Motivation
1.3 Thesis Organization
Chapter 2  Related Work
2.1 Overview of 2D to 3D System
2.2 Human Visual Perception and 3D Cues
2.3 Depth Estimation Schemes
2.4 Depth Image Based Rendering
Chapter 3  Proposed Method
3.1 Overview of Method
3.2 Gaussian Mixture Model
3.2.1 Background Modeling
3.2.2 Area Filter
3.2.3 Moving Object Detection
3.3 SLIC Superpixels
3.3.1 Algorithm
3.3.2 Distance Measure
3.3.3 Postprocessing
3.4 Depth Extraction and Depth Fusion Process
3.4.1 Depth from Prior Hypothesis
3.4.2 Sobel Edge Detection
3.4.3 Zhang and Suen Thinning Method
3.4.4 Depth Assignment
3.4.5 Depth Image Based Rendering
Chapter 4  Experiment Results and Implementation
4.1 Experiment Results
Chapter 5  Conclusion and Future Work
5.1 Conclusion
5.2 Future Work
Advisor: Tsung-Han Tsai (蔡宗漢)    Approval Date: 2016-01-25
