A Fast Mode Decision Algorithm for Spatial and CGS Scalable Video Coding

Name

Yun-Da Wu (吳運達)

Department

Department of Communication Engineering

Thesis Title

A Fast Mode Decision Algorithm for Spatial and CGS Scalable Video Coding
(應用於空間與CGS可調性視訊編碼器之快速模式決策演算法)

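The author has granted immediate open access to this electronic thesis.
Once open, the electronic full text is licensed only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.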

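Abstract (Chinese)

In addition to retaining the exhaustive search of the original AVC/H.264, a scalable video coder incurs further computational complexity from spatial inter-layer prediction, so reducing that complexity is an important problem. In this thesis, we exploit the correlation of modes between layers to design a fast mode decision algorithm that adapts to multiple spatial and CGS layers. We also take human visual attention into account and propose a motion vector attention model that can optionally be combined with the fast algorithm, saving encoding time while preserving the original subjective visual quality. Experimental results show that, with the bit-rate increase and PSNR loss kept within an acceptable range, the proposed fast algorithm saves 66% to 68% of the encoding time, and the fast algorithm combined with the motion attention model saves 50% to 57%. More importantly, subjective visual tests show that the fast algorithm combined with the motion attention model provides subjective visual quality closer to that of the original algorithm than the fast algorithm alone does.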

Abstract (English)

The exhaustive search for macroblock (MB) mode decision in the working draft of the scalable video coding extension of AVC/H.264 achieves theoretically optimal coding efficiency, but at the cost of high computational complexity. For scalable video coding, reducing this heavy computational load with only a minor bit-rate increase and PSNR loss is critical to realizing the technology in consumer electronics. Motivated by this, we present a fast mode decision algorithm for scalable video coding that exploits the correlation of MB modes between layers. Our algorithm applies to multiple spatial and CGS layers. Additionally, we design a motion attention model (MAM) based on psychovisual considerations. This model can optionally be combined with the proposed fast algorithm so that it not only saves encoding time but also retains visual quality. Experiments conducted with JSVM 8.10 show that the proposed fast algorithm without the MAM saves 66% to 68% of the encoding time within an acceptable range of PSNR loss and bit-rate increase. Moreover, the proposed fast algorithm incorporating the MAM saves 50% to 57% of the encoding time. Subjective visual tests show that the fast algorithm with the MAM yields visual quality closer to that of the original exhaustive search than the fast algorithm alone does.
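The general idea of pruning enhancement-layer candidate modes from the co-located base-layer mode can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's algorithm: the mode labels, the `CANDIDATES` table, and the helper names are hypothetical placeholders (the thesis derives its own candidate sets from measured inter-layer statistics), and `time_saving` uses a commonly assumed definition of encoding-time reduction that the record itself does not spell out.

```python
from typing import Callable, Dict, List

# Illustrative labels for macroblock modes; not taken from the thesis.
ALL_MODES: List[str] = ["SKIP", "16x16", "16x8", "8x16", "8x8", "INTRA"]

# HYPOTHETICAL lookup: for a given base-layer mode, the enhancement-layer
# modes that are still evaluated. The entries are placeholders only.
CANDIDATES: Dict[str, List[str]] = {
    "SKIP": ["SKIP", "16x16"],
    "16x16": ["SKIP", "16x16", "16x8", "8x16"],
    "8x8": ["16x8", "8x16", "8x8"],
    "INTRA": ["INTRA", "16x16"],
}


def candidate_modes(base_mode: str) -> List[str]:
    """Return the reduced candidate set, or every mode if the base mode is unknown."""
    return CANDIDATES.get(base_mode, ALL_MODES)


def best_mode(base_mode: str, rd_cost: Callable[[str], float]) -> str:
    """Pick the lowest-cost mode among the reduced candidates only."""
    return min(candidate_modes(base_mode), key=rd_cost)


def time_saving(t_reference: float, t_fast: float) -> float:
    """Assumed metric: percentage of encoding time saved relative to exhaustive search."""
    return 100.0 * (t_reference - t_fast) / t_reference


if __name__ == "__main__":
    # Made-up rate-distortion costs for one macroblock.
    costs = {"SKIP": 5.0, "16x16": 3.0, "16x8": 4.0,
             "8x16": 4.5, "8x8": 1.0, "INTRA": 9.0}
    print(best_mode("SKIP", lambda m: costs[m]))
    print(time_saving(100.0, 33.0))
```

The speed-up comes from evaluating fewer modes per macroblock; the cost is that a globally better mode outside the reduced set (here `8x8` when the base layer chose `SKIP`) is never tested, which is the source of the small bit-rate increase and PSNR loss the abstract mentions.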

Keywords (Chinese)

★ Scalable video coding
★ Spatial and CGS scalability
★ Human psychovisual characteristics
★ Fast algorithm

Keywords (English)

★ Scalable video coding
★ Fast algorithm
★ Psychovisual characteristics
★ Spatial and CGS scalability

Table of Contents

Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Preface
1.2 Motivation
1.3 Methodology
1.4 Thesis Organization
Chapter 2 Introduction to Single-Layer Video Coders
2.1 Overview of Single-Layer Video Coders
2.2 The AVC/H.264 Video Coder
2.2.1 Coding Tools That Improve Coding Efficiency
2.2.2 Coding Architecture of the AVC/H.264 Video Coder
2.2.3 Optimal Macroblock Mode Selection
2.3 Summary
Chapter 3 Scalable Video Coders
3.1 Applications of Scalable Video Coders
3.2 Architecture of Scalable Video Coders
3.3 Temporal Scalability
3.4 Spatial Scalability
3.4.1 Inter-layer Intra Prediction
3.4.2 Inter-layer Motion Prediction
3.4.3 Inter-layer Residual Prediction
3.5 SNR Scalability
3.6 Summary
Chapter 4 Survey of Fast Algorithms for Scalable Video Coders
4.1 Fast Algorithms for the Base Layer
4.2 Fast Algorithms for Multi-Layer Scalable Video Coders
4.3 Summary
Chapter 5 The Proposed Fast Scalable Video Coding Algorithm
5.1 Fast Scalable Video Coding Algorithm Combined with the Motion Vector Attention Model
5.2 Fast Macroblock Mode Decision Algorithm
5.2.1 Macroblock Modes in Scalable Video Coders
5.2.2 Inter-layer Macroblock Mode Correlation
5.2.3 The Proposed Fast Macroblock Mode Decision Algorithm
5.3 Fast Motion Vector Prediction Selection Algorithm
5.3.1 Correlation Between Macroblock Modes and Motion Vector Prediction
5.3.2 The Proposed Fast Motion Vector Predictor Selection Algorithm
5.4 Motion Vector Attention Model Oriented to the Human Visual System
5.4.1 Human Visual Attention: State of the Art and Applications to Video Coding
5.4.2 The Proposed Motion Vector Attention Model
5.4.3 Summary
Chapter 6 Experimental Results
6.1 Simulation Environment and Parameter Settings
6.2 Fast Scalable Video Coding Algorithm
6.3 Fast Scalable Video Coding Combined with the Motion Vector Attention Model
Chapter 7 Conclusions and Future Work
7.1 Conclusions
7.2 Future Work
References

Advisor

Chih-Wei Tang (唐之瑋)

Approval Date

2008-7-21