A Multi-Strategy VVC Intra Prediction Coding Based on Structure Tensor

Please use this permanent URL to cite or link to this document: https://ir.lib.ncu.edu.tw/handle/987654321/99364

Title:	A Multi-Strategy VVC Intra Prediction Coding Based on Structure Tensor
Author:	Liao, Hong-Ling (廖紅綾)
Contributor:	Department of Communication Engineering
Keywords:	Versatile Video Coding; Support Vector Machine; Convolutional Neural Network; Coding Unit; Intra Prediction; Fast Depth Decision
Date:	2026-01-22
Upload time:	2026-03-06 18:48:53 (UTC+8)
Publisher:	National Central University
Abstract:	As networks and digital technologies develop rapidly, demand for high-resolution image quality keeps growing. High-resolution images produce large volumes of data, which calls for more efficient compression. H.266/VVC relies on several advanced techniques, such as multi-type partitioning of Coding Units (CUs) into square and rectangular blocks, together with Rate-Distortion Optimization (RDO). These techniques improve compression efficiency but also raise the computational complexity of encoding. This thesis combines traditional image feature analysis with machine learning and deep learning to optimize CU partitioning in VVC. It first analyzes a two-stage VVC encoding algorithm and an improved framework that uses the Sobel operator, and discusses their strengths, weaknesses, and limitations. We find that the Sobel operator used in the first stage gives only a coarse detection of image texture, and that its binary decision mechanism can only choose between running a subset of the modes and running all of them, which limits how flexibly the algorithm can trade encoding efficiency against coding quality.
To address these issues, this thesis replaces the Sobel operator with the structure tensor and adds entropy and coherence analysis, so that the directional texture characteristics of the image are captured more accurately and the CU partitioning decisions improve. It also introduces a four-level decision mechanism. Compared with the binary decision, the four levels give finer control over which encoding modes are selected, and achieve a better trade-off between encoding time savings and rate-distortion performance.
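The abstract names the features (structure tensor, coherence, entropy) and a four-level decision, but does not give formulas or thresholds. The following is a minimal sketch of how such features might be computed for one luma block, assuming a block-averaged structure tensor, the common coherence definition ((λ1 − λ2)/(λ1 + λ2))², and Shannon entropy of the 8-bit intensity histogram. The function names, the thresholds `flat_entropy`, `high_coh`, and `low_coh`, and the meaning assigned to each level are hypothetical and are not taken from the thesis.

```python
import numpy as np


def structure_tensor_features(block):
    """Return (coherence, orientation, entropy) for a 2-D 8-bit luma block."""
    block = np.asarray(block, dtype=np.float64)
    gy, gx = np.gradient(block)  # derivatives along rows (y) and columns (x)

    # Block-level structure tensor: gradient products averaged over the block.
    jxx = np.mean(gx * gx)
    jxy = np.mean(gx * gy)
    jyy = np.mean(gy * gy)

    # For the matrix [[jxx, jxy], [jxy, jyy]]:
    # lam1 + lam2 = trace and lam1 - lam2 = gap.
    trace = jxx + jyy
    gap = np.sqrt((jxx - jyy) ** 2 + 4.0 * jxy ** 2)
    coherence = 0.0 if trace < 1e-12 else float((gap / trace) ** 2)

    # Angle of the dominant gradient direction, in radians.
    orientation = float(0.5 * np.arctan2(2.0 * jxy, jxx - jyy))

    # Shannon entropy (bits) of the 8-bit intensity histogram.
    levels = np.clip(block, 0, 255).astype(np.uint8).ravel()
    p = np.bincount(levels, minlength=256) / levels.size
    p = p[p > 0]
    entropy = float(-np.sum(p * np.log2(p)))
    return coherence, orientation, entropy


def decision_level(coherence, entropy,
                   flat_entropy=1.0, high_coh=0.8, low_coh=0.3):
    """Map features to a level from 0 (smallest mode search) to 3 (full search).

    All thresholds and the level semantics are illustrative assumptions.
    """
    if entropy < flat_entropy:
        return 0  # nearly flat block
    if coherence >= high_coh:
        return 1  # one clearly dominant texture direction
    if coherence >= low_coh:
        return 2  # weakly directional texture
    return 3      # no dominant direction: keep the full search


# Example inputs: a flat block and a horizontal intensity ramp.
flat = np.full((16, 16), 128)
ramp = np.tile(np.arange(16) * 16, (16, 1))
flat_level = decision_level(*structure_tensor_features(flat)[::2])
ramp_level = decision_level(*structure_tensor_features(ramp)[::2])
```

In this sketch a flat block falls into the lowest level and a pure ramp, whose gradients all point the same way, falls into the strongly directional level. A real encoder integration would need thresholds tuned against rate-distortion results, which the abstract does not report.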
Appears in collections:	[Graduate Institute of Communication Engineering] Theses and Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0 KB	HTML	89

All items in NCUIR are protected by their original copyright.
