Computation Reduction of HEVC Intra Prediction using combined SVM and CNN

Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/82850

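Title:	Computation Reduction of HEVC Intra Prediction using combined SVM and CNN (original title: 一種結合支持向量機與卷積神經網路的架構以降低HEVC計算複雜度之研究)
Author:	Wang, Jie-Jay (王致傑)
Contributor:	Department of Communication Engineering
Keywords:	High Efficiency Video Coding (HEVC); Intra Prediction; Support Vector Machine (SVM); Convolutional Neural Network (CNN); Rate-Distortion Optimization (RDO); Coding Unit (CU); Fast Depth Decision
Date:	2020-01-17
Uploaded:	2020-06-05 17:24:26 (UTC+8)
Publisher:	National Central University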
Abstract:	With the rapid development of technology and growing user demand, high-resolution video has become pervasive in everyday life. To compress these large volumes of video data more efficiently, HEVC adopts newer techniques such as coding tree units (CTU) and rate-distortion optimization (RDO), but these also increase the computational complexity of encoding. This thesis combines deep learning and machine learning, specifically a convolutional neural network (CNN) and a support vector machine (SVM), and applies them to the depth decision for coding units (CU) in HEVC. Unlike the original HEVC encoder, which recursively evaluates coding-unit depths 0 through 3, the proposed method first uses the SVM at the start of encoding to classify coding units as homogeneous or complex blocks, and then uses CNN models to classify them hierarchically, level by level. Each classified block is encoded only at specific depths and the remaining encoding computations are terminated early, which saves the time that would otherwise be spent evaluating the other depths. The SVM results are then fed into the CNN models, and a mapping function is designed to correct the models' predictions. Experimental results show that, compared with HEVC, the proposed method saves about 49% of encoding time at the cost of an overall average BDBR increase of 0.66%.
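The depth decision flow described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the variance-based stand-ins for the SVM and CNN, the threshold, and the assumption that a homogeneous block is encoded at depth 0 are all hypothetical, and the mapping function that feeds the SVM result into the CNN is not modeled.

```python
from typing import Callable, List, Tuple

Block = List[List[int]]  # square grid of luma samples (hypothetical representation)


def variance(block: Block) -> float:
    """Population variance of all samples in the block."""
    values = [v for row in block for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def split_quadrants(block: Block) -> List[Block]:
    """Split a square block into four equal quadrants (quadtree split)."""
    half = len(block) // 2
    return [[row[c:c + half] for row in block[r:r + half]]
            for r in (0, half) for c in (0, half)]


def decide_depths(block: Block, depth: int,
                  is_homogeneous: Callable[[Block], bool],
                  should_split: Callable[[Block, int], bool],
                  max_depth: int = 3) -> List[Tuple[int, int]]:
    """Return (depth, block_size) for every block that would be encoded.

    is_homogeneous stands in for the SVM and is consulted once, at depth 0.
    should_split stands in for the per-level CNN models.
    """
    # Assumption: a block judged homogeneous is encoded at depth 0 only.
    if depth == 0 and is_homogeneous(block):
        return [(0, len(block))]
    # Stop at the deepest level or when the classifier predicts no split.
    if depth == max_depth or not should_split(block, depth):
        return [(depth, len(block))]
    leaves: List[Tuple[int, int]] = []
    for sub in split_quadrants(block):
        leaves.extend(decide_depths(sub, depth + 1, is_homogeneous,
                                    should_split, max_depth))
    return leaves


# Toy classifiers: a variance threshold replaces the trained models.
def toy_svm(block: Block) -> bool:
    return variance(block) < 1.0


def toy_cnn(block: Block, depth: int) -> bool:
    return variance(block) >= 1.0


# A flat 64x64 block, and one whose top-left quadrant is a checkerboard.
flat = [[0] * 64 for _ in range(64)]
mixed = [[(100 if (r + c) % 2 else 0) if (r < 32 and c < 32) else 0
          for c in range(64)] for r in range(64)]

flat_leaves = decide_depths(flat, 0, toy_svm, toy_cnn)
mixed_leaves = decide_depths(mixed, 0, toy_svm, toy_cnn)
```

In this sketch each block is evaluated at a single depth, whereas a full recursive search would evaluate every depth from 0 to 3 for every block. That difference is the source of the time saving the abstract reports.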
Appears in Collections:	[Graduate Institute of Communication Engineering] Theses & Dissertations

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	116

All items in NCUIR are protected by the original copyright.
