

    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/96267


    Title: CNN-Based Post-Processing for Performance Improvement in Feature Coding for Machines (一種以 CNN 為基礎之後處理器應用於 FCM 性能提升之研究)
    Author: Chen, Po-Yu (陳柏聿)
    Contributor: Department of Communication Engineering
    Keywords: Video Coding for Machines; Feature Coding for Machines; Multi-Channel Post-Processing Compensation System; Deep Learning; Residual Dense Neural Network
    Date: 2024-12-12
    Upload time: 2025-04-09 17:30:56 (UTC+8)
    Publisher: National Central University
    Abstract: As intelligent applications develop rapidly, the traditional approach of relying on customer service center staff for full-time monitoring can no longer keep up with growing demand, and it is gradually being replaced by a new model in which machines make judgments and take action autonomously. However, current compression standards optimized for human visual perception, such as HEVC and VVC, may not meet the needs of machine vision. Because the requirements of machine vision tasks differ from those of human viewers, an efficient compression standard tailored to machine vision is needed. This need has driven the development of Video Coding for Machines (VCM). VCM also shifts the evaluation metric from the traditional peak signal-to-noise ratio (PSNR) to machine vision task accuracy. Within this effort, Feature Coding for Machines (FCM) compresses feature map data instead of raw images to improve compressibility while aiming for higher machine vision task accuracy. One challenge in FCM is that compression distortion can reduce the accuracy of machine vision models; addressing this issue is the focus of this thesis.

    At the 142nd MPEG meeting in April 2023, a Feature Compression Test Model (FCTM) for FCM was proposed and a Call for Proposals was issued, inviting innovative solutions from the community. Against this backdrop, this study proposes a multi-channel post-processing compensator system based on a convolutional neural network (CNN) architecture, designed to recover the accuracy lost to FCM compression. After training, the system is tested, and it raises machine vision task accuracy by compensating for the distorted data.
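The idea of a post-processing compensator can be illustrated with a minimal sketch. It assumes the compensator predicts a residual correction from a decoded feature map and adds it back through a skip connection, which is a common pattern in CNN-based restoration; the abstract does not give the thesis's actual architecture, layer sizes, or the contents of the extra channels. The names `conv3x3` and `compensate`, and the fixed kernel standing in for learned weights, are hypothetical.

```python
# Hypothetical sketch: residual post-processing of ONE decoded feature map.
# The 3x3 kernel is a stand-in for weights a trained CNN would supply;
# it is not taken from the thesis.

def conv3x3(fmap, kernel):
    """Zero-padded 3x3 cross-correlation over a 2D list of floats."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    # Positions outside the map count as zero padding
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy + 1][dx + 1] * fmap[yy][xx]
            out[y][x] = acc
    return out


def compensate(decoded, kernel):
    """Return decoded + predicted residual (global skip connection)."""
    residual = conv3x3(decoded, kernel)
    return [[d + r for d, r in zip(d_row, r_row)]
            for d_row, r_row in zip(decoded, residual)]
```

With an all-zero kernel the predicted residual vanishes and the decoded features pass through unchanged, which is why residual formulations are a convenient starting point: the network only has to learn the correction. Additional input channels, as in the multi-channel system described here, would presumably enter as extra inputs to the convolution.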

    Preliminary results show that a single-channel post-processing compensator already significantly improves machine vision task accuracy after compression, improving BDMOTA by up to 2.94% compared with FCTM v1.0.0. To improve results further, additional features were added as a second and a third channel, strengthening the compensator's ability to correct compression distortion. Experimental results show that, compared with FCTM v1.0.0, the proposed multi-channel post-processing compensator system improves the overall average BDMOTA by up to 4.7%, substantially recovering the machine vision task accuracy lost to FCM compression.
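The abstract reports results as BDMOTA but does not say how the figure is computed. The sketch below assumes it is a Bjøntegaard-delta style measure: the average gap in MOTA (multiple object tracking accuracy) between two rate-accuracy curves over their shared bitrate range. The standard Bjøntegaard method fits a polynomial to each curve; this sketch swaps in piecewise-linear interpolation to stay short. The function names and any data points passed in are hypothetical, not values from the thesis.

```python
import math


def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing.
    Values outside the range clamp to the end points (guards rounding)."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            t = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + t * (ys[i + 1] - ys[i])


def bd_accuracy(anchor, test, steps=1000):
    """Average accuracy gap (test minus anchor) over the shared
    log10-bitrate range. Each curve is a list of (bitrate, accuracy)
    points with distinct, positive bitrates."""
    def prep(curve):
        pts = sorted((math.log10(rate), acc) for rate, acc in curve)
        return [p[0] for p in pts], [p[1] for p in pts]

    ax, ay = prep(anchor)
    tx, ty = prep(test)
    lo, hi = max(ax[0], tx[0]), min(ax[-1], tx[-1])
    if hi <= lo:
        raise ValueError("rate ranges do not overlap")
    gaps = []
    for k in range(steps + 1):
        x = lo + (hi - lo) * k / steps
        gaps.append(interp(x, tx, ty) - interp(x, ax, ay))
    # Trapezoidal rule on a uniform grid, normalised by interval length
    return sum((gaps[k] + gaps[k + 1]) / 2 for k in range(steps)) / steps
```

A positive return value means the tested system is more accurate than the anchor at equal bitrate, averaged over the rates both curves cover.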
    Appears in collections: [Graduate Institute of Communication Engineering] Master's and Doctoral Theses

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    22


    All items in NCUIR are protected by their original copyright.
