    Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/99359


    Title: A Machine Vision Feature Coding System Based on the CSTrack Multi-Object Tracking Framework
    Author: Hsu, Yun-Cheng
    Contributors: Department of Communication Engineering
    Keywords: multi-object tracking; feature compression test model; machine vision; feature coding; MOT; FCTM; VCM; FCM
    Date: 2026-01-22
    Date of upload: 2026-03-06 18:48:01 (UTC+8)
    Publisher: National Central University
    Abstract: In recent years, the demand for machine vision tasks has grown rapidly, shifting the role of video coding from solely enhancing visual presentation quality to simultaneously serving the analytical needs of artificial intelligence models. This has led to the emergence of the Video Coding for Machines (VCM) concept, which aims to establish a unified bitstream that serves both human perception and machine interpretation. Extending from this idea, Feature Coding for Machines (FCM) further emphasizes the compression and transmission of deep neural network–extracted features, enabling reduced bandwidth while maintaining downstream task performance. To facilitate standardization in this field, MPEG introduced the Feature Compression Test Model (FCTM), which provides a complete processing pipeline covering feature extraction, compression, decoding, and task-level evaluation.
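    The FCTM-style pipeline the abstract describes (frontend feature extraction, compression, decoding, task-level evaluation) can be sketched in a few lines. This is a hypothetical illustration only; the function and parameter names below are placeholders, not the actual MPEG FCTM reference-software API.

    ```python
    # Hypothetical sketch of a split feature-coding pipeline in the spirit of
    # FCTM; all names here are illustrative placeholders.
    from typing import Callable, List, Any

    def run_pipeline(frames: List[Any],
                     extract: Callable,   # frontend extractor (e.g. a JDE- or CSTrack-style backbone)
                     encode: Callable,    # feature compression to a reduced-bandwidth bitstream
                     decode: Callable,    # feature reconstruction on the decoder side
                     task: Callable) -> List[Any]:
        """For each frame: extract deep features, compress, decompress,
        then hand the reconstructed features to the downstream task head."""
        results = []
        for frame in frames:
            features = extract(frame)           # features are coded instead of pixels
            bitstream = encode(features)
            reconstructed = decode(bitstream)
            results.append(task(reconstructed)) # e.g. detection + tracking
        return results
    ```

    The key design point reflected here is that only the intermediate features cross the channel, so the frontend extractor (the component this thesis swaps from JDE to CSTrack) determines how discriminative and how compressible the transmitted representation is.
    
    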
    However, the JDE (Joint Detection and Embedding) model commonly used as the frontend in existing FCTM frameworks shares features between detection and re-identification branches. Under high compression conditions, this design tends to suffer from insufficient semantic representation and unstable cross-frame identity association, limiting overall tracking performance. To address this issue, this study replaces JDE with CSTrack as the frontend feature extractor within FCTM. By integrating the Cross-Correlation Network (CCN) and Scale-Aware Attention Network (SAAN) modules, CSTrack employs robust semantic modeling and cross-feature reinforcement designs, producing more discriminative and more compressible representations. Particularly in multi-object tracking (MOT) scenarios, CSTrack effectively reduces ID switches, false negatives, and false positives, thereby improving MOTA performance.
    Experimental results demonstrate that, while preserving the standard FCTM pipeline, substituting JDE with CSTrack as the frontend feature extractor yielded an average MOTA improvement of approximately 7.53% on the HiEve dataset. This confirms CSTrack’s feasibility and advantages as a feature-coding frontend, significantly enhancing tracking robustness under compressed settings.
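    The MOTA figure cited above is the standard CLEAR multi-object-tracking accuracy metric, which penalizes exactly the three error types the abstract says CSTrack reduces. A minimal sketch, assuming the standard definition MOTA = 1 − (FN + FP + IDSW) / GT; the numeric values below are made-up illustrations, not results from the thesis.

    ```python
    # Standard CLEAR MOTA metric; counts are summed over all frames.
    # The example numbers are illustrative placeholders only.

    def mota(fn: int, fp: int, idsw: int, num_gt: int) -> float:
        """MOTA = 1 - (false negatives + false positives + ID switches) / ground-truth objects."""
        return 1.0 - (fn + fp + idsw) / num_gt

    # Fewer FN, FP, and ID switches directly raise MOTA:
    baseline = mota(fn=120, fp=80, idsw=30, num_gt=1000)   # 0.77
    improved = mota(fn=100, fp=60, idsw=10, num_gt=1000)   # 0.83
    ```

    Because all three error counts sit in the same numerator, a frontend whose features survive compression with less identity confusion improves MOTA on every axis at once.
    
    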
    Appears in Collections: [Graduate Institute of Communication Engineering] Master's and Doctoral Theses



