

Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98174


Title: Fed-HMF: A Federated Learning Framework with Hierarchical Multi-scale Fusion for Multi-modal Vehicular Recognition
Authors: Wu, Xuan-Yu (吳軒宇)
Contributors: Department of Communication Engineering
Keywords: Multi-modal Fusion;Camera Image;LiDAR Point Cloud;Non-IID Data;Federated Learning;Object Recognition
Date: 2025-09-22
Upload time: 2025-10-17 12:27:17 (UTC+8)
Publisher: National Central University
Abstract: In autonomous driving, vehicles must accurately perceive multiple targets in
    real time, but centralizing the vast sensor data from many cars raises privacy
    and security concerns. This work proposes a privacy-preserving multi-modal
    3D object classification framework that fuses camera images and LiDAR
    point clouds and is trained via Federated Learning (FL). The model features
an image branch based on Multiscale Vision Transformers (MViTv2), enhanced
with a token selection mechanism for efficiency and a Bi-directional
Feature Pyramid Network (BiFPN) for multi-scale feature fusion. A LiDAR
branch built on PointNet extracts 3D geometric features from point clouds.
The fused features are fed to a classifier head to output object category
predictions. Model training is conducted with a federated averaging algorithm
    across multiple vehicles, so that raw sensor data never leaves the vehicle,
addressing data privacy. Extensive experiments demonstrate that combining
modalities yields superior accuracy over single-modality models, and our
architectural innovations improve performance while reducing computation.
We analyze the trade-offs introduced by federated training versus centralized
training, and how different data distributions (IID vs. non-IID) affect
convergence. Notably, we observe an unexpected benefit under moderately
skewed non-IID data, where the global model outperforms the IID case due to
a regularizing specialization effect. We also highlight the importance of
evaluating with multiple metrics (accuracy and macro F1) in imbalanced
scenarios, as accuracy alone can be misleading. This work provides a
comprehensive solution for multi-target perception that is both effective
and privacy-preserving, laying the groundwork for future research in fair
and robust federated autonomous driving models.
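The token-selection step described in the abstract can be sketched as a simple top-k filter over per-token importance scores. This is an illustrative sketch only, not the thesis implementation: the token representation, the scores, and the choice of k are assumptions.

```python
# Illustrative top-k token selection, as might be applied to prune image
# tokens before the transformer blocks. The scoring function and k are
# assumptions; the original work's mechanism may differ in detail.

def select_tokens(tokens, scores, k):
    """Keep the k tokens with the highest importance scores, preserving order."""
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    keep = sorted(top)  # restore original spatial order
    return [tokens[i] for i in keep]

toks = ["t0", "t1", "t2", "t3"]
print(select_tokens(toks, [0.1, 0.9, 0.4, 0.8], 2))  # ['t1', 't3']
```

Dropping low-score tokens before the attention layers is what buys the efficiency gain: attention cost grows with the square of the token count.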
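The federated averaging step the abstract relies on aggregates locally trained models into a global one by a sample-count-weighted mean, so raw sensor data never leaves the vehicles. A minimal sketch, assuming a flat list-of-floats model representation (the real models are of course structured networks):

```python
# Minimal sketch of Federated Averaging (FedAvg). The flat list-of-floats
# model representation and the client sizes below are illustrative
# assumptions, not the thesis code.

def fedavg(client_weights, client_sizes):
    """Aggregate client models by a sample-count-weighted average.

    client_weights: one flat list of floats per client (local model).
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    global_weights = [0.0] * len(client_weights[0])
    for weights, size in zip(client_weights, client_sizes):
        for i, w in enumerate(weights):
            global_weights[i] += (size / total) * w
    return global_weights

# Two vehicles with unequal local data: the larger client dominates.
print(fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

Under non-IID distributions the client updates pull in different directions, which is the source of the convergence trade-offs the abstract analyzes.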
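The abstract's point about imbalanced evaluation can be made concrete: a classifier that always predicts the majority class scores high accuracy yet a poor macro F1, because macro F1 averages per-class F1 scores with equal weight. A pure-Python sketch with illustrative labels:

```python
# Accuracy vs. macro F1 under class imbalance. The "car"/"pedestrian"
# labels are illustrative, not taken from the thesis dataset.

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    classes = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

# 9 cars, 1 pedestrian; always predicting "car" looks accurate but
# ignores the minority class entirely.
y_true = ["car"] * 9 + ["pedestrian"]
y_pred = ["car"] * 10
print(accuracy(y_true, y_pred))              # 0.9
print(round(macro_f1(y_true, y_pred), 3))    # 0.474
```

The minority class contributes an F1 of zero, dragging the macro average down to about 0.47 even though accuracy reads 90% — which is why accuracy alone can mislead.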
Appears in Collections: [Graduate Institute of Communication Engineering] Master's and Doctoral Theses

Files in this item:

index.html (0 Kb, HTML)


All items in NCUIR are protected by original copyright.
