Blockchain-Driven Secure MARL for IoT: A Framework with On-Chain Anomaly Detection and Token Economics

Please use this permanent URL to cite or link to this document: https://ir.lib.ncu.edu.tw/handle/987654321/98140

Title:	Blockchain-Driven Secure MARL for IoT: A Framework with On-Chain Anomaly Detection and Token Economics
Author:	Lee, Hsin-Chieh (李昕潔)
Contributor:	Department of Communication Engineering
Keywords:	Blockchain;MARL;IoT;Token Economics;Malicious Detection;Smart Contracts
Date:	2025-08-16
Upload time:	2025-10-17 12:24:32 (UTC+8)
Publisher:	National Central University
Abstract:	The proliferation of the Internet of Things (IoT) drives applications that require multi-agent collaboration in dynamic environments, where ensuring performance, security, and reliability is a significant challenge. Multi-Agent Reinforcement Learning (MARL) is essential for optimizing decision-making in such settings, but it is vulnerable to strategic or malicious behaviors that can undermine trust and degrade performance. This work proposes a Blockchain-enabled Centralized Training and Decentralized Execution (BE-CTDE) framework tailored for IoT. The framework uses blockchain as a trusted training management layer to enhance decision transparency and collaborative efficiency. We introduce on-chain residual-based malicious behavior detection, which filters malicious agents during training to improve MARL stability and system fault tolerance. Furthermore, we design a token-economy incentive mechanism that combines punishment and compensation to promote honest participation and improve agent learning performance. To support real-world deployment, we established a private Ethereum environment that implements data submission, malicious behavior detection, and reward allocation. In a multi-agent pathfinding (MAPF) task with 32 agents, a 37.5% malicious agent ratio, and 15% static obstacle density, our framework reduces the collision rate by 46.22% compared to the baseline SCRIMP, which lacks security mechanisms. At the same time, our method maintains a 93% success rate, demonstrating its effectiveness and reliability in challenging IoT environments.
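The abstract names residual-based detection and a punishment-plus-compensation token mechanism but does not give their details. The following Python sketch only illustrates the general idea and is not the thesis's implementation: the function names (`flag_by_residual`, `settle_tokens`), the median/MAD residual rule, the threshold of 3, and the fixed penalty are all assumptions made for this example, and the logic runs off-chain rather than in a smart contract.

```python
from statistics import median


def flag_by_residual(reports, threshold=3.0):
    """Flag agents whose report deviates strongly from the group median.

    reports: dict mapping a hypothetical agent id to a submitted scalar.
    The residual is |value - median|; it is scaled by the median absolute
    deviation (MAD), and agents above `threshold` are flagged.
    """
    center = median(reports.values())
    residuals = {agent: abs(value - center) for agent, value in reports.items()}
    mad = median(residuals.values())
    scale = mad if mad > 0 else 1e-9  # avoid division by zero when reports agree
    return {agent for agent, r in residuals.items() if r / scale > threshold}


def settle_tokens(balances, flagged, penalty=10):
    """Fine flagged agents and split the collected pool among the others.

    This is an assumed punishment/compensation rule; the total token
    supply is conserved.
    """
    updated = dict(balances)
    pool = 0
    for agent in flagged:
        fine = min(penalty, updated[agent])  # never push a balance below zero
        updated[agent] -= fine
        pool += fine
    honest = [agent for agent in updated if agent not in flagged]
    if honest:
        share = pool / len(honest)
        for agent in honest:
            updated[agent] += share
    return updated


if __name__ == "__main__":
    reports = {"a0": 1.0, "a1": 1.1, "a2": 0.9, "a3": 5.0}
    balances = {agent: 100 for agent in reports}
    flagged = flag_by_residual(reports)
    print(flagged)
    print(settle_tokens(balances, flagged))
```

A median-based residual is used here because the median tolerates a minority of outlying reports, which matches the abstract's setting where fewer than half of the agents are malicious; the thesis may define its residual differently.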
Appears in Collections:	[Graduate Institute of Communication Engineering] Theses and Dissertations

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	80

All items in NCUIR are protected by original copyright.
