Amid the rapid growth of the Internet of Things (IoT), vehicle-to-everything (V2X) communication, and edge computing, vehicles of all kinds can be regarded as mobile smart agents that share communication, storage, and computing resources, forming a vehicle-cloud network that jointly delivers applications. Multi-agent reinforcement learning (MARL) is considered one of the learning frameworks capable of finding good solutions in the uncertain and inherently non-stationary V2X environment: through cooperation among vehicles, network nodes can learn new decision-making policies and jointly improve the performance of the multi-agent system (MAS), so as to cope with a continuously changing environment. Under this framework, the interactions among smart agents over resource usage can be modeled as an MAS connected by the wireless network; combining its distributed decision-making with edge computing is a promising way to handle the highly interactive nature of V2X, yet complete research results on this topic are still lacking and it deserves in-depth study. We plan to begin with a scalable MAS architecture, and then investigate how transfer learning can address high mobility and how partially observable Markov decision process (POMDP) models can increase the benefit of partial information sharing among vehicles, with V2X streaming and mission-critical applications as the target use cases. Building on the simulation environment and MARL experience accumulated in our ongoing project, we will evaluate the effectiveness of deep reinforcement learning and release the source code for verification. The innovation of this project lies in applying an in-depth understanding of multi-agent deep reinforcement learning to propose leading yet feasible V2X resource management methods, which will also serve as an important reference for further research on V2X services.
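
To make the MARL-based resource-allocation idea concrete, the following is a minimal illustrative sketch and not part of the proposal's actual design: it assumes a hypothetical toy setting in which a few vehicle agents independently learn, via tabular Q-learning, which shared radio resource block to occupy, and a collision earns no reward. All names, parameter values, and the toy channel model are assumptions introduced here for illustration only.

    # Minimal illustrative sketch (hypothetical toy setting, not the proposal's method):
    # N vehicle agents independently learn which of K shared resource blocks to use;
    # a collision (two agents picking the same block) yields zero reward.
    import numpy as np

    N_AGENTS, N_BLOCKS = 4, 4      # toy numbers of vehicles and radio resource blocks
    ALPHA, EPS = 0.1, 0.1          # learning rate and exploration rate (assumed values)
    rng = np.random.default_rng(0)

    # One stateless Q-table per agent: Q[agent, block] = expected reward of that block.
    Q = np.zeros((N_AGENTS, N_BLOCKS))

    def choose_actions(Q):
        """Epsilon-greedy block selection for every agent."""
        greedy = Q.argmax(axis=1)
        explore = rng.integers(N_BLOCKS, size=N_AGENTS)
        mask = rng.random(N_AGENTS) < EPS
        return np.where(mask, explore, greedy)

    def step(actions):
        """Reward 1 for an uncontended block, 0 on collision (toy channel model)."""
        counts = np.bincount(actions, minlength=N_BLOCKS)
        return (counts[actions] == 1).astype(float)

    for episode in range(5000):
        a = choose_actions(Q)
        r = step(a)
        # Stateless Q-learning update (no next state in this toy setting).
        idx = np.arange(N_AGENTS)
        Q[idx, a] += ALPHA * (r - Q[idx, a])

    print("Learned block preference per agent:", Q.argmax(axis=1))

In such a toy setting the agents typically settle on an orthogonal block assignment; the studies planned in this proposal would instead rely on deep MARL, transfer learning, and POMDP modeling within a realistic V2X simulation environment.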