Graduate Thesis 111523037: Detailed Record




Author: Zong-You Yang (楊宗祐)    Department: Communication Engineering
Thesis Title: Using Multi-Agent Reinforcement Learning and Game Theory to Minimize the Task Cost in Vehicular Networks
(Original title: 車載網路中基於多代理人強化學習和賽局理論之最小化任務成本的方法)
Related Theses
★ A Feature-Similarity-Based Search Method for Unstructured Peer-to-Peer Networks
★ A Mobile Peer-to-Peer Network Topology Based on Hierarchical Cluster Reputation
★ Design and Implementation of a Topic-Based Event Detection Mechanism for Online RSS News Streams
★ A Density-Aware Routing Method for Delay-Tolerant Networks
★ A Home Multimedia Gateway Integrating P2P and UPnP Content-Sharing Services: Design and Implementation
★ Design and Implementation of a Simple Seamless Streaming-Media Playback Service for Home Networks
★ Message Delivery Time Analysis and High-Performance Routing Algorithm Design for Delay-Tolerant Networks
★ An Adjustable Allocation Method and Performance Measurement of Downloader-Side Network Resources in BitTorrent P2P File Systems
★ A Data Dissemination Mechanism Based on Message Coding and Reassembly Conditions in Delay-Tolerant Networks
★ A Routing Mechanism Based on Human Mobility Patterns in Delay-Tolerant Networks
★ A Packet Delivery Mechanism Using Data Aggregation to Improve Transmission Performance in Vehicular Networks
★ A Vehicle Clustering Method for Intersection Environments
★ A Roadside-Unit-Assisted Message Broadcasting Mechanism for Vehicular Networks
★ Optimizing Message Delivery Performance with Static Relay Nodes (Throwboxes) in Delay-Tolerant Networks
★ A Message Delivery Mechanism Built on Dynamic Cluster Awareness in Delay-Tolerant Networks
★ Design and Implementation of a Cross-Device Multimedia Convergence Platform
Files: Full text available for browsing in the system after 2025-08-19.
Abstract
With the rapid advancement of autonomous driving technology, vehicular sensors can perceive a vehicle's surroundings with increasing accuracy. This capability, however, generates large volumes of sensor data, confronting autonomous driving systems with increased latency and excessive energy consumption caused by their limited on-board computational capacity. To relieve this burden, offloading computational tasks to external resources, through collaboration with cloud servers, edge computing nodes, or nearby vehicular devices, has emerged as a viable solution.

This thesis proposes a vehicular computation-task offloading strategy grounded in game theory and deep reinforcement learning. First, we design a task cost function that jointly accounts for latency, power consumption, and the cost of renting computational resources; it is used both to formulate offloading strategies and to compare the merits of candidate offloading decisions. Next, we model the competition among vehicles for computational resources as a game and prove that this game admits a Nash equilibrium.
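A minimal sketch of how such an argument is typically set up follows. The weighted cost below and the potential function Φ are illustrative assumptions (the symbols w_1..w_3, T_i, E_i, and p_i are not taken from the thesis); the closing fact is standard: every finite exact potential game has at least one pure-strategy Nash equilibrium (Monderer and Shapley, 1996).

```latex
% Illustrative per-vehicle task cost: a weighted sum of delay, energy,
% and resource-rental price for offloading decision a_i, given the other
% vehicles' decisions a_{-i}. All symbols here are assumptions.
\[
  C_i(a_i, a_{-i}) \;=\; w_1\, T_i(a_i, a_{-i}) \;+\; w_2\, E_i(a_i, a_{-i}) \;+\; w_3\, p_i(a_i)
\]
% Exact-potential condition: if a single function \Phi tracks every
% unilateral deviation a_i \to a_i' of every vehicle i, i.e.
\[
  C_i(a_i', a_{-i}) - C_i(a_i, a_{-i}) \;=\; \Phi(a_i', a_{-i}) - \Phi(a_i, a_{-i}),
\]
% then the game is an exact potential game. Every finite exact potential
% game possesses a pure-strategy Nash equilibrium: best-response updates
% strictly decrease \Phi, so they terminate at such an equilibrium.
```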
Finally, we integrate the game into a multi-agent reinforcement learning problem and train a model under the MATD3 architecture to find the optimal equilibrium strategy of the game. The resulting method selects the best computation-offloading decision for the current network conditions and task characteristics; while meeting each task's delay-tolerance requirement, it significantly raises the task completion rate and lowers the task completion cost. Experimental results show that, compared with prior work, the proposed method evaluates candidate offloading decisions more accurately and derives better offloading strategies, thereby reducing task cost, improving the task completion rate, and offering a more flexible and efficient solution for applications such as autonomous driving.
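The abstract names MATD3, a multi-agent variant of TD3 that trains each agent with a pair of centralized critics (Ackermann et al., 2019). As a rough illustration of its core mechanism, bootstrapping from the minimum of two target critics to curb Q-value overestimation, here is a minimal PyTorch sketch; every name, dimension, and hyperparameter in it is an assumption for exposition, not the thesis's actual configuration.

```python
# Hypothetical sketch of an MATD3-style centralized critic and its
# twin-critic TD target; not the thesis's implementation.
import torch
import torch.nn as nn

class CentralizedCritic(nn.Module):
    """Q(s, a_1..a_N): conditions on the joint observation and joint action."""
    def __init__(self, joint_obs_dim: int, joint_act_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_obs_dim + joint_act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor) -> torch.Tensor:
        # Centralized training: concatenate all agents' observations and actions.
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))

def twin_critic_target(reward, done, next_joint_obs, next_joint_act,
                       q1_target, q2_target,
                       gamma=0.99, noise_std=0.2, noise_clip=0.5):
    """TD3-style target: smooth the target action with clipped Gaussian
    noise, then bootstrap from the minimum of the two target critics."""
    noise = (torch.randn_like(next_joint_act) * noise_std).clamp(-noise_clip, noise_clip)
    smoothed_act = (next_joint_act + noise).clamp(-1.0, 1.0)
    q_min = torch.min(q1_target(next_joint_obs, smoothed_act),
                      q2_target(next_joint_obs, smoothed_act))
    # Terminal transitions (done == 1) cut off the bootstrap term.
    return reward + gamma * (1.0 - done) * q_min
```

Both online critics would then regress toward this shared target, while each agent's actor would be updated through only the first critic, as in single-agent TD3.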
Keywords
★ Autonomous Driving
★ Vehicular Networks
★ Computing Offloading
★ Collaborative Computing
★ Game Theory
★ Multi-Agent Reinforcement Learning
Table of Contents
Abstract (Chinese)
Abstract (English)
List of Figures
List of Tables
1 Introduction
2 Background and Related Work
2.1 Collaborative Computing in Vehicular Networks
2.1.1 Computation Offloading
2.1.2 Collaborative Computing
2.2 Game Theory
2.2.1 Introduction to Game Theory
2.2.2 Game Theory in Vehicular Networks
2.3 Reinforcement Learning
2.3.1 Introduction to Reinforcement Learning
2.3.2 Reinforcement Learning in Vehicular Networks
3 Methodology
3.1 System Architecture
3.2 Environment and Problem Definition
3.2.1 Local Computing
3.2.2 Remote Computing
3.2.3 Problem Statement
3.3 Game Theory
3.3.1 Game Formulation
3.3.2 Nash Equilibrium
3.3.3 Potential Game
3.4 Reinforcement Learning
3.4.1 Single-Agent Reinforcement Learning
3.4.2 Multi-Agent Reinforcement Learning
3.5 Algorithms
4 Experiments and Result Analysis
4.1 Experiment Design
4.2 Experimental Environment Setup
4.3 SUMO Simulation Environment
4.4 Experimental Parameter Design
4.4.1 Network Environment Parameters
4.4.2 Reinforcement Learning Model Parameters
4.5 Comparison Baselines
4.6 Effects of Hyperparameter Tuning
4.6.1 Effect of the Learning Rate (α)
4.6.2 Effect of the Discount Factor (γ)
4.7 Experimental Evaluation
4.7.1 Effect of Different Cost Weights on Task Cost
4.7.2 Game Theory
4.7.3 Nash Equilibrium
4.7.4 Comparison of Reinforcement Learning Algorithms
4.7.5 Comparison of Task Costs
5 Conclusion and Future Work
References
Advisor: Chih-Lin Hu (胡誌麟)    Date of Approval: 2024-08-20
