Using Stackelberg Game and Multi-Agent Reinforcement Learning to Self-Organize Relaying Groups for Real-Time Video Sharing in Vehicular Networks


Name

Yu-Cheng Chu (朱育成)

Department

Department of Communication Engineering

Thesis title

Using Stackelberg Game and Multi-Agent Reinforcement Learning to Self-Organize Relaying Groups for Real-Time Video Sharing in Vehicular Networks
(Original title: 車載網路下基於 Stackelberg 賽局和多代理人強化學習之中繼傳輸群組建立及即時影像分享)



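This electronic thesis is authorized for immediate open access.
Once open, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.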

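Abstract (translated from Chinese)

Rapid urbanization has made traffic conditions increasingly unstable. If vehicles behind cannot learn the road conditions ahead, an accident or abnormal situation in front leaves them too little time to react, leading to rear-end collisions, serious traffic accidents, and safety problems. Collaborative sharing of camera views among vehicles is therefore becoming an important topic. With the rapid development of 5G and artificial intelligence, wireless communication lets in-vehicle devices communicate quickly, and the collected data can be analyzed and put to use in deployment. In light of this, this study first uses a vehicular traffic simulator to model a realistic environment, then uses game theory to describe and define the vehicular environment in detail, and integrates that description into a multi-agent reinforcement learning model, adopting the MADDPG model to select the best transmission path with the lowest latency and the highest data transmission rate. Finally, the vehicles are organized into a self-organizing network to share video. For the analysis, this study evaluates and compares different vehicular message transmission methods and different maximum numbers of relay hops between vehicles, and it also compares multi-agent with single-agent reinforcement learning. The experimental results show that deploying multi-agent reinforcement learning makes vehicular message delivery more effective and achieves higher performance. The study closes by evaluating the different models on three metrics: transmission latency, data transmission rate, and power consumption.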

Abstract (English)

Due to rapid urbanization, traffic conditions have become increasingly unpredictable. In dense traffic, vehicles in the rear are unaware of the road conditions ahead. Accidents or abnormal situations that occur in front can lead to delayed reactions and rear-end collisions, which result in severe traffic accidents and safety concerns. Collaborative sharing of visual information among vehicles therefore becomes an important issue. With the rapid development of 5G and artificial intelligence, not only can wireless communications be used for fast data transmission between in-vehicle devices, but the collected data can also be analyzed and put to use in deployment. Hence, this thesis first uses a vehicular traffic simulator to model real-world environments. Subsequently, game theory is employed to provide a detailed description and definition of the vehicular environment. Both efforts are then integrated into a multi-agent reinforcement learning model using the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) approach. The objective is to select the optimal transmission path with the lowest latency and the highest data transmission rate, thereby enabling vehicles to form a self-organizing network for video transmission and sharing. This study evaluates and compares different vehicular information transmission methods and different maximum numbers of relay hops between vehicles. In addition, it compares multi-agent and single-agent reinforcement learning approaches. Experimental results demonstrate that deploying multi-agent reinforcement learning yields better performance and higher efficiency in vehicular message transmission. Finally, this study evaluates and analyzes the different models on three metrics: transmission latency, data transmission rate, and power consumption.
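
The path-selection objective in the abstract (lowest latency, highest data transmission rate, bounded number of relay hops) can be illustrated with a small brute-force baseline. This is only a sketch under stated assumptions and is not the thesis's MADDPG method: the link table `LINKS`, the node names, the `weight` parameter, and the normalized scoring rule are all hypothetical and chosen purely for illustration.

```python
from itertools import permutations

# Hypothetical link table: (sender, receiver) -> (latency in ms, rate in Mbit/s).
# All values are invented for illustration; none come from the thesis.
LINKS = {
    ("src", "a"): (5.0, 40.0),
    ("src", "b"): (8.0, 60.0),
    ("a", "dst"): (6.0, 30.0),
    ("b", "dst"): (4.0, 50.0),
    ("a", "b"): (3.0, 80.0),
    ("src", "dst"): (30.0, 10.0),
}


def path_metrics(path):
    """Return (total latency, bottleneck rate), or None if a link is missing."""
    hops = list(zip(path, path[1:]))
    if any(hop not in LINKS for hop in hops):
        return None
    latency = sum(LINKS[hop][0] for hop in hops)
    rate = min(LINKS[hop][1] for hop in hops)
    return latency, rate


def candidate_paths(source, target, relays, max_relays):
    """Enumerate every simple path that uses at most max_relays relay vehicles."""
    for count in range(max_relays + 1):
        for chosen in permutations(relays, count):
            yield (source, *chosen, target)


def best_path(source, target, relays, max_relays, weight=0.5):
    """Pick the feasible path with the best weighted, normalized rate/latency score."""
    feasible = []
    for path in candidate_paths(source, target, relays, max_relays):
        metrics = path_metrics(path)
        if metrics is not None:
            feasible.append((path, *metrics))
    if not feasible:
        return None
    max_latency = max(item[1] for item in feasible)
    max_rate = max(item[2] for item in feasible)

    def score(item):
        # Assumed scoring rule: reward normalized rate, penalize normalized latency.
        _, latency, rate = item
        return weight * rate / max_rate - (1 - weight) * latency / max_latency

    return max(feasible, key=score)


if __name__ == "__main__":
    print(best_path("src", "dst", ["a", "b"], max_relays=2))
    print(best_path("src", "dst", ["a", "b"], max_relays=0))
```

Exhaustive enumeration is only practical for a handful of vehicles, which is one reason a learned policy is attractive for this problem; a baseline like this is mainly useful as a sanity check on very small topologies.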

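Keywords (Chinese)

★ Simulation environment modeling (仿真環境模擬)
★ Edge computing (邊緣計算)
★ Internet of Vehicles (車聯網)
★ Game theory (賽局理論)
★ Multi-agent reinforcement learning (多代理人強化學習)
★ Vehicular self-organizing network (車載自組織網路)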

Keywords (English)

★ Simulation environment modeling
★ Edge computing
★ Vehicular networks
★ Game theory
★ Multi-agent reinforcement learning
★ Vehicular self-organizing network

Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgments
List of Figures
List of Tables
1 Introduction
2 Background and Literature Review
2.1 Video Bitrate Adaptation and Transmission
2.1.1 Adaptive Bitrate Adjustment
2.1.2 Video Streaming
2.2 Game Theory
2.3 Multi-Agent Reinforcement Learning
2.3.1 Machine Learning Background
2.3.2 Reinforcement Learning
2.3.3 Multi-Agent Reinforcement Learning
2.4 Edge Computing
2.5 Vehicular Self-Organizing Networks
3 Methodology
3.1 System Architecture
3.1.1 Vehicular Environment
3.1.2 Data Reception Rate Ratio
3.1.3 Transmission Power Consumption
3.1.4 Data Normalization
3.1.5 System Workflow
3.2 Game Theory
3.2.1 Leader Utility Function
3.2.2 Follower Utility Function
3.3 Reinforcement Learning
3.3.1 Single-Agent Reinforcement Learning
3.3.2 Multi-Agent Reinforcement Learning
3.4 Self-Organizing Hop Network
3.5 Algorithm
4 Experiments and Result Analysis
4.1 Experimental Environment
4.1.1 Parameter Table
4.1.2 Model Parameter Design
4.2 Introduction to the SUMO Simulation Environment
4.2.1 Highway Model
4.2.2 Urban Model
4.3 Effect of Hyperparameter Tuning
4.3.1 Effect of the Learning Rate (α)
4.3.2 Effect of the Discount Rate (γ)
4.4 Simulation Evaluation
4.4.1 Comparison of Vehicular Transmission Methods
4.4.2 Single-Hop Environment Model Evaluation
4.4.3 K-Hop Limit Calculation
4.4.4 Multi-Hop Environment Model Evaluation
4.5 Experimental Results
4.5.1 Data Transmission Rate Comparison
4.5.2 Transmission Latency Comparison
4.5.3 Power Consumption Comparison
5 Conclusion and Future Work
References


Advisor

Chih-Lin Hu (胡誌麟)

Approval date

2023-8-14
