Thesis Record 111322074
Author: I-Fan Wang (王亦凡)    Graduate Department: Civil Engineering
Title: Research on Deep Reinforcement Learning for Adaptive Traffic Signal Control
(深度強化學習於適應性號誌控制之研究)
Related Theses
★ Vehicle Routing for Books Transferred under the Library Intercampus Loan System
★ Freeway Ramp Metering under Origin-Destination Travel Time Objectives
★ Combining Constraint Programming and Ant Colony Optimization to Solve Sports Scheduling Problems
★ Applying Metafrontier Data Envelopment Analysis to the Transportation Industry: Operating Efficiency of Domestic Airline Routes
★ Perceived Service Quality, Perceived Value, Satisfaction, and Behavioral Intention of Bus Passengers in Taipei and New Taipei City: Cross-Level Mediation and Moderated Mediation Effects between Routes and Passengers
★ Investigating the influential factors of public bicycle system and cyclist heterogeneity
★ A Mixed Integer Programming Formulation for the Three-Dimensional Unit Load Device Packing Problem
★ Freeway Travel Time Prediction: An Application of Functional Data Analysis
★ Behavior Intention and its Influential Factors for Motorcycle Express Service
★ Inferring transportation modes (bus or vehicle) from mobile phone data using support vector machine and deep neural network
★ Applying Mixed Logit Models to Mode Choice: From National Central University to Taoyuan HSR Station
★ Preprocessing of mobile phone signal data for vehicle mode identification using map-matching technique
★ A Dynamic User Equilibrium Model with Side Constraints
★ Dynamic Origin-Destination Trip Matrix Estimation Models
★ Solution Algorithms for Dynamic Signal Timing Control Models
★ Dynamic User Equilibrium Route Choice Models under Different Decision Variables
  1. The author has agreed to make this electronic thesis openly available immediately.
  2. The open-access full text is licensed for personal, non-profit search, reading, and printing for academic research purposes only.
  3. Please observe the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the content without authorization.

Abstract: This study explores the application of deep reinforcement learning to adaptive traffic signal control. Using the microscopic traffic simulation software Vissim, we simulate peak-hour traffic at an intersection in Taipei City. Accounting for the passenger car equivalents of different vehicle types and the two-stage left-turn design for motorcycles, we construct an adaptive signal control system based on a deep reinforcement learning algorithm to improve current peak-hour conditions at urban intersections.
The framework employs the Rainbow DQN deep reinforcement learning network as the decision model of the signal control system. The state captures movement-based traffic flow conditions and the current phase status; the actions are phase-sequence switching and green-time extension; and the reward is designed to minimize total intersection pressure. The system's performance is compared against a fixed-time signal baseline.
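To make the reward design concrete, the sketch below computes a pressure-based reward in the max-pressure style common in the signal control literature (e.g., PressLight). This is an illustration under stated assumptions, not the thesis's code: the exact pressure definition, the PCE weighting, and every identifier in the snippet are hypothetical.

```python
# Minimal sketch of a pressure-based reward, assuming a max-pressure-style
# definition; the thesis's exact formulation and PCE weighting are not
# reproduced on this record page, so all names here are illustrative.
from dataclasses import dataclass

@dataclass
class Movement:
    """One signalized movement, e.g., the northbound left turn."""
    upstream_queue: float    # queued vehicles entering, PCE-weighted (assumed)
    downstream_queue: float  # queued vehicles on the exit link, PCE-weighted

def intersection_pressure(movements: list[Movement]) -> float:
    # Total pressure: absolute value of the summed per-movement
    # pressures (incoming minus outgoing queue).
    return abs(sum(m.upstream_queue - m.downstream_queue for m in movements))

def reward(movements: list[Movement]) -> float:
    # The agent maximizes reward, so negating the pressure turns
    # "minimize total intersection pressure" into the learning objective.
    return -intersection_pressure(movements)

# Example: two movements with heavier inbound queues give a negative reward.
demo = [Movement(12.0, 3.0), Movement(7.5, 1.0)]
print(reward(demo))  # -15.5
```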
For the experiments, the morning and evening peaks are each split into three time-period scenarios for training. Results show that deep reinforcement learning for adaptive signal control effectively reduces queue lengths at the intersection: the model converges within 100 episodes in every scenario and improves performance by 50% during the morning peak. The model also adapts to the differing traffic volumes of the urban peak periods studied, and the flexible state, action, and reward designs allow it to generalize to other scenarios.
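Training couples the agent to the Vissim microsimulation episode by episode. The skeleton below suggests how such a loop could look using Vissim's COM automation interface from Python; the Vissim COM calls used (Dispatch("Vissim.Vissim"), LoadNet, Simulation.RunSingleStep) exist in that API, while the network path, the RandomAgent stand-in, and the state stub are hypothetical placeholders for the thesis's Rainbow DQN implementation.

```python
# Skeletal episode loop driving PTV Vissim through its COM interface
# (requires Vissim and pywin32); agent and state handling are stubs.
import random
import win32com.client as com

class RandomAgent:
    """Stand-in for the thesis's Rainbow DQN agent; chooses between the
    two action types named in the abstract."""
    ACTIONS = ["switch_phase_sequence", "extend_green"]

    def act(self, state):
        return random.choice(self.ACTIONS)

def get_state(vissim):
    # Stub: the thesis's state combines movement-based flow measures and
    # the current phase; a real version would read detectors via COM.
    return None

vissim = com.Dispatch("Vissim.Vissim")       # start or attach to Vissim
vissim.LoadNet(r"C:\nets\taipei_peak.inpx")  # hypothetical network file

agent = RandomAgent()
for episode in range(100):       # the study reports convergence within 100 episodes
    for step in range(3600):     # e.g., one simulated hour at one step per second
        action = agent.act(get_state(vissim))
        # Apply `action` to the signal controller here (e.g., by setting
        # signal-group attributes through the COM object), then advance:
        vissim.Simulation.RunSingleStep()
```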
Keywords ★ adaptive signal control
★ deep reinforcement learning
★ Rainbow DQN
★ traffic simulation
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 Applications of Reinforcement Learning in Signal Control
2.2 Neural Network Architecture Design
2.3 Reinforcement Learning Mechanisms
Chapter 3 Methodology
3.1 Deep Reinforcement Learning Algorithms
3.2 Value-Based Rainbow DQN
3.2.1 Double DQN
3.2.2 Prioritized Experience Replay
3.2.3 Dueling Network
3.2.4 Distributional DQN
3.2.5 Noisy Net
3.2.6 n-Step Learning
Chapter 4 Model and Experimental Design
4.1 Reinforcement Learning Model Design
4.1.1 Agent Design
4.1.2 Neural Network Architecture
4.1.3 Training Procedure
4.2 Study Scope
4.2.1 Data
4.2.2 Simulation Software
4.3 Experimental Design
4.4 Signal Control in the Simulation Scenario
Chapter 5 Experimental Training Results
5.1 Training Performance
5.1.1 Queue Length and Stopped Delay
5.1.2 Vehicle Count Analysis
5.1.3 Loss Analysis
5.2 Comparison of Passenger Car Equivalent Settings
Chapter 6 Conclusions and Suggestions
6.1 Conclusions
6.2 Suggestions
Chapter 7 References
Appendix
Advisor: Huey-Kuo Chen (陳惠國)    Approval Date: 2024-08-15