References
[1] Hanbyul Seo, Ki-Dong Lee, Shinpei Yasukawa, Ying Peng, and Philippe Sartori. LTE evolution for vehicle-to-everything services. IEEE Communications Magazine, 54(6):22–28, 2016.
[2] Zhipeng Liu, Yinhui Han, Jianwei Fan, Lin Zhang, and Yunzhi Lin. Joint optimization of spectrum and energy efficiency considering the C-V2X security: A deep reinforcement learning approach. arXiv preprint arXiv:2003.10620, 2020.
[3] Le Liang, Shijie Xie, Geoffrey Ye Li, Zhi Ding, and Xingxing Yu. Graph-based resource sharing in vehicular communication. IEEE Transactions on Wireless Communications, 17(7):4579–4592, 2018.
[4] Le Liang, Joonbeom Kim, Satish C Jha, Kathiravetpillai Sivanesan, and Geoffrey Ye Li. Spectrum and power allocation for vehicular communications with delayed CSI feedback. IEEE Wireless Communications Letters, 6(4):458–461, 2017.
[5] Muhammad Ikram Ashraf, Mehdi Bennis, Cristina Perfecto, and Walid Saad. Dynamic proximity-aware resource allocation in vehicle-to-vehicle (V2V) communications. In 2016 IEEE Globecom Workshops (GC Wkshps), pages 1–6. IEEE, 2016.
[6] Bo Bai, Wei Chen, Khaled Ben Letaief, and Zhigang Cao. Low complexity outage optimal distributed channel allocation for vehicle-to-vehicle communications. IEEE Journal on Selected Areas in Communications, 29(1):161–172, 2010.
[7] Hao Ye and Geoffrey Ye Li. Deep reinforcement learning based distributed resource allocation for V2V broadcasting. In 2018 14th International Wireless Communications & Mobile Computing Conference (IWCMC), pages 440–445. IEEE, 2018.
[8] Le Liang, Hao Ye, and Geoffrey Ye Li. Toward intelligent vehicular networks: A machine learning framework. IEEE Internet of Things Journal, 6(1):124–135, 2018.
[9] Hao Ye, Geoffrey Ye Li, and Biing-Hwang Fred Juang. Deep reinforcement learning based resource allocation for V2V communications. IEEE Transactions on Vehicular Technology, 68(4):3163–3173, 2019.
[10] Liang Wang, Hao Ye, Le Liang, and Geoffrey Ye Li. Learn to compress CSI and allocate resources in vehicular networks. IEEE Transactions on Communications, 2020.
[11] Helin Yang, Xianzhong Xie, and Michel Kadoch. Intelligent resource management based on reinforcement learning for ultra-reliable and low-latency IoV communication networks. IEEE Transactions on Vehicular Technology, 68(5):4157–4169, 2019.
[12] Min Zhao, Yifei Wei, Mei Song, and Guo Da. Power control for D2D communication using multi-agent reinforcement learning. In 2018 IEEE/CIC International Conference on Communications in China (ICCC), pages 563–567. IEEE, 2018.
[13] Zheng Li, Caili Guo, and Yidi Xuan. A multi-agent deep reinforcement learning based spectrum allocation framework for D2D communications. In 2019 IEEE Global Communications Conference (GLOBECOM), pages 1–6. IEEE, 2019.
[14] Le Liang, Hao Ye, and Geoffrey Ye Li. Spectrum sharing in vehicular networks based on multi-agent reinforcement learning. IEEE Journal on Selected Areas in Communications, 37(10):2282–2292, 2019.
[15] Dohyun Kwon and Joongheon Kim. Multi-agent deep reinforcement learning for cooperative connected vehicles. In 2019 IEEE Global Communications Conference (GLOBECOM), pages 1–6. IEEE, 2019.
[16] Ranjit Nair, Milind Tambe, Maayan Roth, and Makoto Yokoo. Communications for improving policy computation in distributed POMDPs. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), pages 1098–1105. IEEE, 2004.
[17] Rose E Wang, Michael Everett, and Jonathan P How. R-MADDPG for partially observable environments and limited communication. arXiv preprint arXiv:2002.06684, 2020.
[18] Shayegan Omidshafiei, Jason Pazis, Christopher Amato, Jonathan P How, and John Vian. Deep decentralized multi-task multi-agent reinforcement learning under partial observability. arXiv preprint arXiv:1703.06182, 2017.
[19] Jakob N Foerster, Yannis M Assael, Nando de Freitas, and Shimon Whiteson. Learning to communicate to solve riddles with deep distributed recurrent Q-networks. arXiv preprint arXiv:1602.02672, 2016.
[20] Liang Wang, Hao Ye, Le Liang, and Geoffrey Ye Li. Learn to allocate resources in vehicular networks. arXiv preprint arXiv:1908.03447, 2019.
[21] Ibrahim Althamary, Chih-Wei Huang, and Phone Lin. A survey on multi-agent reinforcement learning methods for vehicular networks. In 2019 15th International Wireless Communications & Mobile Computing Conference (IWCMC), pages 1154–1159. IEEE, 2019.
[22] Kai Arulkumaran, Marc Peter Deisenroth, Miles Brundage, and Anil Anthony Bharath. A brief survey of deep reinforcement learning. IEEE Signal Processing Magazine, 34(6):26–38, 2017.
[23] Mehdi Mohammadi, Ala Al-Fuqaha, Sameh Sorour, and Mohsen Guizani. Deep learning for IoT big data and streaming analytics: A survey. IEEE Communications Surveys & Tutorials, 20(4):2923–2960, 2018.
[24] Tianshu Chu, Jie Wang, Lara Codecà, and Zhaojian Li. Multi-agent deep reinforcement learning for large-scale traffic signal control. IEEE Transactions on Intelligent Transportation Systems, 2019.
[25] Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529, 2015.
[26] Vijay R Konda and John N Tsitsiklis. Actor-critic algorithms. In Advances in Neural Information Processing Systems, pages 1008–1014, 2000.
[27] Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arXiv preprint arXiv:1509.02971, 2015.
[28] Ryan Lowe, Yi I Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, and Igor Mordatch. Multi-agent actor-critic for mixed cooperative-competitive environments. In Advances in Neural Information Processing Systems, pages 6379–6390, 2017.
[29] Frans A Oliehoek. Decentralized POMDPs. In Reinforcement Learning, pages 471–503. Springer, 2012.
[30] Matthew Hausknecht and Peter Stone. Deep recurrent Q-learning for partially observable MDPs. In 2015 AAAI Fall Symposium Series, 2015.
[31] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
[32] Simulation of urban mobility. http://sumo.sourceforge.net/. Accessed: 2019-06-30.
[33] Lixia Xue, Yuchen Yang, and Decun Dong. Roadside infrastructure planning scheme for the urban vehicular networks. Transportation Research Procedia, 25:1380–1396, 2017.
[34] Prithviraj Patil and Aniruddha Gokhale. Improving the reliability and availability of vehicular communications using Voronoi diagram-based placement of road side units. In 2012 IEEE 31st Symposium on Reliable Distributed Systems, pages 400–401. IEEE, 2012.
[35] Yi-Han Xu, Cheng-Cheng Yang, Min Hua, and Wen Zhou. Deep deterministic policy gradient (DDPG)-based resource allocation scheme for NOMA vehicular communications. IEEE Access, 8:18797–18807, 2020.
[36] Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip HS Torr, Pushmeet Kohli, and Shimon Whiteson. Stabilising experience replay for deep multi-agent reinforcement learning. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 1146–1155. JMLR.org, 2017.
[37] Thanh Thi Nguyen, Ngoc Duy Nguyen, and Saeid Nahavandi. Deep reinforcement learning for multiagent systems: A review of challenges, solutions, and applications. IEEE Transactions on Cybernetics, 2020.
[38] Qian Long, Zihan Zhou, Abhinav Gupta, Fei Fang, Yi Wu, and Xiaolong Wang. Evolutionary population curriculum for scaling multi-agent reinforcement learning. arXiv preprint arXiv:2003.10423, 2020.