References
[1] Y.-H. Hsu, J.-H. Cheng, K.-Y. Liao, Y.-S. Wang, T.-H. Chen, H.-Y. Chen, C.-K. Yen, and W. Liao, “NTU smart edge for wireless virtual reality,” in 2020 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-Taiwan), 2020, pp. 1–2.
[2] Q.-V. Pham, F. Fang, V. N. Ha, M. J. Piran, M. Le, L. B. Le, W.-J. Hwang, and Z. Ding, “A survey of multi-access edge computing in 5G and beyond: Fundamentals, technology integration, and state-of-the-art,” IEEE Access, vol. 8, pp. 116 974–117 017, 2020.
[3] B. Cao, L. Zhang, Y. Li, D. Feng, and W. Cao, “Intelligent offloading in multi-access
edge computing: A state-of-the-art review and framework,” IEEE Communications
Magazine, vol. 57, no. 3, pp. 56–62, 2019.
[4] X. Wang, Y. Han, C. Wang, Q. Zhao, X. Chen, and M. Chen, “In-edge AI: Intelligentizing mobile edge computing, caching and communication by federated learning,” IEEE Network, vol. 33, no. 5, pp. 156–165, 2019.
[5] Y. Wei, F. R. Yu, M. Song, and Z. Han, “Joint optimization of caching, computing, and radio resources for fog-enabled IoT using natural actor–critic deep reinforcement learning,” IEEE Internet of Things Journal, vol. 6, no. 2, pp. 2061–2073, 2019.
[6] L. T. Tan and R. Q. Hu, “Mobility-aware edge caching and computing in vehicle
networks: A deep reinforcement learning,” IEEE Transactions on Vehicular Technology, vol. 67, no. 11, pp. 10 190–10 203, 2018.
[7] X. Hu, S. Liu, R. Chen, W. Wang, and C. Wang, “A deep reinforcement learning-based framework for dynamic resource allocation in multibeam satellite systems,” IEEE Communications Letters, vol. 22, no. 8, pp. 1612–1615, 2018.
[8] Z. Du, Y. Deng, W. Guo, A. Nallanathan, and Q. Wu, “Green deep reinforcement
learning for radio resource management: Architecture, algorithm compression, and
challenges,” IEEE Vehicular Technology Magazine, vol. 16, no. 1, pp. 29–39, 2021.
[9] C. Finn, P. Abbeel, and S. Levine, “Model-agnostic meta-learning for fast adaptation
of deep networks,” in International Conference on Machine Learning. PMLR,
2017, pp. 1126–1135.
[10] P.-C. Chen, Y.-C. Chen, W.-H. Huang, C.-W. Huang, and O. Tirkkonen, “DDPG-based radio resource management for user interactive mobile edge networks,” in 2020 2nd 6G Wireless Summit (6G SUMMIT), 2020, pp. 1–5.
[11] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra, “Continuous control with deep reinforcement learning,” in International Conference on Learning Representations (ICLR), 2016.
[12] Y. Mao, J. Zhang, and K. B. Letaief, “Dynamic computation offloading for mobile-edge computing with energy harvesting devices,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 12, pp. 3590–3605, 2016.
[13] C. You, K. Huang, H. Chae, and B.-H. Kim, “Energy-efficient resource allocation
for mobile-edge computation offloading,” IEEE Transactions on Wireless Communications, vol. 16, no. 3, pp. 1397–1411, 2017.
[14] A. Al-Shuwaili and O. Simeone, “Energy-efficient resource allocation for mobile
edge computing-based augmented reality applications,” IEEE Wireless Communications Letters, vol. 6, no. 3, pp. 398–401, 2017.
[15] X. Chen, L. Jiao, W. Li, and X. Fu, “Efficient multi-user computation offloading for
mobile-edge cloud computing,” IEEE/ACM Transactions on Networking, vol. 24,
no. 5, pp. 2795–2808, 2016.
[16] J. Zhang, W. Xia, F. Yan, and L. Shen, “Joint computation offloading and resource
allocation optimization in heterogeneous networks with mobile edge computing,”
IEEE Access, vol. 6, pp. 19 324–19 337, 2018.
[17] T. Yang, Y. Hu, M. C. Gursoy, A. Schmeink, and R. Mathar, “Deep reinforcement
learning based resource allocation in low latency edge computing networks,” in 2018
15th International Symposium on Wireless Communication Systems (ISWCS), 2018,
pp. 1–5.
[18] T. Alfakih, M. M. Hassan, A. Gumaei, C. Savaglio, and G. Fortino, “Task offloading and resource allocation for mobile edge computing by deep reinforcement learning based on SARSA,” IEEE Access, vol. 8, pp. 54 074–54 084, 2020.
[19] N. Shan, X. Cui, and Z. Gao, ““DRL + FL”: An intelligent resource allocation model based on deep reinforcement learning for mobile edge computing,” Computer Communications, vol. 160, pp. 14–24, 2020.
[20] M. Ren, E. Triantafillou, S. Ravi, J. Snell, K. Swersky, J. B. Tenenbaum,
H. Larochelle, and R. S. Zemel, “Meta-learning for semi-supervised few-shot classification,” arXiv preprint arXiv:1803.00676, 2018.
[21] C. Finn, K. Xu, and S. Levine, “Probabilistic model-agnostic meta-learning,”
in Advances in Neural Information Processing Systems, S. Bengio, H. Wallach,
H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, Eds., vol. 31.
Curran Associates, Inc., 2018. [Online]. Available: https://proceedings.neurips.cc/
paper/2018/file/8e2c381d4dd04f1c55093f22c59c3a08-Paper.pdf
[22] M. Botvinick, S. Ritter, J. X. Wang, Z. Kurth-Nelson, C. Blundell, and D. Hassabis, “Reinforcement learning, fast and slow,” Trends in Cognitive Sciences, vol. 23, no. 5, pp. 408–422, 2019.
[23] K. Rakelly, A. Zhou, C. Finn, S. Levine, and D. Quillen, “Efficient off-policy meta-reinforcement learning via probabilistic context variables,” in International Conference on Machine Learning. PMLR, 2019, pp. 5331–5340.
[24] J. D. Co-Reyes, Y. Miao, D. Peng, E. Real, Q. V. Le, S. Levine,
H. Lee, and A. Faust, “Evolving reinforcement learning algorithms,” in
International Conference on Learning Representations, 2021. [Online]. Available:
https://openreview.net/forum?id=0XXpJ4OtjW
[25] X. Song, Y. Yang, K. Choromanski, K. Caluwaerts, W. Gao, C. Finn, and
J. Tan, “Rapidly adaptable legged robots via evolutionary meta-learning,” in 2020
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020,
pp. 3769–3776.
[26] Q. He, A. Moayyedi, G. Dán, G. P. Koudouridis, and P. Tengkvist, “A meta-learning scheme for adaptive short-term network traffic prediction,” IEEE Journal on Selected Areas in Communications, vol. 38, no. 10, pp. 2271–2283, 2020.
[27] A. A. Rusu, D. Rao, J. Sygnowski, O. Vinyals, R. Pascanu, S. Osindero,
and R. Hadsell, “Meta-learning with latent embedding optimization,” in
International Conference on Learning Representations, 2019. [Online]. Available:
https://openreview.net/forum?id=BJgklhAcK7
[28] K. Lee, Y. Seo, S. Lee, H. Lee, and J. Shin, “Context-aware dynamics model for
generalization in model-based reinforcement learning,” in International Conference
on Machine Learning. PMLR, 2020, pp. 5757–5766.
[29] H. Mao, M. Alizadeh, I. Menache, and S. Kandula, “Resource management with deep reinforcement learning,” in Proceedings of the 15th ACM Workshop on Hot Topics in Networks, 2016, pp. 50–56.
[30] E. Z. Liu, A. Raghunathan, P. Liang, and C. Finn, “Decoupling exploration and exploitation for meta-reinforcement learning without sacrifices,” in International Conference on Machine Learning. PMLR, 2021, pp. 6925–6935.
[31] S. Belkhale, R. Li, G. Kahn, R. McAllister, R. Calandra, and S. Levine, “Model-based meta-reinforcement learning for flight with suspended payloads,” IEEE Robotics and Automation Letters, vol. 6, no. 2, pp. 1471–1478, 2021.
[32] M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul, “An introduction to variational methods for graphical models,” Machine Learning, vol. 37, no. 2, pp. 183–233, 1999.
[33] C. M. Bishop, Pattern Recognition and Machine Learning. Springer, 2006.
[34] 3GPP, “System architecture for the 5G System,” TS 23.501.
[35] T. Xu, Q. Liu, L. Zhao, and J. Peng, “Learning to explore via meta-policy gradient,” in International Conference on Machine Learning. PMLR, 2018, pp. 5463–5472.