Thesis 107523034: Complete Metadata Record

DC Field | Value | Language
dc.contributor | Department of Communication Engineering (通訊工程學系) | zh_TW
dc.creator | 陳彥辰 | zh_TW
dc.creator | Yen-Chen Chen | en_US
dc.date.accessioned | 2020-08-20T07:39:07Z
dc.date.available | 2020-08-20T07:39:07Z
dc.date.issued | 2020
dc.identifier.uri | http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=107523034
dc.contributor.department | Department of Communication Engineering (通訊工程學系) | zh_TW
dc.description | National Central University (國立中央大學) | zh_TW
dc.description | National Central University | en_US
dc.description.abstract | The development of the fifth-generation (5G) communication system has increased network capability and flexibility, allowing applications with extreme and stringent requirements, such as massively multiplayer online virtual reality (VR) games, to run on 5G networks. The mobile edge cloud architecture is expected to be an effective way to support VR applications. In multi-user VR environments, however, each user's behavior is affected by other users and by objects in the virtual environment, which increases the complexity of resource management and makes it more difficult than before. In this study, we adopt the Deep Deterministic Policy Gradient (DDPG) machine learning algorithm for resource management. We integrate a 3D resource management structure, propose componentized actions for the learning agent, and group users according to their interaction states. Because existing exploration strategies are not suitable for long-term resource management, we propose a meta-learning-based exploration strategy to strengthen the DDPG algorithm. Another challenge in machine learning is that changing the dimension of the input data renders an already-trained model useless. We therefore propose an "environment-information-to-input" translator that encodes environment information into inputs of fixed dimension before they are fed to the learning algorithm, so the encoded inputs can be used by a trained model. Experimental results show that the proposed meta DDPG algorithm achieves the highest satisfaction rate. Although the proposed encoding structure slightly degrades performance, the trained model can be applied directly to a new environment without retraining, which is a more efficient way of learning. | zh_TW
dc.description.abstract | Advances in the capability and flexibility of the fifth-generation (5G) system enable emerging applications with stringent requirements. Mobile edge cloud (MEC) is expected to be an effective solution to serve virtual reality (VR) applications over wireless networks. In multi-user VR environments, highly dynamic interaction between users increases the difficulty and complexity of radio resource management (RRM). Furthermore, a trained management model is often obsolete when particular key environment parameters are changed. In this thesis, a scalable deep reinforcement learning-based approach is proposed specifically for resource scheduling in the edge network. We integrate a 3D radio resource structure with componentized Markov decision process (MDP) actions to work on user interactivity-based groups. A translator-inspired "information-to-state" encoder is applied to generate a scalable RRM model, which can be reused for environments with various numbers of base stations. Also, a meta-learning-based exploration strategy is introduced to improve the exploration in the deep deterministic policy gradient (DDPG) training process. The results show that the modified meta exploration strategy improves DDPG significantly. The scalable learning structure with complete model reuse provides comparable performance to individually trained models. | en_US
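For illustration only, the following is a minimal Python/PyTorch sketch of the scalability idea described in the abstract: per-base-station environment information is encoded into a fixed-dimension state before entering a DDPG-style actor, so the same trained policy can be queried when the number of base stations changes. This is not the thesis implementation; the class names, pooling choice, and dimensions (InfoToStateEncoder, feat_dim, state_dim, mean pooling) are assumptions made for the sketch.

```python
# Minimal sketch (assumed names and dimensions, not the author's code) of an
# "information-to-state" encoder feeding a DDPG-style actor.
import torch
import torch.nn as nn

class InfoToStateEncoder(nn.Module):
    """Encodes a variable number of base-station feature vectors into a
    fixed-dimension state via a shared MLP followed by mean pooling."""
    def __init__(self, feat_dim: int, state_dim: int):
        super().__init__()
        self.per_bs = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                                    nn.Linear(64, state_dim))

    def forward(self, bs_features: torch.Tensor) -> torch.Tensor:
        # bs_features: (num_base_stations, feat_dim); the count may vary per environment
        return self.per_bs(bs_features).mean(dim=0)  # -> fixed shape (state_dim,)

class Actor(nn.Module):
    """DDPG-style deterministic policy mapping the encoded state to an action."""
    def __init__(self, state_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                 nn.Linear(128, action_dim), nn.Tanh())

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

# The same actor handles environments with 3 or 7 base stations, because the
# encoder output dimension is fixed regardless of the input count.
encoder = InfoToStateEncoder(feat_dim=8, state_dim=32)
actor = Actor(state_dim=32, action_dim=4)
for num_bs in (3, 7):
    env_info = torch.randn(num_bs, 8)      # placeholder per-base-station features
    action = actor(encoder(env_info))
    print(num_bs, action.shape)            # torch.Size([4]) in both cases
```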
dc.subject | Resource management (資源管理) | zh_TW
dc.subject | Virtual reality (虛擬實境) | zh_TW
dc.title | Scalable Radio Resource Management using DDPG Meta Reinforcement Learning | en_US
dc.language.iso | en_US | en_US
dc.type | Thesis (博碩士論文) | zh_TW
dc.type | thesis | en_US
dc.publisher | National Central University | en_US
