Master's/Doctoral Thesis 107523001: Detailed Record




Name: KAO YI-CHEN (高懿辰)    Graduate Program: Department of Communication Engineering
Thesis Title (Chinese): 改良式強化學習於太陽能獵取之多用戶上鏈功率控制研究
Thesis Title (English): Novel Reinforcement Learning-based Multiuser Uplink Power Control with Solar Energy Harvesting
Related Theses
★ Joint Precoder and Decoder Design Based on Interference Alignment for Multiuser Multi-Antenna Systems
★ Sparse Multipath Channel Tracking and Channel Estimation Using Compressive Sensing for OFDM Systems
★ Channel Equalization and Resource Allocation Algorithms for Mobile LTE Uplink SC-FDMA Systems
★ Factor Graph-Based Sparse Spectrum Detection for Cognitive Radio Systems
★ Sparse Spectrum Detection with Sub-blocks Partition for Cognitive Radio Systems
★ Pilot-Based Channel Estimation Methods for Relay Networks over Multipath Channels
★ Cost Game-Based Resource Allocation and User Partitioning for Device-to-Device Communications
★ Joint Precoder Design, Signal Alignment, and Antenna Selection for Multiuser Two-Way Relay Networks
★ Multiuser Beamforming and Opportunistic Scheduling for Transparent Hierarchical Cellular Systems
★ Design and Simulation of Optimal Transmission Policies for Energy Harvesting Relay Networks
★ Design and Simulation of Energy Harvesting Transmission Policies for Cognitive Radio Relay Networks
★ Design and Simulation of Optimal Transmission Policies for Cognitive Radio from a Green Energy Perspective
★ Design and Simulation of Cooperative Transmission Policies for Two-User Energy Harvesting Networks
★ Q-Learning-Based Design and Simulation of Two-Way Energy Harvesting Communication
★ Design and Simulation of Simultaneous Wireless Information and Power Transfer in MIMO Systems
★ Design and Simulation of Wirelessly Powered Device-to-Device Communications in Cellular Systems
Files: Full text viewable only within the system (access permanently restricted)
Abstract (Chinese) Energy harvesting is regarded as an effective technique for extending the operating lifetime of wireless communication devices and keeping them self-sustaining. With the rise of the Internet of Things in recent years, small-scale wireless communications have increased dramatically, and because of their limited-capacity batteries, small-scale wireless devices face power-consumption and operating-lifetime problems. Energy harvesting can sustain such devices by collecting energy from the surrounding environment, and optimized power control can then resolve the consumption and device-lifetime problems caused by limited-capacity batteries. To maximize the overall transmission throughput of a wireless communication system, conventional optimization approaches such as convex optimization and Markov decision processes can be applied, but these methods require knowing the channel gains, energy arrivals, and state transition probabilities over a future time horizon before the optimization can be carried out. Reinforcement learning, in contrast, explores the environment through interaction and can converge to the optimal action-value function over many iterations.
This thesis adopts reinforcement learning to optimize the transmission policy of a wireless communication system. The state of the Markov decision process is defined by the harvested energy value, the finite-capacity battery level, and the wireless channel gain, and the optimal power control scheme for an energy harvesting uplink system in a multiuser environment is studied. Observing the state information reveals a piecewise-linear relationship within the battery-state component; by weighting the exploration and evaluation values of conventional reinforcement learning accordingly, the transmission throughput of the energy harvesting multiuser uplink system is increased markedly at only a small increase in complexity.
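For context, the standard tabular Q-learning update reviewed in Chapter 2, following Sutton and Barto [28], can be written as

    Q(s_t, a_t) ← Q(s_t, a_t) + α [ r_{t+1} + γ max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) ]

where, in the setting described above, the state s_t would collect the battery level, harvested energy, and channel gain, the action a_t is the transmit power decision, r_{t+1} is the resulting uplink throughput reward, α is the learning rate, and γ is the discount factor. The thesis's weighted variant modifies the exploration and evaluation values that enter this update across nearby battery states.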
Abstract (English) Energy harvesting is regarded as an effective technology that can extend the lifetime of wireless communication devices and enable self-sustained operation. In recent years, the rise of the Internet of Things has led to a rapid increase in small-scale wireless communication. Due to the restrictions of limited-capacity batteries, small-scale wireless communication devices suffer from power-consumption and operating-lifetime issues; energy harvesting technology can keep such devices running by gathering energy from the surrounding environment, and optimized power control can resolve the consumption and device-lifetime problems caused by limited-capacity batteries. In order to maximize the overall transmission throughput of the wireless communication system, optimization methods such as convex optimization and the Markov decision process can be used to optimize system performance through power control. However, these methods rely on perfect knowledge of future energy harvesting conditions and channel gains, which makes them difficult to implement in real applications. Reinforcement learning can instead approximate the optimal action-value function over multiple iterations of exploration and interaction with the environment.
In this thesis, reinforcement learning is used to optimize the transmission policy of the wireless communication system. By using the harvested energy value, the finite-capacity battery level, and the wireless channel gain to define the state of the Markov decision process, the thesis investigates power control for energy harvesting uplink wireless communication systems in a multiuser environment. Observation of the state information shows that the battery-state component exhibits a piecewise-linear correlation, and conventional reinforcement learning is improved by weighting its exploration and evaluation values accordingly, which significantly increases the transmission throughput of the energy harvesting multiuser uplink system at the cost of only a small increase in complexity.
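To make the described approach concrete, the following is a minimal, self-contained sketch of tabular Q-learning for this kind of power control problem, with the state discretized into (battery level, channel gain, harvested energy) and a locally weighted value estimate that averages Q over neighboring battery levels. The environment dynamics, reward model, dimensions, and the weighted_q averaging rule are illustrative assumptions made for this sketch, not the thesis's exact algorithm or parameters.

import numpy as np

# Illustrative sketch: tabular Q-learning for uplink power control with a
# discretized state of (battery level, channel-gain level, harvest level).
# All sizes, dynamics, and the weighting rule below are assumptions.

N_BATTERY, N_CHANNEL, N_HARVEST = 10, 4, 4   # assumed discretization sizes
N_POWER = 5                                  # assumed number of power levels
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1        # learning rate, discount, exploration

rng = np.random.default_rng(0)
Q = np.zeros((N_BATTERY, N_CHANNEL, N_HARVEST, N_POWER))

def step(battery, channel, harvest, power_level):
    """Toy environment: spend stored energy, earn a log-throughput reward."""
    tx_power = min(power_level, battery)              # cannot spend more than stored
    reward = np.log2(1.0 + (channel + 1) * tx_power)  # crude throughput proxy
    next_battery = min(battery - tx_power + harvest, N_BATTERY - 1)
    next_channel = int(rng.integers(N_CHANNEL))       # i.i.d. fading, for simplicity
    next_harvest = int(rng.integers(N_HARVEST))       # i.i.d. solar arrivals
    return reward, (next_battery, next_channel, next_harvest)

def weighted_q(state):
    """Assumed 'locally weighted' estimate: average the Q rows of adjacent
    battery levels, reflecting the near-linear battery-state structure."""
    b, c, h = state
    neighbors = [max(b - 1, 0), b, min(b + 1, N_BATTERY - 1)]
    return np.mean([Q[nb, c, h] for nb in neighbors], axis=0)

state = (N_BATTERY // 2, 0, 0)
for t in range(20000):
    b, c, h = state
    if rng.random() < EPSILON:                        # epsilon-greedy exploration
        action = int(rng.integers(N_POWER))
    else:
        action = int(np.argmax(weighted_q(state)))
    reward, next_state = step(b, c, h, action)
    # Standard temporal-difference update, but bootstrapped on the weighted
    # estimate instead of the raw Q row of the next state.
    td_target = reward + GAMMA * np.max(weighted_q(next_state))
    Q[b, c, h, action] += ALPHA * (td_target - Q[b, c, h, action])
    state = next_state

print("Greedy power level per battery state (best channel/harvest levels):")
print(np.argmax(Q[:, -1, -1, :], axis=1))

After training, taking the argmax over actions in Q yields a greedy mapping from each observed state to a transmit power level; the neighbor averaging in weighted_q is only one plausible rendering of the piecewise-linear battery-state idea described in the abstract.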
Keywords (Chinese) ★ Reinforcement Learning
★ Multilayer Perceptron
★ Energy Harvesting
★ Power Control
Keywords (English) ★ Reinforcement Learning
★ Deep Neural Networks
★ Energy Harvesting
★ Power Control
Table of Contents
Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
List of Symbols
Chapter 1: Introduction
1-1 Research Motivation
1-2 Literature Review
1-2-1 Wireless Communication Systems with Energy Harvesting
1-2-2 Applications of Deep Learning and Reinforcement Learning in Wireless Communication Design
Chapter 2: Background Theory
2-1 Energy Harvesting Model
2-2 Markov Decision Processes
2-3 Reinforcement Learning
2-3-1 Q-Learning
2-3-2 Deep Reinforcement Learning
2-4 Tile Coding
Chapter 3: Reinforcement Learning-Based Multiuser Wireless Uplink System
3-1 Solar Energy Harvesting Data Model
3-2 Multiuser System Model
3-2-1 Multiuser Channel Capacity Optimization Problem
3-3 Multiuser Power Control via Reinforcement Learning
3-4 Locally Weighted Reinforcement Learning Method
3-5 Multiuser Wireless Power Control Based on Tile Coding
3-6 Multiuser Power Control via Deep Reinforcement Learning
Chapter 4: Simulation Results
4-1 Simulation Results for Reinforcement Learning-Based Multiuser Uplink Power Control
Chapter 5: Conclusion
References
References
[1] A. Kumar, K. Singh, and D. Bhattacharya, “Green communication and wireless networking,” in Proc. ICGCE, pp. 49-52, 2013.
[2] I. U. Ramirez and N. A. B. Tello, “A survey of challenges in green wireless communications research,” in Proc. ICMEAE, pp. 197-200, 2014.
[3] P. Gandotra and R. K. Jha, “Next generation cellular networks and green communication,” in Proc. COMSNETS, pp. 522-524, 2018.
[4] J. Lu, H. Okada, T. Itoh, R. Maeda, and T. Harada, “Towards the world smallest wireless sensor nodes with low power consumption for ‘Green’ sensor networks,” in Proc. IEEE ICSENS, pp. 1-4, 2013.
[5] B. Atwood, B. Warneke, and K. S. J. Pister, “Smart dust mote forerunners,” in Proc. IEEE ICMS, pp. 357-360, 2001.
[6] N. Chand, P. Mishra, C. R. Krishna, E. S. Pilli, and M. C. Govil, “A comparative analysis of SVM and its stacking with other classification algorithm for intrusion detection,” in Proc. IEEE ICACCA, pp. 1-6, 2016.
[7] K. Y. Huang, L. C. Shen, K. J. Chen, and M. C. Huang, “Multilayer perceptron with genetic algorithm for well log data inversion,” in Proc. IEEE IGARSS, pp. 1544-1547, 2013.
[8] D. Ciregan, U. Meier, and J. Schmidhuber, “Multi-column deep neural networks for image classification,” in Proc. IEEE CVPR, pp. 3642-3649, 2012.
[9] Y. Li, “Deep reinforcement learning: an overview,” arXiv:1701.07274, 2017.
[10] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436-444, May 2015.

[11] M. Tacca, P. Monti, and A. Fumagalli, “Cooperative and reliable ARQ protocols for energy harvesting wireless sensor nodes,” IEEE Trans. Wireless Commun., vol. 6, no. 7, pp. 2519-2529, Jul. 2007.
[12] S. Reddy and C. R. Murthy, “Profile-based load scheduling in wireless energy harvesting sensors for data rate maximization,” in Proc. IEEE ICC, pp. 1-5, 2010.
[13] N. Michelusi, K. Stamatiou, and M. Zorzi, “Transmission policies for energy harvesting sensors with time-correlated energy supply,” IEEE Trans. Commun., vol. 61, no. 7, pp. 2988-3001, Jul. 2013.
[14] O. Ozel, K. Tutuncuoglu, J. Yang, S. Ulukus, and A. Yener, “Transmission with energy harvesting nodes in fading wireless channels: optimal policies,” IEEE J. Sel. Areas Commun., vol. 29, no. 8, pp. 1732-1743, Sep. 2011.
[15] B. Medepally and N. B. Mehta, “Voluntary energy harvesting relays and selection in cooperative wireless networks,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3543-3553, Nov. 2010.
[16] C. K. Ho, P. D. Khoa, and P. C. Ming, “Markovian models for harvested energy in wireless communications,” in Proc. IEEE ICCS, pp. 311-315, 2010.
[17] M.-L. Ku, Y. Chen and K. J. Ray Liu, “Data-driven stochastic models and policies for energy harvesting sensor communications,” IEEE J. Sel. Areas Commun., vol. 33, no. 8, pp. 1505-1520, Aug. 2015.
[18] H.-H. Tsai, “Design and simulation of cooperative transmission policies for two-user energy harvesting networks,” Master’s thesis, National Central University, 2015.
[19] H. Ye, G. Y. Li, and B.-H. Juang, “Power of deep learning for channel estimation and signal detection in OFDM systems,” IEEE Wireless Commun. Lett., vol. 7, no. 1, pp. 114-117, Feb. 2018.
[20] N. Krishna Prakash and D. Prasanna Vadana, “Machine learning based residential energy management system,” in Proc. IEEE ICCIC, pp. 684-687, 2017.
[21] W. Lee, M. Kim, and D.-H. Cho, “Deep power control: Transmit power control scheme based on convolutional neural network,” IEEE Commun. Lett., vol. 22, no. 6, pp. 1276-1279, Jun. 2018.
[22] M. Min, L. Xiao, Y. Chen, P. Cheng, D. Wu, and W. Zhuang, “Learning-based computation offloading for IoT devices with energy harvesting,” IEEE Trans. Veh. Technol., vol. 68, no. 2, pp. 1930-1941, Feb. 2019.
[23] O. Naparstek and K. Cohen, “Deep multi-user reinforcement learning for distributed dynamic spectrum access,” IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 310-323, Jan. 2019.
[24] Y. S. Nasir and D. Guo, “Multi-agent deep reinforcement learning for dynamic power allocation in wireless networks,” IEEE J. Sel. Areas Commun., vol. 37, no. 10, pp. 2239-2250, Oct. 2019.
[25] L. Huang, S. Bi, and Y.-J. A. Zhang, “Deep reinforcement learning for online offloading in wireless powered mobile-edge computing networks,” arXiv:1808.01977, 2018.
[26] Y. Wei, F. R. Yu, M. Song, and Z. Han, “User scheduling and resource allocation in HetNets with hybrid energy supply: An actor-critic reinforcement learning approach,” IEEE Trans. Wireless Commun., vol. 17, no. 1, pp. 680-692, Jan. 2018.
[27] A. Mellit and A. Massi Pavan, “A 24-h forecast of solar irradiance using artificial neural network: Application for performance prediction of a grid-connected PV plant at Trieste, Italy,” Solar Energy, vol. 84, no. 5, pp. 807-821, May 2010.
[28] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.
[29] NREL, “Solar radiation resource information,” Golden, CO, USA. [Online]. Available: http://www.nrel.gov/rredc/
[30] A. Ortiz, H. Al-Shatri, X. Li, T. Weber, and A. Klein, “Reinforcement learning for energy harvesting point-to-point communications,” in Proc. IEEE ICC, Kuala Lumpur, Malaysia, pp. 1-6, May 2016.
[31] V. Mnih et al., “Human-level control through deep reinforcement learning,” Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[32] K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification,” arXiv:1502.01852, Feb. 2015.
[33] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv:1502.03167, Mar. 2015.
[34] D. Mishkin and J. Matas, “All you need is a good init,” arXiv:1511.06422, Feb. 2016.
[35] Andrej Karpathy’s blog, “Hacker’s guide to Neural Networks,” [Online]. Available: http://karpathy.github.io/neuralnets/.
[36] Frederik Kratzert’s blog, “Understanding the backward pass through Batch Normalization Layer.” [Online]. Available: http://kratzert.github.io/2016/02/12/understanding-the-gradient-flow-through-the-batch-normalization-layer.html
[37] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv:1412.6980, Jan. 2017.
[38] M. Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous distributed systems,” arXiv:1603.04467, Mar. 2016.
Advisor: Meng-Lin Ku (古孟霖)    Approval Date: 2021-1-6
