Master's/Doctoral Thesis 102523602: Detailed Record




Author: Jean-jimmy Julien (吉米)    Graduate department: Department of Communication Engineering
Thesis title: 基於Q-Learning之雙向能量採集通訊傳輸方法設計與模擬
(Design and Simulation of Q-Learning Based Transmission Schemes for Two-Way Energy Harvesting Communications)
Related theses
★ Joint precoder and decoder design based on interference alignment for multiuser multi-antenna systems
★ Sparse multipath channel tracking and estimation for OFDM systems using compressive sensing
★ Channel equalization and resource allocation algorithms for mobile LTE uplink SC-FDMA systems
★ Factor-graph-based sparse spectrum detection for cognitive radio systems
★ Sparse Spectrum Detection with Sub-blocks Partition for Cognitive Radio Systems
★ Pilot-based channel estimation methods for relay networks in multipath channel environments
★ Resource allocation and user partitioning for device-to-device communications based on cost games
★ Joint precoder design, signal alignment, and antenna selection for multiuser two-way relay networks
★ Multiuser beamforming and opportunistic scheduling in transparent hierarchical cellular systems
★ Design and simulation of optimal transmission policies for energy harvesting relay networks
★ Design and simulation of transmission policies with energy harvesting in cognitive radio relay networks
★ Design and simulation of optimal transmission policies in cognitive radio from a green energy perspective
★ Design and simulation of cooperative transmission policies for two-user energy harvesting networks
★ Design and simulation of MIMO simultaneous wireless information and power transfer systems
★ Design and simulation of wirelessly powered device-to-device communications in cellular systems
★ Design and simulation of beamforming and resource allocation in wirelessly powered cellular networks
Files: full text may be browsed via the thesis system only (never open to the public)
Abstract (Chinese) In this thesis, a two-way energy harvesting communication system is studied using a Markov decision process (MDP), a mathematical framework for modeling decision-making in situations where the outcomes are partly under the control of the decision maker and partly random. Communication conveys information from one point to another over physical channels that propagate electromagnetic, acoustic, and many other kinds of waves. The information is usually represented as currents or voltages, which may be continuous, taking an infinite number of possible values, or discrete, taking values from a known set. The communication system links machines, including network systems that exchange data in both directions among multiple nodes, as well as memory systems that store and recall information.
Both the data and the energy arrivals at the transmitter are modeled as Markov processes. Delay-limited communication is considered under the assumptions that the underlying channel is block fading with memory and that instantaneous channel state information is available at both the receiver and the transmitter. The total data expected to be transmitted during the transmitter's activation period is maximized under three different sets of assumptions about the information available at the transmitter concerning the underlying stochastic processes.
Energy harvesting (EH) has therefore emerged as a promising technology for extending communication systems and networks, such as machine-to-machine or wireless sensor networks, by complementing today's battery-powered transceivers with energy harvested from ambient sources including solar power, thermal gradients, and vibration. Unlike battery-limited devices, an energy harvesting system governed by a Markov decision process can in theory operate over an unlimited time horizon. To optimize communication performance with sporadically arriving energy in limited amounts, the transmission policy should be optimized using the available information about the energy and data arrival processes.
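The MDP formulation described in the abstract can be illustrated with a minimal value-iteration sketch. Everything below is a hypothetical toy model, not the thesis's actual system: the battery/channel state space, transition probabilities, and reward values are illustrative placeholders chosen only to show the structure of the computation.

```python
# Toy MDP: state = (battery level, channel good/bad), action = units of energy to transmit.
# All numbers here are hypothetical, not taken from the thesis.
B_LEVELS, CH_STATES = 3, 2
P_HARVEST = 0.6   # assumed probability that one energy unit arrives per slot
P_GOOD = 0.7      # assumed probability that the next channel state is good
GAMMA = 0.95      # discount factor

states = [(b, c) for b in range(B_LEVELS) for c in range(CH_STATES)]

def reward(b, c, a):
    # Throughput-like reward: spending a units pays off more on a good channel (c == 1).
    return a * (2.0 if c == 1 else 1.0)

def next_states(b, c, a):
    # Battery drops by a, may gain one harvested unit (capped); channel evolves independently.
    out = []
    for harvest, ph in ((1, P_HARVEST), (0, 1 - P_HARVEST)):
        nb = min(b - a + harvest, B_LEVELS - 1)
        for nc, pc in ((1, P_GOOD), (0, 1 - P_GOOD)):
            out.append(((nb, nc), ph * pc))
    return out

# Value iteration: repeatedly apply the Bellman optimality backup.
V = {s: 0.0 for s in states}
for _ in range(500):
    V = {(b, c): max(
            reward(b, c, a) + GAMMA * sum(p * V[s2] for s2, p in next_states(b, c, a))
            for a in range(b + 1))            # feasible actions: 0..battery level
         for (b, c) in states}

# Greedy policy extracted from the converged value function.
policy = {(b, c): max(range(b + 1), key=lambda a:
              reward(b, c, a) + GAMMA * sum(p * V[s2] for s2, p in next_states(b, c, a)))
          for (b, c) in states}
```

The extracted policy respects the battery constraint by construction (an empty battery forces the idle action), which mirrors the feasibility constraint on transmission policies discussed in the thesis.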
Abstract (English) In this thesis, a two-way communication energy harvesting system is studied by the use of a Markov decision process (MDP), which provides a mathematical framework for modeling decision-making in situations where the outcomes are partly under the control of the decision maker and partly random. Communication conveys information from one point to another through physical channels that propagate particle density, electromagnetic, acoustic, and many other waves. The information referred to here usually manifests as currents or voltages, which may be continuous, with an infinite number of possible values, or discrete, taking values from a known set. Such communication systems link machines, including network systems that convey data in both directions among multiple nodes, as well as memory systems that store and recall information.
Both the data and the energy arrivals at the transmitter are modeled as Markov processes. Delay-limited communication is considered under the assumptions that the underlying channel is block fading with memory and that instantaneous channel state information is available at both the receiver and the transmitter. The total data expected to be transmitted during the transmitter's activation period is maximized under three different sets of assumptions about the information available at the transmitter concerning the underlying stochastic processes.
Energy harvesting (EH) has thus emerged as a promising technology for the communications industry and its networks, for instance machine-to-machine or wireless sensor networks, complementing current battery-powered transceivers by harvesting ambient energy sources including solar power, thermal gradients, and vibration. Unlike battery-limited devices, an energy harvesting system employing a Markov decision process can theoretically operate over an unlimited time horizon. To optimize communication performance with sporadic energy arrivals in limited amounts, it is therefore advisable to optimize the transmission policy using the available information about the energy and data arrival processes.
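The learning-based side of the thesis (Q-learning with epsilon-greedy exploration, per the table of contents) can be sketched as follows. This is a hedged toy illustration, not the thesis's actual algorithm or parameters: the environment dynamics, reward function, and hyperparameters below are all hypothetical.

```python
import random

random.seed(0)
B_MAX = 3                       # toy battery capacity
ALPHA, GAMMA, EPS = 0.1, 0.5, 0.1   # assumed learning rate, discount, exploration rate

# Q[battery][action]: estimated return of transmitting `action` energy units now.
Q = [[0.0] * (b + 1) for b in range(B_MAX + 1)]

def step(battery, action):
    # Hypothetical environment: reward grows with transmit energy; harvesting is random.
    reward = float(action)
    harvested = random.random() < 0.5            # one unit arrives half the time
    return reward, min(battery - action + harvested, B_MAX)

battery = B_MAX
for _ in range(20000):
    # Epsilon-greedy selection over the feasible actions 0..battery
    # (ties in the greedy case resolve to the smallest action).
    if random.random() < EPS:
        action = random.randrange(battery + 1)
    else:
        action = max(range(battery + 1), key=lambda a: Q[battery][a])
    reward, nxt = step(battery, action)
    # Standard Q-learning update toward the bootstrapped target.
    best_next = max(Q[nxt])
    Q[battery][action] += ALPHA * (reward + GAMMA * best_next - Q[battery][action])
    battery = nxt
```

Unlike the value-iteration approach, this learner never uses the transition probabilities explicitly; it estimates the action values purely from sampled interactions, which is the appeal of Q-learning when the energy and data arrival statistics are unknown.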
Keywords (Chinese) ★ Two-way communication
★ Green energy harvesting
★ Markov decision process
★ Channel
★ Wireless sensor networks
Keywords (English)
Table of Contents
Acknowledgment
ABSTRACT
List of Figures
List of Tables
List of Algorithms
Chapter 1 Introduction
1.1 Background
1.2 Motivation
1.3 Organization
Chapter 2 Theoretical Background of Markov Decision Processes
2.1 Markov Decision Processes
2.2 Solving Markov Decision Processes
2.2.1 Value Iteration
2.2.2 Policy Iteration
Chapter 3 System Model
3.1 Transmission Policy Action
3.2 System States
3.2.1 Solar EH State
3.2.2 Battery State
3.2.3 Channel State
3.2.4 MDP State Transition
3.3 Reward Function
3.4 Optimization of Transmission Policy
Chapter 4 Reinforcement Learning Algorithms
4.1 Q-learning
4.1.1 Greedy Method
4.1.2 ϵ-greedy Method
4.2 Speedy Q-learning
Chapter 5 Simulations and Discussion
Chapter 6 Conclusion
References
References
[1] Prabuchandran K. J., Sunil Kumar Meena, and Shalabh Bhatnagar, "Q-learning based energy management policies for a single sensor node with finite buffer," IEEE, 2013.
[2] M.-L. Ku, Y. Chen, and K. J. Ray Liu, "Data-driven stochastic transmission policies for energy harvesting sensor communications," IEEE J. Sel. Areas Commun., vol. PP, no. 99, pp. 1, Jan. 2015.
[3] S. Luo, R. Zhang, and T. J. Lim, "Diversity analysis of an outage minimizing energy harvesting wireless protocol in Rayleigh fading," in Proc. SPCOM, July 2012, pp. 1-5.
[4] M. Gheshlaghi Azar, R. Munos, M. Ghavamzadeh, and H. J. Kappen, "Speedy Q-learning," Advances in Neural Information Processing Systems, Spain, 2011.
[5] Wei Li, M.-L. Ku, Y. Chen, and K. J. Ray Liu, "On outage probability for stochastic energy harvesting communications in fading channels," submitted, 2015.
[6] B. Gurakan, O. Ozel, J. Yang, and S. Ulukus, "Two-way and multiple-access energy harvesting systems with energy cooperation," IEEE Trans. Commun., vol. 61, no. 12, Dec. 2013.
[7] https://en.wikipedia.org/
[8] Ársæll Þór Jóhannsson, "GPU-based Markov decision process solver," Reykjavík University, June 2009.
[9] Y. Mohan and S. G. Ponnambalam, "Q-learning policies for a single agent foraging task," UAE, April 20-22, 2010.
[10] P. Blasco, D. Gunduz, and M. Dohler, "A learning theoretic approach to energy harvesting communication system optimization."
[11] C. E. Shannon, "Two-way communication channels," in Proc. 4th Berkeley Symp. Math. Stat. Prob., vol. 1, pp. 611-644, April 1961.
[12] K. Tutuncuoglu and A. Yener, "Communicating with energy harvesting transmitters and receivers," The Pennsylvania State University, University Park, PA 16802.
[13] M. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley and Sons, 1994.
[14] S. Ulukus, A. Yener, E. Erkip, O. Simeone, M. Zorzi, P. Grover, and K. Huang, "Energy harvesting wireless communications: a review of recent advances," IEEE, vol. 33, no. 3, March 2015.
[15] A. Yener, "Energy harvesting wireless communication networks," submitted March 2014.
[16] E. Pashenkova, I. Rish, and R. Dechter, "Value iteration and policy iteration algorithms for Markov decision problems," University of California at Irvine, April 1986.
[17] M. A. Murtaza and M. Tahir, "Optimal data transmission and battery charging policies for solar powered sensor networks using Markov decision process," IEEE WCMC, 2013.
[18] F. Wang, Y. Li, Z. Wang, and Z. Yang, "Markov decision process based content dissemination in hybrid wireless networks," WCMC 2012, 8th International.
[19] K. Tutuncuoglu and A. Yener, "Communicating with energy harvesting transmitters and receivers," ITA, 2012.
[20] A. Seyedi and B. Sikdar, "Modeling and analysis of energy harvesting nodes in wireless sensor networks," 2008 46th Annual Allerton Conference.
[21] B. Varan and A. Yener, "Energy harvesting two-way communications with limited energy and data storage," 2014 48th Asilomar Conference.
[22] http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-661-receivers-antennas-and-signals-spring-2003/readings/ch4new.pdf
[23] http://encyclopedia2.thefreedictionary.com/One-Way+Communication
[24] https://en.wikipedia.org/wiki/Two-way_communication
[25] M. Gheshlaghi Azar, R. Munos, M. Ghavamzadeh, and H. J. Kappen, "Speedy Q-learning."
[26] G. Shani, "Learning and solving partially observable Markov decision processes," July 2007.
[27] H. S. Wang and N. Moayeri, "Finite-state Markov channel - a useful model for radio communication channels," IEEE Trans. Veh. Technol., vol. 44, no. 1, pp. 163-171, Feb. 1995.
[28] S. Sudevalayam and P. Kulkarni, "Energy harvesting sensor nodes: survey and implications," IEEE Commun. Surveys Tutorials, vol. 13, no. 3, pp. 443-461, Third Quarter 2011.
Advisor: Meng-Lin Ku (古孟霖)    Approval date: 2015-7-31
