Beam Tracking with Deep Reinforcement Learning for Analog Precoder in Hybrid Beamforming for MIMO Systems

Name

黃柏銓 (Po-Chuan Huang)

Department

Department of Electrical Engineering

Thesis title

基於深度強化學習之預編碼器設計於多輸入多輸出混合式波束合成系統進行波束追蹤
(Beam Tracking with Deep Reinforcement Learning for Analog Precoder in Hybrid Beamforming for MIMO Systems)

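Related theses

★ Design of a Three-Level Delta-Sigma Class-D Amplifier with Output-Stage Error Cancellation	★ Design of a Multi-Mode Low-Complexity Transceiver for Wireless Sensor Networks
★ Design of a High-Performance Delta-Sigma Modulator for Digital Class-D Amplifiers	★ Circuit Design of a Switched-Capacitor Delta-Sigma Class-D Amplifier
★ Design of an FFT Processor with a Conflict-Free Addressing Algorithm for Parallel Processing and Scheduling	★ Design of a 4×4 MIMO Detector for IEEE 802.11n
★ A Homogeneous Configurable Memory-Based FFT Processor for Wireless Communication Systems	★ Design and Implementation of Receiver Cell Search and Synchronization for 3GPP LTE OFDMA Downlink Transmission
★ Channel Tracking and Equalization at High Mobility for 3GPP-LTE Downlink Multi-Antenna Receivers	★ High-Throughput QR Decomposition Design for MIMO Signal Detection in OFDM Systems
★ Design and Evaluation of an Indoor Very-High-Speed Wireless Transmission System	★ Hardware Design and Implementation of a Turbo Decoder for 3GPP LTE-A
★ Gigabit Wireless Communication System for Next-Generation Digital Homes	★ Design of Cooperative Communication for Ultra-Wideband Systems
★ Design of a Baseband Receiver for High Vehicle Speeds in 3GPP-LTE Systems	★ Multi-User MIMO Precoding Algorithm and Key Component Design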

Full text

Available for browsing in the system after 2027-8-31

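Abstract (Chinese)

In this thesis, for multiple-input multiple-output (MIMO) systems, we adopt a hybrid beamforming architecture to reduce computational complexity and hardware cost, and we use deep reinforcement learning for beam tracking. Assuming the channel state information is known, we use the deep deterministic policy gradient (DDPG) algorithm to obtain the analog precoder. The channel capacity serves as the feedback reward that the agent receives from the environment after taking an action, and the variation and convergence trend of this reward indicate whether training has succeeded. To satisfy the magnitude constraint of the phase shifters, we add a normalization function to the output layer of the actor network in DDPG. Exploiting the high tolerance and adaptability of deep reinforcement learning toward its environment, we also perform beam tracking in time-varying channels to obtain the analog precoder and observe its performance. Channel capacity is our performance criterion, so we compare the channel capacity achieved by the proposed DDPG for Hybrid Precoder Algorithm with that of two conventional algorithms: the Fully-Digital Algorithm and the conventional SVD-based Single-User Hybrid Precoder Algorithm. In simulations and tests over 100 channel realizations, under both time-invariant and time-varying channels, the average performance of the DDPG for Hybrid Precoder Algorithm exceeds that of the conventional SVD-based Single-User Hybrid Precoder Algorithm and comes closer to that of the Fully-Digital Algorithm, which has the best performance.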
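The two ingredients named in the abstract, a normalization step that keeps the analog precoder within the phase-shifter magnitude constraint and a channel-capacity reward, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration, not the thesis's implementation: the function names `unit_modulus_precoder` and `spectral_efficiency`, the split of the actor output into real and imaginary halves, the 1/sqrt(n_tx) scaling, the absence of a separate digital precoder, and the array sizes.

```python
import numpy as np


def unit_modulus_precoder(raw_action, n_tx, n_rf):
    """Map a real-valued actor output to a constant-modulus analog precoder.

    Assumed layout (hypothetical): the first n_tx*n_rf values are real parts,
    the remaining n_tx*n_rf values are imaginary parts.
    """
    raw = np.asarray(raw_action, dtype=float)
    half = n_tx * n_rf
    z = (raw[:half] + 1j * raw[half:]).reshape(n_tx, n_rf)
    # Keep only the phase of each entry; the guard avoids division by zero.
    magnitude = np.maximum(np.abs(z), 1e-12)
    return z / magnitude / np.sqrt(n_tx)


def spectral_efficiency(H, F, snr):
    """Return log2 det(I + (snr / n_s) * H F F^H H^H) in bit/s/Hz.

    F is rescaled so that its squared Frobenius norm equals the number of
    streams n_s (a common power normalization, assumed here).
    """
    n_s = F.shape[1]
    F = F * np.sqrt(n_s) / np.linalg.norm(F, "fro")
    HF = H @ F
    gram = np.eye(H.shape[0]) + (snr / n_s) * (HF @ HF.conj().T)
    _, logdet = np.linalg.slogdet(gram)  # natural-log determinant
    return float(logdet / np.log(2))


# Illustrative usage with a random Rayleigh channel and a random "action".
rng = np.random.default_rng(0)
n_rx, n_tx, n_rf = 4, 16, 4
H = (rng.standard_normal((n_rx, n_tx))
     + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)
action = rng.uniform(-1.0, 1.0, size=2 * n_tx * n_rf)
F_rf = unit_modulus_precoder(action, n_tx, n_rf)
reward = spectral_efficiency(H, F_rf, snr=10.0)
print(f"reward = {reward:.3f} bit/s/Hz")
```

In a DDPG loop, a function like `unit_modulus_precoder` would sit at the output of the actor network, and a value like `reward` would be returned by the environment after each action.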

Abstract (English)

In this thesis, a hybrid beamforming architecture is adopted to reduce the computational complexity and hardware cost of multiple-input multiple-output (MIMO) systems, and deep reinforcement learning (DRL) is employed for beam tracking. Given channel state information (CSI), we use the deep deterministic policy gradient (DDPG) algorithm to compute the analog precoder. The channel capacity serves as the reward fed back from the environment after the agent takes an action, where the action corresponds to the entries of the analog precoder. To satisfy the magnitude constraint of the phase shifters, we propose adding a normalization function to the output layer of the actor network in DDPG, which shows good convergence. Furthermore, the beam tracking capability of DDPG is examined in time-varying channels by exploiting the adaptability of DRL to its environment. The simulation results show that, with proper adjustment of the variance of the Ornstein-Uhlenbeck random process, the average performance of the DDPG for hybrid precoder algorithm is better than that of the conventional single-user hybrid precoder algorithm and approaches that of the fully-digital algorithm under both time-invariant and time-varying channels.
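The abstract notes that the variance of the Ornstein-Uhlenbeck (OU) process used for exploration needs proper adjustment. As a rough illustration of what that parameter controls, here is a minimal sketch of a discretized OU noise source of the kind often added to DDPG actions. The class name `OUNoise`, the Euler discretization, and the default values `theta=0.15`, `sigma=0.2`, and `dt=1.0` are illustrative assumptions, not the settings used in the thesis.

```python
import numpy as np


class OUNoise:
    """Discretized Ornstein-Uhlenbeck process for action exploration.

    Update rule (Euler step, assumed here):
        x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, I)
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.size = size
        self.mu = mu
        self.theta = theta  # strength of the pull back toward mu
        self.sigma = sigma  # scale of the random perturbation
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        """Restart the process at its long-run mean."""
        self.x = np.full(self.size, self.mu, dtype=float)

    def sample(self):
        """Advance one step and return a copy of the current noise vector."""
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * np.sqrt(self.dt) * self.rng.standard_normal(self.size)
        self.x = self.x + drift + diffusion
        return self.x.copy()


# Illustrative usage: perturb a (hypothetical) deterministic action.
noise = OUNoise(size=8, sigma=0.2, seed=0)
action = np.zeros(8)
noisy_action = action + noise.sample()
print(noisy_action.shape)
```

A larger `sigma` widens exploration around the actor's output, while a smaller one keeps actions close to the current policy; successive samples are correlated in time, which is the property that distinguishes OU noise from independent Gaussian noise.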

Keywords (Chinese)

★ Deep reinforcement learning
★ Deep deterministic policy gradient algorithm
★ Hybrid beamforming system

Keywords (English)

★ Deep Reinforcement Learning
★ Deep Deterministic Policy Gradient
★ Hybrid Beamforming

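Table of Contents

Abstract (Chinese)
Abstract
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Overview
1.2 Research Motivation
1.3 Thesis Organization
Chapter 2 Deep Reinforcement Learning
2.1 Deep Q Network (DQN)
2.1.1 Neural Network
2.1.2 Deep Q Network Algorithm
2.1.3 ε-Greedy Policy
2.1.4 Experience Replay
2.1.5 Simulation Environment and Results
2.2 Deep Deterministic Policy Gradient (DDPG)
2.2.1 Ornstein-Uhlenbeck Process (OU Process)
2.2.2 Deep Deterministic Policy Gradient Algorithm
2.2.3 Simulation Environment and Results
Chapter 3 System Architecture and Conventional Algorithms
3.1 Hybrid Beamforming System Model
3.2 Fully Digital Precoder and Combiner System
3.3 Single-user Hybrid Precoding System
3.4 Simulation Results and Performance Comparison of Different Algorithms
Chapter 4 Deep Reinforcement Learning for Hybrid Precoding Systems
4.1 Architecture and Algorithm of DDPG for Hybrid Precoding Systems
4.2 Simulation Results and Comparison
4.3 DDPG for the Digital Precoder
Chapter 5 Beam Tracking in Time-Varying Channels
5.1 Definition of Time-Varying Channels
5.2 DDPG for Hybrid Precoding Systems in Time-Varying Channels
5.3 Comparison and Detailed Discussion of Simulation Results
5.3.1 Simulation Results and Comparison
5.3.2 Detailed Discussion of Simulation Results
Chapter 6 Complexity Analysis
6.1 Computational Complexity of the DDPG Neural Network Architecture
6.1.1 Forward Propagation
6.1.2 Backward Propagation
6.1.3 Comparison of Computational Complexity
6.2 Time Complexity of the DDPG Neural Network Architecture
6.2.1 Evaluation of Time Complexity
6.2.2 Analysis of Time Complexity
Chapter 7 Conclusion
References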

Advisor

蔡佩芸 (Pei-Yun Tsai)

Approval date

2022-8-12
