Author: Chi-Kai Huang (黃祺凱)
Department: Department of Electrical Engineering
Thesis title: A Design of Soft-output MMSE Detector with Dual Systolic Array Based on Modified Refinement Jacobi Method for Massive MIMO Systems (基於改良精化雅可比法與雙脈動陣列架構之軟性輸出的巨量多輸入多輸出偵測器設計)
Full text: available for browsing in the system after 2026-8-31

Abstract
As mobile data traffic grows geometrically, massive multiple-input multiple-output (MIMO) is regarded as a key technology for next-generation wireless communication systems. Compared with conventional MIMO, it improves spectral efficiency, link reliability, transmission speed, and beamforming gain. However, as the number of antennas grows, the computational complexity grows exponentially. The minimum mean square error (MMSE) solution can be obtained by linear iteration and approaches the maximum likelihood (ML) solution, but inverting the Gram matrix costs O(N_t^3), where N_t is the number of uplink users (transmit antennas), so hardware implementation becomes harder as users are added.
Hardware architectures for the 128×8 Rayleigh fading channel (128 antennas at the downlink end, 8 antennas at the uplink end) are already mature in the recent literature, but their algorithms often cannot handle more uplink users. This thesis therefore proposes a new algorithm and architecture that targets the 128×32 channel. First, the Accelerated Weighted Neumann Series Expansion is used as a pre-iteration step to obtain a better initial value. Second, a relaxation factor is added to the Refinement of Jacobi algorithm, which reaches near-MMSE performance after only two iterations. Third, a dual systolic array is used in hardware to achieve a high convergence rate and high hardware efficiency. Because matrices are reused within the algorithm and the Gram matrix is symmetric, only part of the Gram matrix has to be computed, which saves considerable hardware resources.
One complete output originally takes 396 clock cycles. To raise throughput, a three-stage pipeline is used, so each stage needs only 132 clock cycles before the next data item can be processed. Finally, the log-likelihood ratio (LLR) is combined with a Gray-coded constellation to simplify the computation of the soft-output values. The chip is implemented in TSMC 40 nm CMOS technology. The core area is 3.04 mm^2, the maximum operating frequency is 510 MHz, the dynamic power consumption is 752 mW, and the throughput reaches 742 Mbps.

Keywords
★ Massive Multi-Input-Multi-Output
★ Minimum Mean Square Error
★ Gram matrix
★ Accelerated Weighted Neumann Series Expansion
★ Refinement of Jacobi
★ Dual Systolic array

Table of Contents
Abstract (Chinese)
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Background
1.2 Research Motivation
1.3 Thesis Organization
Chapter 2 Introduction to Massive MIMO Systems
2.1 Single-Input Single-Output Systems
2.2 Single-Input Multiple-Output Systems
2.3 Multiple-Input Single-Output Systems
2.4 Point-to-Point MIMO Systems
2.5 Multi-User MIMO Systems
2.6 Massive MIMO Systems
2.6.1 Channel Capacity
2.6.2 Channel Hardening
2.6.3 Diagonally Dominant Matrix
2.7 Linear Detection Algorithms for Spatial Multiplexing
2.7.1 Maximum Ratio Combining
2.7.2 Zero Forcing
2.7.3 Minimum Mean Square Error
2.8 Nonlinear Detection Algorithms for Spatial Multiplexing
2.8.1 Maximum Likelihood Detection
2.8.2 Depth-First
2.8.3 Breadth-First
2.9 Detector Demodulation Output
2.9.1 Hard Demodulation Output
2.9.2 Soft Demodulation Output
2.9.3 Convolutional Encoder
2.9.4 Viterbi Decoder
Chapter 3 Soft-Output Massive MIMO Detector
3.1 System Architecture
3.2 Initial-Value Algorithms
3.2.1 Neumann Series Expansion
3.2.2 Newton-Raphson Iteration Method
3.2.3 Accelerated Weighted Neumann Series
3.3 Stationary Iterative Methods
3.3.1 Splitting Matrix
3.3.2 Jacobi Method
3.3.3 Gauss-Seidel Method
3.3.4 Damped Jacobi Method
3.3.5 Successive Over-Relaxation
3.3.6 Richardson Method
3.4 Stair Matrix
3.5 Iterative Refinement
3.6 Simplification of the Signal-to-Interference-plus-Noise Ratio
3.7 Soft Demodulation Output Value Generator
3.8 Performance Analysis
Chapter 4 Hardware Architecture Design
4.1 Hardware Design Specifications
4.2 Basic Circuits
4.3 Preprocessing Circuit
4.4 Reciprocal Circuit
4.4.1 Look-Up Table Method
4.4.2 Newton's Method
4.4.3 CORDIC
4.5 Symmetric Matrix Reuse
4.6 Sign Contraction
Chapter 5 Chip Implementation
5.1 Design Flow
5.2 Fixed-Point Simulation Analysis
5.3 Simulation and Verification
5.4 Chip Design Results
5.5 Chip Specifications and Comparison with Other Works
Chapter 6 Conclusion and Future Work
References
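The outline lists the Jacobi and damped Jacobi methods (Sections 3.3.2 and 3.3.4) among the stationary iterations the thesis builds on. The following is a minimal Python sketch of textbook damped Jacobi applied to the MMSE system, assuming an i.i.d. Rayleigh channel and QPSK symbols. It is not the thesis's modified refinement Jacobi method and does not use the Accelerated Weighted Neumann Series initial value. All function names are hypothetical, and the relaxation factor 0.6 is only an illustrative value inside the range swept in Figure 3-9.

```python
import numpy as np


def regularized_gram(H, noise_var):
    """Return A = H^H H + noise_var * I, the matrix the MMSE detector must invert."""
    n_tx = H.shape[1]
    return H.conj().T @ H + noise_var * np.eye(n_tx)


def exact_mmse(H, y, noise_var):
    """Reference MMSE estimate obtained with a direct solve (cubic cost in N_t)."""
    return np.linalg.solve(regularized_gram(H, noise_var), H.conj().T @ y)


def damped_jacobi_mmse(H, y, noise_var, omega=0.6, iters=2):
    """Textbook damped Jacobi: x <- x + omega * D^-1 (b - A x), started from D^-1 b."""
    A = regularized_gram(H, noise_var)
    b = H.conj().T @ y                   # matched-filter output
    d_inv = 1.0 / np.real(np.diag(A))    # the diagonal of a Hermitian matrix is real
    x = d_inv * b                        # diagonal-only initial guess (an assumption)
    for _ in range(iters):
        x = x + omega * d_inv * (b - A @ x)
    return x


def random_uplink(n_rx=128, n_tx=32, noise_var=0.1, seed=0):
    """Draw an i.i.d. Rayleigh channel, QPSK symbols, and the noisy received vector."""
    rng = np.random.default_rng(seed)
    shape = (n_rx, n_tx)
    H = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)
    s = rng.choice([-1.0, 1.0], n_tx) + 1j * rng.choice([-1.0, 1.0], n_tx)
    n = np.sqrt(noise_var / 2) * (rng.standard_normal(n_rx) + 1j * rng.standard_normal(n_rx))
    return H, H @ s + n


def relative_error(x, x_ref):
    """Euclidean distance to the reference, normalised by the reference norm."""
    return np.linalg.norm(x - x_ref) / np.linalg.norm(x_ref)


H, y = random_uplink()
x_ref = exact_mmse(H, y, noise_var=0.1)
for iters in (2, 20, 200):
    x = damped_jacobi_mmse(H, y, noise_var=0.1, iters=iters)
    print(iters, relative_error(x, x_ref))
```

In this plain form the iteration approaches the direct solve only gradually on a 128×32 channel, which gives some intuition for why the abstract pairs a better initial value with a refined iteration to reach near-MMSE performance in two steps.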
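List of Figures
Figure 1-1 Global mobile data traffic (EB per month)
Figure 1-2 Total number of active device connections worldwide
Figure 1-3 Transmit diversity
Figure 1-4 Spatial multiplexing
Figure 2-1 SISO, SIMO, MISO, and MIMO
Figure 2-2 Point-to-point MIMO system
Figure 2-3 Uplink multi-user MIMO system
Figure 2-4 Multiple access channel (MAC) model
Figure 2-5 Inter-antenna interference in small-area MIMO
Figure 2-6 Channel capacity comparison
Figure 2-7 Normalized weight distribution of the inverse of W for Nt=8 and different Nr
Figure 2-8 Sphere decoding
Figure 2-9 Depth-first tree decoding
Figure 2-10 K-best algorithm
Figure 2-11 Fixed-complexity sphere decoder
Figure 2-12 Simplified fixed-complexity sphere decoder
Figure 2-13 Probability distribution of the received signal
Figure 2-14 Soft demodulation output
Figure 2-15 171o 133o convolutional encoder architecture
Figure 2-16 Punctured coding for code rate 2/3
Figure 2-17 Trellis diagram (code rate 1/2, constraint length 3)
Figure 2-18 Viterbi decoder (a) Hard: Hamming distance (b) Soft: Euclidean distance
Figure 2-19 Trellis diagram A
Figure 2-20 Trellis diagram B
Figure 2-21 Traceback path
Figure 2-22 Simulation results with and without the Viterbi decoder, for hard and soft demodulation
Figure 3-1 Linear iterative algorithms approximating the MMSE solution
Figure 3-2 MIMO system model
Figure 3-3 Marchenko-Pastur distribution for a flat Rayleigh fading channel [23]
Figure 3-4 (a) Type-I stair matrix (b) Type-II stair matrix
Figure 3-5 Inverse computation of the stair matrix
Figure 3-6 Bit error rate comparison of stair matrices
Figure 3-7 Bit error rate comparison of stair matrices (enlarged)
Figure 3-8 Bit error rate for different weights (ω=0~2)
Figure 3-9 Bit error rate for different weights (ω=0.5~0.7)
Figure 3-10 Bit error rate comparison of various improved Jacobi methods
Figure 3-11 Gray-coded 16-QAM constellation
Figure 3-12 Soft demodulation of the first bit
Figure 3-13 Simplified transfer function of λ_(b=1)(s_i)
Figure 3-14 Bit error rate comparison of the algorithms, 8×128 channel, 64-QAM
Figure 3-15 Bit error rate comparison of the algorithms, 16×128 channel, 64-QAM
Figure 3-16 Bit error rate comparison of the algorithms, 32×128 channel, 64-QAM
Figure 4-1 Block diagram of the soft-output massive MIMO detector
Figure 4-2 Internal system block diagram
Figure 4-3 Block diagram of processing element A
Figure 4-4 GRAM block diagram
Figure 4-5 GRAM_P4 block diagram
Figure 4-6 Accumulator block diagram
Figure 4-7 Computation of D^(-1) ED^(-1) E and D^(-1) ED^(-1)
Figure 4-8 Block diagram of processing element B and DD
Figure 4-9 Block diagram of processing element C
Figure 4-10 DEDE block diagram
Figure 4-11 Block diagram of processing elements D, E, and F
Figure 4-12 Multiplexer block diagram
Figure 4-13 Initial-value scaling flowchart
Figure 4-14 Diagonal values of the Gram matrix
Figure 4-15 Reciprocal approximation error of the look-up table method
Figure 4-16 Approximation process of Newton's method
Figure 4-17 Reciprocal circuit architecture using Newton's method
Figure 4-18 Reciprocal circuit architecture using Chebyshev polynomials with Newton's method
Figure 4-19 Reciprocal approximation error of Newton's method
Figure 4-20 CORDIC reciprocal circuit block diagram (a)
Figure 4-21 CORDIC reciprocal circuit block diagram (b)
Figure 4-22 CORDIC reciprocal circuit block diagram (c)
Figure 4-23 CORDIC reciprocal circuit block diagram (d)
Figure 4-24 Code of the CORDIC prediction circuit
Figure 4-25 H and H^H after real-valued decomposition (a) H matrix (b) H^H matrix
Figure 4-26 G matrix
Figure 4-27 Gram matrix computation flow
Figure 4-28 Generalization of symmetric matrix reuse
Figure 4-29 Code of the PEC before improvement
Figure 4-30 Code of the PEC after bit reduction
Figure 4-31 Core area of the first chip version
Figure 4-32 Core area of the final chip version
Figure 5-1 Circuit design flowchart
Figure 5-2 Circuit design timing diagram
Figure 5-3 Overall circuit design architecture
Figure 5-4 Circuit design architecture (upper part)
Figure 5-5 Circuit design architecture (lower part)
Figure 5-6 Signal quantization steps
Figure 5-7 Bit error rate of floating point and fixed point
Figure 5-8 Bit error rate loss between floating point and fixed point at SNR=14
Figure 5-9 Matlab simulation results
Figure 5-10 Behaviour simulation results
Figure 5-11 Post-route simulation results
Figure 5-12 Fault coverage
Figure 5-13 Chip layout (ICC LayoutWindow)
Figure 5-14 Enlarged top-left corner of the chip layout (ICC LayoutWindow)
Figure 5-15 Chip layout (Laker)
Figure 5-16 Enlarged top-left corner of the chip layout (Laker)
Figure 5-17 LVS verification result
Figure 5-18 Core circuit
Figure 5-19 Pie chart of core area by component
Figure 5-20 Pie chart of core power consumption by component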
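The throughput quoted in the abstract can be cross-checked against its own clock figures. The short Python sketch below assumes 64-QAM (6 bits per symbol, the modulation named in the captions of Figures 3-14 to 3-16) and one detected symbol vector for 32 users per pipeline interval. Both are assumptions made for this check, not statements taken from the thesis, and the function names are hypothetical.

```python
def pipelined_interval(total_cycles, stages):
    """Cycles between successive outputs when the work is split evenly over the stages."""
    return total_cycles // stages


def throughput_mbps(clock_hz, cycles_per_vector, users, bits_per_symbol):
    """Detected bits per second, in Mbps, when one symbol vector completes per interval."""
    vectors_per_second = clock_hz / cycles_per_vector
    return vectors_per_second * users * bits_per_symbol / 1e6


# 396 cycles split over a three-stage pipeline gives the 132-cycle interval of the abstract.
interval = pipelined_interval(total_cycles=396, stages=3)

# 510 MHz clock, 32 uplink users, 6 bits per symbol (64-QAM is an assumption).
rate = throughput_mbps(clock_hz=510e6, cycles_per_vector=interval,
                       users=32, bits_per_symbol=6)
print(interval, round(rate, 1))
```

Under these assumptions the result is about 741.8 Mbps, which is consistent with the 742 Mbps reported in the abstract.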
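List of Tables
Table 1-1 Comparison of linear and nonlinear algorithms
Table 2-1 Total number of symbol vectors for different antenna counts and modulations
Table 3-1 Complexity of the Neumann approximation for different values of K [19]
Table 3-2 Classification of common stationary iterative methods and Krylov subspace methods
Table 3-3 Splitting matrices of conventional linear iterative algorithms
Table 3-4 Number of multiplications for different iterative methods
Table 3-5 Stair matrix variables
Table 3-6 λ_b (s_i) corresponding to each bit position of s_i
Table 4-1 Input and output signals
Table 5-1 Maximum operating speed at each stage of chip synthesis
Table 5-2 Area and power share of each core component
Table 5-3 Chip specifications
Table 5-4 Hardware comparison (part 1)
Table 5-5 Hardware comparison (part 2)

References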
[1] M. Chiang and T. Zhang, “Fog and IoT: An Overview of Research Opportunities,” IEEE Internet of Things Journal, pp. 854-864, Dec. 2016.
[2] J. G. Andrews et al., “What Will 5G Be?,” IEEE Journal on Selected Areas in Communications, pp. 1065-1082, June 2014.
[3] F. Rusek et al., “Scaling Up MIMO: Opportunities and Challenges with Very Large Arrays,” IEEE Signal Processing Magazine, Jan. 2013.
[4] E. Perahia and R. Stacey, Next Generation Wireless LANs: Throughput, Robustness, and Reliability in 802.11n, Cambridge University Press, Sep. 2008.
[5] C. Qin, Y. Miao, Y. Gao, J. Chen, J. Zhang and A. A. Glazunov, “Simulation-based Investigation on Spatial Channel Hardening of Massive MIMO in Different Indoor Scenarios and with Different Array Topologies,” 2020 XXXIIIrd General Assembly and Scientific Symposium of the International Union of Radio Science, pp. 1-4, 2020.
[6] H. Q. Ngo and E. G. Larsson, “No Downlink Pilots Are Needed in TDD Massive MIMO,” IEEE Transactions on Wireless Communications, pp. 2921-2935, May 2017.
[7] I. A. Khoso et al., “A Fast-Convergent Detector Based on Joint Jacobi and Richardson Method for Uplink Massive MIMO Systems,” 2019 28th Wireless and Optical Communications Conference (WOCC), pp. 1-5, 2019.
[8] M. A. Albreem, M. H. Alsharif and S. Kim, “A Robust Hybrid Iterative Linear Detector for Massive MIMO Uplink Systems,” MDPI, 21 February 2020.
[9] C. Studer, S. Fateh and D. Seethaler, “ASIC Implementation of Soft-Input Soft-Output MIMO Detection Using MMSE Parallel Interference Cancellation,” IEEE Journal of Solid-State Circuits, pp. 1754-1765, July 2011.
[10] B. Hassibi and H. Vikalo, “On the Sphere-Decoding Algorithm I. Expected Complexity,” IEEE Transactions on Signal Processing, Aug. 2005.
[11] X. Chu and J. McAllister, “Software-Defined Sphere Decoding for FPGA-Based MIMO Detection,” IEEE Transactions on Signal Processing, pp. 6017-6026, Nov. 2012.
[12] L. Liu, G. Peng, and S. Wei, Massive MIMO Detection Algorithm and VLSI Architecture, Singapore: Springer, 2019.
[13] J. Jalden, L. G. Barbero, B. Ottersten and J. S. Thompson, “Full Diversity Detection in MIMO Systems with a Fixed-Complexity Sphere Decoder,” 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, pp. III-49-III-52, 2007.
[14] C. Xiong, X. Zhang, K. Wu and D. Yang, “A simplified fixed-complexity sphere decoder for V-BLAST systems,” IEEE Communications Letters, pp. 582-584, August 2009.
[15] LAN/MAN Standards Committee, “IEEE Standard for Information technology - Telecommunications and information exchange between systems - Local and metropolitan area networks - Specific requirements,” USA, Oct. 2009.
[16] G. D. Forney, “The viterbi algorithm,” Proceedings of the IEEE, pp. 268-278, March 1973.
[17] B. Moision, “A Truncation-Depth Rule of Thumb for Convolutional Codes,” ITA Workshop, pp. 555-557, 2008.
[18] X. Zhao, Z. Li, S. Xing, Y. Liu, Q. Wu and B. Li, “An Improved Jacobi-Based Detector for Massive MIMO Systems,” MDPI, 5 May 2019.
[19] J. Chen, Z. Zhang, H. Lu, J. Hu and G. E. Sobelman, “An Intra-Iterative Interference Cancellation Detector for Large-Scale MIMO Communications Based on Convex Optimization,” IEEE Transactions on Circuits and Systems I:Regular Papers, pp. 2062-2072, Nov. 2016.
[20] X. Gao, L. Dai, Y. Hu, Z. Wang and Z. Wang, “Matrix inversion-less signal detection using SOR method for uplink large-scale MIMO systems,” 2014 IEEE Global Communications Conference, pp. 3291-3295, 2014.
[21] Q. Deng, X. Liang, X. Wang, M. Huang, C. Dong and Y. Zhang, “Fast Converging Iterative Precoding for Massive MIMO Systems: An Accelerated Weighted Neumann Series-Steepest Descent Approach,” IEEE Access, pp. 50244-50255, 2020.
[22] J. Minango, C. de Almeida and C. D. Altamirano, “Low-complexity MMSE detector for massive MIMO systems based on Damped Jacobi method,” Proc. IEEE Int. Symp. Pers., Indoor Mobile Radio Commun., Montreal, QC, Canada, pp. 1-5, Oct. 2017.
[23] H. J. B. Costa and V. O. Roda, “A scalable soft Richardson method for detection in a massive MIMO system,” Przeglad Elektrotechniczny, pp. 199-203, 2016.
[24] H. Wu, B. Shen, S. Zhao and P. Gong, “Low-Complexity Soft-Output Signal Detection Based on Improved Kaczmarz Iteration Algorithm for Uplink Massive MIMO System,” MDPI, 11 March 2020.
[25] G. Stewart, Matrix Algorithms Volume I: Basic Decompositions, Society for Industrial and Applied Mathematics, 1998.
[26] M. A. Albreem et al., “Low Complexity Linear Detectors for Massive MIMO: A Comparative Study,” IEEE Access, pp. 45740-45753, 2021.
[27] P. C. Tsai, K. K. Lee and C. Chen, “An Eigen-based Matrix Inverse Approximation Scheme with Stair Matrix Splitting for Massive MIMO Systems,” 2018 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 378-381, 2018.
[28] M. Albreem, M. Juntti and S. Shahabuddin, “Efficient initialisation of iterative linear massive MIMO detectors using a stair matrix,” Electronics Letters, pp. 50-52, Jan. 2020.
[29] R. Chataut, R. Akl and M. Robaei, “Accelerated and Preconditioned Refinement of Gauss-Seidel Method for Uplink Signal Detection in 5G Massive MIMO Systems,” 2020 10th Annual Computing and Communication Workshop and Conference (CCWC), Las Vegas, NV, USA, pp. 0083-0089, 2020.
[30] M. Zhang and S. Kim, “Evaluation of MMSE-Based Iterative Soft Detection Schemes for Coded Massive MIMO System,” IEEE Access, pp. 10166-10175, 2019.
[31] Y. Hama and H. Ochiai, “A low-complexity matched filter detector with parallel interference cancellation for massive MIMO systems,” 2016 IEEE 12th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), pp. 1-6, 2016.
[32] A. Yu et al., “Efficient Successive Over Relaxation Detectors for Massive MIMO,” IEEE Transactions on Circuits and Systems I: Regular Papers, pp. 2128-2139, June 2020.
[33] G. Peng, L. Liu, S. Zhou, S. Yin and S. Wei, “A 1.58 Gbps/W 0.40 Gbps/mm2 ASIC Implementation of MMSE Detection for 128×8 64-QAM Massive MIMO in 65 nm CMOS,” IEEE Transactions on Circuits and Systems I: Regular Papers, pp. 1717-1730, May 2018.
[34] M. Wu, B. Yin, G. Wang, C. Dick, J. R. Cavallaro and C. Studer, “Large-Scale MIMO Detection for 3GPP LTE: Algorithms and FPGA Implementations,” IEEE Journal of Selected Topics in Signal Processing, pp. 916-929, Oct. 2014.
[35] S. Roy, Advanced Digital System Design - A Practical Guide to Verilog Based FPGA and ASIC Implementation, Jan. 2021.
[36] A. Habegger, A. Stahel, J. Goette and M. Jacomet, “An Efficient Hardware Implementation for a Reciprocal Unit,” 2010 Fifth IEEE International Symposium on Electronic Design, Test & Applications, pp. 183-187, 2010.
[37] J. Tu, M. Lou, J. Jiang, D. Shu and G. He, “An Efficient Massive MIMO Detector Based on Second-Order Richardson Iteration: From Algorithm to Flexible Architecture,” IEEE Transactions on Circuits and Systems I: Regular Papers, pp. 4015-4028, Nov. 2020.
[38] L. Liu et al., “Energy- and Area-Efficient Recursive-Conjugate-Gradient-Based MMSE Detector for Massive MIMO Systems,” IEEE Transactions on Signal Processing, pp. 573-588, 2020.
[39] H. Prabhu, J. N. Rodrigues, L. Liu and O. Edfors, “3.6 A 60pJ/b 300Mb/s 128×8 Massive MIMO precoder-detector in 28nm FD-SOI,” 2017 IEEE International Solid-State Circuits Conference (ISSCC), pp. 60-61, 2017.
[40] B. Yin, M. Wu, G. Wang, C. Dick, J. R. Cavallaro and C. Studer, “A 3.8Gb/s large-scale MIMO detector for 3GPP LTE-Advanced,” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3879-3883, 2014.

Advisor: Muh-Tian Shiue (薛木添)
Approval date: 2021-8-23