Abstract: | This dissertation presents three main research works: 1. 
the design and implementation of high-throughput, hardware-efficient MIMO detectors, 2. the improvement of imperfect channel state information (CSI) and a processor design for QRD-based MIMO-OFDM precoding systems, and 3. the design and implementation of a highly configurable generalized matrix decomposition processor (GMDP) for MIMO precoding. In our first work, we propose a layer-dependent K-best search algorithm that reduces MIMO detector complexity while maintaining reasonable bit error rate (BER) performance. We also propose a hardware-efficient, high-throughput K-best search architecture based on the on-demand expansion (ODE) algorithm. Combining the layer-dependent K-best algorithm with the ODE architecture, the proposed MIMO detector delivers one MIMO detection result per clock cycle. Based on this architecture, a 4 × 4 detector IC was fabricated and measured. According to the measurement results, it achieves a throughput of 4.08 Gbps and a normalized hardware efficiency (NHE) of 17.6 Mbps/kilogate. For the 8 × 8 detector, an intra-layer folding scheme is proposed that trades excess throughput for lower hardware complexity. Its post-layout simulation shows a throughput of 4.37 Gbps and an NHE of 1713 Mbps/mm². Compared with conventional K-best MIMO detectors and previous works, our designs are more power-efficient and hardware-efficient. In addition, we analyze the scalability of the proposed 8 × 8 MIMO detector architecture with respect to the number of antennas, the constellation size, and the value of K, and investigate the corresponding throughput and gate count. In our second work, we first analyze the impact of one kind of imperfect CSI, channel estimation noise, on QRD-based MIMO-OFDM precoding systems. Building on SndChFilt, a channel estimation filtering scheme that reduces noise in the sounding phase, we analyze channel estimation filtering in the beamforming phase (the BeamChFilt scheme) and show its difficulties. 
Then, we propose two other beamforming-phase channel estimation noise reduction schemes, BeamChEqual and BeamChReal, for equal rate (ER)-QRD MIMO-OFDM precoding systems. The BeamChEqual scheme reduces CSI noise by exploiting the equal channel gain property of the effective channel matrices under ER-QRD beamforming. The BeamChReal scheme reduces noise by exploiting the real-valued channel gain property. Both schemes have low computational complexity, require no extra communication protocol overhead, and are compatible with the IEEE 802.11 beamforming packet format. The proposed precoding scheme, ER-sorted QRD-TH (Tomlinson-Harashima) precoding with BeamChEqual and BeamChReal together with SndChFilt, achieves an SNR (signal-to-noise ratio) improvement of approximately 4 dB in 1-user and 2-user 8 × 8 MIMO precoding BER simulations. We also compare the pros and cons of the proposed ER-SQRD-THP, AMBER-SVD precoding, and GMD-THP. Simulations show that ER-SQRD-THP with the proposed CSI improvement schemes achieves good BER performance. We propose a processor architecture that incorporates these CSI improvement schemes. Finally, we analyze the hardware design considerations for CSI filtering and feedback. In our third work, we propose an improved generalized matrix decomposition processor (GMDP). It supports four 4 × 4 complex matrix decomposition algorithms: QR decomposition (QRD), singular value decomposition (SVD), eigenvalue decomposition (EVD), and geometric mean decomposition (GMD). It uses a memory-based architecture with an array of 16 configurable processing elements (PEs). Each PE contains one CORDIC unit, which is used to obtain the value and basis matrices of the decomposition. The basis matrices are computed concurrently with the value matrix by applying the inverses of the operations used in the value matrix computation. In the improved architecture, each CORDIC is customized to the set of operations it performs. Our GMDP implementation achieves throughputs of 9.47M, 0.94M, 0.88M, and 2.8M matrices per second for 4 × 4 complex QRD, EVD, SVD, and GMD, respectively. 
The customized CORDIC architecture reduces the gate count by 12%. In summary, this dissertation presents the designs of three high-quality processors for MIMO-OFDM systems: a high-throughput MIMO detection processor, a CSI quality improvement processor for MIMO-OFDM precoding, and a highly configurable generalized matrix decomposition processor. |
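The abstract states that each processing element of the GMDP contains one CORDIC. As an illustration only, the following minimal sketch shows CORDIC vectoring mode, the standard operation that rotates a vector (x, y) onto the x-axis and thereby yields the magnitude and angle commonly needed for a Givens rotation in CORDIC-based QR decomposition. The function name cordic_vectoring, the default of 16 iterations, and the use of floating point in place of fixed-point shifts are assumptions made for this sketch; they are not details of the GMDP design.

```python
import math


def cordic_vectoring(x, y, iterations=16):
    """Rotate (x, y) onto the positive x-axis using CORDIC micro-rotations.

    Returns (magnitude, angle_in_radians). Assumes x > 0 so that the
    input angle lies inside the CORDIC convergence range.
    Floating point is used here for clarity; hardware would use
    fixed-point shifts and adds with a precomputed gain constant.
    """
    angle = 0.0
    gain = 1.0
    for i in range(iterations):
        step = 2.0 ** -i                 # a right shift by i bits in hardware
        d = -1.0 if y > 0 else 1.0       # choose the direction that drives y to 0
        x, y = x - d * step * y, y + d * step * x
        angle -= d * math.atan(step)     # accumulate the angle that was removed
        gain *= math.sqrt(1.0 + step * step)  # each micro-rotation scales the vector
    return x / gain, angle


if __name__ == "__main__":
    magnitude, angle = cordic_vectoring(3.0, 4.0)
    print(magnitude, angle)  # close to 5 and atan2(4, 3)
```

For the input (3, 4) the returned magnitude is close to 5 and the angle is close to atan2(4, 3). Applying the same sequence of rotation directions to the other element pairs of two matrix rows would zero one entry while keeping the rows consistent, which is the usual way a vectoring CORDIC drives rotation-mode CORDICs in a QR array.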