This thesis proposes an improved precoding technique for 8x8 multi-user multiple-input multiple-output (MU-MIMO) systems and completes its hardware design. The transmitter and receiver of this wireless communication system support constellation mapping up to 16QAM, and the MIMO system supports 8 antennas at both the transmitter and the receiver. Precoding is carried out using channel information fed back to the transmitter. To exploit the spatial diversity available in MU-MIMO, we replace conventional precoding with block-based precoding: QR decomposition removes multiple access interference (MAI), and block Tomlinson-Harashima precoding (block-THP) decomposes the modulo index and removes the remaining inter-symbol interference (ISI). We also propose a block-based sorting method that balances the energy distribution of the diagonal elements for each user. Finally, a modified sphere decoder improves decoding performance in the block-precoded system while keeping the same complexity as a conventional decoder.

For the hardware implementation, we adopt a pipelined architecture to achieve high throughput, and we build the 8x8 sorted QR hardware by stacking identical 4x4 sorted QR hardware units, with support for both per-layer and per-block sorting. We also design a reverse-input hardware architecture so that the same hardware can be used more efficiently. The QR decomposition uses the Givens rotation algorithm, realized with CORDIC. The implemented design has a total gate count of 1098K, an area of 2605 um x 2605 um, and a throughput of 9.46 MQRD/s, and it supports multiple modes such as multi-layer sorting and block sorting.

This thesis presents a multi-user MIMO transceiver design with a block-based decomposition and precoding scheme. To exploit spatial diversity, we propose using block-diagonal QR decomposition (BD-QRD) to decompose the channel matrix. To eliminate multiple access interference (MAI), we further propose combining block Tomlinson-Harashima precoding (B-THP) with BD-QRD, so that the equivalent channel matrix after precoding at the transmitter becomes block diagonal. In addition, block-based sorting is adopted to balance the energy spread among all spatial pipes for BD-QRD, which further improves performance. With these decomposition and precoding techniques at the transmitter, sphere decoding (SD) can be employed at the receiver with a small revision that constrains the search space. We show that the proposed BD-QRD, B-THP, and constrained SD for multi-user MIMO systems, which retain spatial diversity, outperform the conventional QRD-THP and BD-SVD in the SNR region of interest.

In the hardware design, we use a pipelined architecture to achieve high throughput. To reduce complexity, the 8x8 sorted QR unit is built from 4x4 sorted QR units.
Two sorting strategies are supported: per-layer sorting and block-based sorting. Givens rotation is adopted for the QRD because it pipelines well with CORDIC operations. We have implemented the proposed system in TSMC 0.90um CMOS technology. The gate count is 1098K, the throughput is 9.46 MQRD/s, and the chip area is 2605 um x 2605 um.
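
To make the Givens-rotation-with-CORDIC step concrete, the following is a minimal software sketch, not the thesis hardware. It assumes a real-valued matrix (the actual channel matrix is complex), omits sorting and pipelining, and uses floating-point arithmetic in place of fixed-point shifts. The function names (`cordic_gain`, `cordic_vectoring`, `cordic_rotate`, `givens_qr_r`) and the iteration count `N_ITER` are hypothetical choices made for illustration.

```python
import math

N_ITER = 24  # assumed number of CORDIC micro-rotations (not from the thesis)


def cordic_gain(n_iter):
    """Constant magnitude growth accumulated by n_iter micro-rotations."""
    gain = 1.0
    for i in range(n_iter):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


def cordic_vectoring(x, y, n_iter=N_ITER):
    """Rotate (x, y) onto the positive x-axis using shift-and-add steps.

    Returns the rotated pair plus the control information (flip, dirs)
    needed to replay the same rotation on other pairs.
    """
    flip = x < 0  # 180-degree pre-rotation into the right half-plane
    if flip:
        x, y = -x, -y
    dirs = []
    for i in range(n_iter):
        d = -1 if y > 0 else 1  # always rotate toward y = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        dirs.append(d)
    gain = cordic_gain(n_iter)
    return x / gain, y / gain, flip, dirs


def cordic_rotate(x, y, flip, dirs):
    """Replay a recorded rotation (rotation mode) on another pair."""
    if flip:
        x, y = -x, -y
    for i, d in enumerate(dirs):
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
    gain = cordic_gain(len(dirs))
    return x / gain, y / gain


def givens_qr_r(a):
    """Return the (approximately) upper-triangular R of a real matrix."""
    r = [row[:] for row in a]
    n_rows, n_cols = len(r), len(r[0])
    for col in range(n_cols):
        # Annihilate entries below the diagonal, bottom-up, pairing
        # each row with the row directly above it.
        for row in range(n_rows - 1, col, -1):
            x, y, flip, dirs = cordic_vectoring(r[row - 1][col], r[row][col])
            # y is a tiny residual left by the finite iteration count.
            r[row - 1][col], r[row][col] = x, y
            # Apply the same rotation to the rest of the two rows.
            for k in range(col + 1, n_cols):
                r[row - 1][k], r[row][k] = cordic_rotate(
                    r[row - 1][k], r[row][k], flip, dirs
                )
    return r
```

In this sketch the vectoring step decides the micro-rotation directions and the rotation step replays them on the remaining entries of the two rows, which is the usual reason Givens rotation maps naturally onto a CORDIC pipeline. The entries below the diagonal are driven close to zero rather than exactly to zero, with the residual shrinking as the iteration count grows.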