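Abstract: | Conventional sparse code multiple access (SCMA) receivers use codewords with complex dimensions to carry user information and improve resource utilization. However, the complexity of the conventional message propagation algorithm receiver grows exponentially with the codebook size, which makes hardware implementation a major challenge. Building on the expectation propagation algorithm (EPA), this work therefore proposes the group-approximate EPA (GA-EPA) to handle detection at the receiver. Three techniques are used to reduce the computational complexity of the algorithm. First, approximate computation replaces part of the calculation with a maximum-value selection and moves the operations to the logarithmic domain, which removes about 30% of the multiplications and about 60% of the divisions. Second, because this work uses a 16-point codebook, applying the approximate EPA directly leads to poor performance; a probability grouping method is therefore used, in which codewords are grouped separately by their real and imaginary parts and the grouped codeword probabilities are used in the computation. This achieves performance similar to keeping the four largest values while also reducing the complexity of the algorithm. Finally, QR decomposition halves the amount of message passing and also cuts the computation at the variable nodes and resource nodes by about 1/2. This work also presents a hardware design. The implemented architecture is an uplink system with 4 iterations in total, 6 users, 4 resource elements, 4 receive antennas, and a 16-point codebook. In addition to the improvements above, the logarithmic-domain lookup table is split into smaller tables by exploiting the properties of the exponential function, and a customized floating-point format reduces the hardware area by about 17%. According to synthesis results in a 40 nm CMOS process, the design reaches a maximum operating frequency of 166.67 MHz and a throughput of 363.64 Mbps, and it produces soft-decision outputs. Compared with implementations in the literature that use only a 4-point codebook, our design has a smaller gate count, a 13x improvement in normalized decoding throughput, and better hardware efficiency.;The conventional Sparse Code Multiple Access (SCMA) receiver employs codewords with complex dimensions to transmit user information and improve resource utilization. However, the complexity of the message propagation algorithm increases exponentially with the size of the codebook and the number of antennas, making hardware implementation a significant challenge. Therefore, based on the expectation propagation algorithm (EPA), we propose the group-approximate EPA (GA-EPA) for complexity reduction. Approximate calculations based on maximum-value selection are carried out in the logarithmic domain, which removes approximately 30% of the multiplications and about 60% of the divisions. Because a 16-point codebook is adopted, a probability grouping method is also proposed, in which codewords are grouped to generate the associated probabilities. This approach not only achieves performance similar to using the top four maximum values but also reduces the computational complexity. Finally, QR decomposition is employed to halve the amount of message passing, and the computational load on the variable nodes and resource nodes is also reduced by about 1/2. 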
We then design an SCMA hardware detector for an uplink system with 6 users, 4 resource elements, and 4 receive antennas, using a 16-point codebook and 4 decoding iterations. In addition to applying probability grouping and QR decomposition for complexity reduction and throughput enhancement, we use a customized floating-point datapath to decrease the hardware area by approximately 17% and exploit the properties of the exponential function to split a large lookup table into two smaller tables. According to synthesis results in 40 nm CMOS technology, our design generates soft decisions and achieves a throughput of 363.64 Mbps at a maximum operating frequency of 166.67 MHz. Compared with existing designs that support only a 4-point codebook, our design has a smaller gate count, a 13x improvement in normalized throughput, and better normalized hardware efficiency. |