Mitigating Aging Effects in Systolic Arrays: A Load Balancing Approach for Reliability Improvement

Name

Yi-Chen Ho (何宜真)

Department

Department of Electrical Engineering

Thesis title

Mitigating Aging Effects in Systolic Arrays: A Load Balancing Approach for Reliability Improvement

Full text available in the thesis system after 2027-7-16

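Abstract (translated from the Chinese)

Systolic-array-based accelerators are regarded as one of the most promising architectures for accelerating data-intensive deep neural network (DNN) computation. However, the fixed size of a systolic array can cause significant workload imbalance: the utilization of processing elements (PEs) can vary widely, and the gap between the most-used and least-used PEs in an array can reach 14x. In addition, aging effects such as hot carrier injection (HCI) and bias temperature instability (BTI) can introduce timing errors or functional failures over time, particularly in heavily used PEs, so aging of the most heavily loaded PEs can ultimately shorten the lifetime of the entire array. To address these challenges, this thesis presents an aging-aware systolic array design framework consisting of two key steps, a balance-loaded systolic array datapath design and an aging-aware data mapping policy, that together mitigate aging-induced accuracy degradation. Experiments show that after ten years of aging, our system improves prediction accuracy by 60.7% over a conventional systolic-array-based AI accelerator, while raising systolic array utilization to as much as 2.4 times.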
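
The workload imbalance described in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: the function name `pe_usage`, the layer shapes, the 32x32 array size, and the top-left tile anchoring are all assumptions chosen for illustration. It counts how many weight tiles land on each PE when weight matrices are tiled onto a fixed-size weight-stationary array.

```python
import numpy as np

def pe_usage(layers, rows, cols):
    """Count, for each PE, how many weight tiles occupy it.

    layers: list of (k, n) weight-matrix shapes (hypothetical workload).
    Assumption: every tile is anchored at the top-left PE, so partial
    edge tiles leave the bottom rows and right columns idle.
    """
    usage = np.zeros((rows, cols), dtype=np.int64)
    for k, n in layers:
        for r0 in range(0, k, rows):          # tile along the reduction dimension
            for c0 in range(0, n, cols):      # tile along the output dimension
                h = min(rows, k - r0)         # rows actually occupied by this tile
                w = min(cols, n - c0)         # columns actually occupied
                usage[:h, :w] += 1
    return usage

# Hypothetical layer shapes, not taken from the thesis.
layers = [(27, 16), (144, 32), (288, 64), (576, 10)]
usage = pe_usage(layers, rows=32, cols=32)
imbalance = usage.max() / usage.min()
print(f"max tiles per PE: {usage.max()}, min: {usage.min()}, ratio: {imbalance:.2f}")
```

In this toy workload the PEs near the top-left corner are used in every tile, while the bottom-right PEs are used only by layers whose dimensions fill the array. Real CNN workloads and a cycle-accurate usage model would be needed to reproduce the 14x figure reported in the abstract.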

Abstract (English)

The systolic-array-based accelerator stands out as one of the most promising architectures for deep neural network (DNN) acceleration in data-intensive neural network computation. Nevertheless, the fixed size of the systolic array can result in significant workload imbalance, leading to considerable variation in the utilization of processing elements (PEs). The difference between the maximum and minimum PE usage can reach 14x. Furthermore, aging effects such as hot carrier injection (HCI) and bias temperature instability (BTI) can introduce timing errors or functional failures over time, particularly in PEs with high usage. Consequently, the lifespan of the entire systolic array can be shortened by aging effects on heavily utilized PEs. In response to these challenges, this thesis introduces an aging-aware systolic array design framework. The framework comprises two key components, a balance-loaded systolic array datapath design and an aging-aware data mapping policy, and aims to alleviate aging-induced accuracy degradation. Experiments show that, after ten years of aging, our system achieves a prediction accuracy improvement of 60.7% over a traditional systolic-array-based AI accelerator. At the same time, our approach increases utilization by a factor of up to 2.4.
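
The idea behind an aging-aware data mapping policy can be sketched as a simple wear-leveling heuristic. This is not the policy developed in the thesis, whose details are not given in this record; the functions `place_tile` and `run_schedule`, the stress metric (accumulated active cycles), and the assumption that the hardware can place a tile at any offset are all hypothetical.

```python
import numpy as np

def place_tile(stress, h, w):
    """Return the (row, col) offset whose h-by-w window has the least
    accumulated stress. Ties go to the first window found."""
    rows, cols = stress.shape
    best, best_cost = None, None
    for r in range(rows - h + 1):
        for c in range(cols - w + 1):
            cost = stress[r:r + h, c:c + w].sum()
            if best_cost is None or cost < best_cost:
                best, best_cost = (r, c), cost
    return best

def run_schedule(tiles, rows, cols):
    """Map each (h, w, cycles) tile onto the least-stressed window and
    accumulate its active cycles as a stand-in for aging stress."""
    stress = np.zeros((rows, cols), dtype=np.int64)
    for h, w, cycles in tiles:
        r, c = place_tile(stress, h, w)
        stress[r:r + h, c:c + w] += cycles
    return stress

# Hypothetical workload: four 2x2 tiles, 10 active cycles each, on a 4x4 array.
tiles = [(2, 2, 10)] * 4
stress = run_schedule(tiles, rows=4, cols=4)
print(stress)
```

With a fixed top-left anchor, all four tiles would stress the same four PEs. The greedy placement spreads them across the array instead, which is the general effect a load-balancing mapping aims for. A real policy would also have to account for dataflow constraints and a physical aging model such as BTI or HCI delay degradation.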

Keywords

★ Systolic array
★ Aging effect
★ AI accelerator

Table of Contents

Chinese Abstract
Abstract
Acknowledgments
Table of Contents
Table of Figures
Chapter 1 Introduction
1.1 Systolic Arrays
1.2 Reliability Issues of Systolic Array
1.3 Previous Works Review
1.4 Contributions
Chapter 2 Background
2.1 Systolic-array-based Accelerator
2.2 Weight Stationary Dataflow
2.3 Aging Effect on AI Accelerator
2.4 Related Works
Chapter 3 Aging-Aware Task Deployment Framework
3.1 Problem Formulation
3.2 The Concept Overview
3.3 Fission Systolic Array Hardware Design
3.4 Balance-Loaded Systolic Array Datapath Design
3.4.1 CNN Models Analyze
3.4.2 Sub-Mappings Grouping
3.4.3 Production of Candidates
3.5 Aging-aware Data Mapping Policy
Chapter 4 Experimental Results
4.1 Balance-Loaded Systolic Array Datapath Validation
4.2 Aging-Aware Data Mapping Policy Validation
4.3 Overhead Analysis
Chapter 5 Conclusions
References

Advisor

Jing-Yang Jou (周景揚)

Approval date

2024-7-16
