References
[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[2] X. Zhang, J. Zou, K. He, and J. Sun, “Accelerating very deep convolutional networks for classification and detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 10, pp. 1943–1955, 2016.
[3] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 779–788.
[4] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1–9.
[5] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 770–778.
[6] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” Computing Research Repository (CoRR), vol. abs/1704.04861, 2017.
[7] T. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, “DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning,” in Proceedings of International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2014, pp. 269–284.
[8] Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li, T. Luo, X. Feng, Y. Chen, and O. Temam, “ShiDianNao: Shifting vision processing closer to the sensor,” ACM SIGARCH Computer Architecture News, vol. 43, no. 3, pp. 92–104, Jun. 2015.
[9] Y. Chen, T. Krishna, J. S. Emer, and V. Sze, “Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks,” IEEE Journal of Solid-State Circuits, vol. 52, no. 1, pp. 127–138, Jan. 2017.
[10] N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers et al., “In-datacenter performance analysis of a tensor processing unit,” in Proceedings of ACM/IEEE International Symposium on Computer Architecture (ISCA), 2017, pp. 1–12.
[11] C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, and J. Cong, “Optimizing FPGA-based accelerator design for deep convolutional neural networks,” in Proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2015, pp. 161–170.
[12] M. Motamedi, P. Gysel, V. Akella, and S. Ghiasi, “Design space exploration of FPGA-based deep convolutional neural networks,” in Proceedings of Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2016, pp. 575–580.
[13] N. P. Jouppi, C. Young, N. Patil, D. A. Patterson, G. Agrawal, R. Bajwa et al., “In-datacenter performance analysis of a tensor processing unit,” in 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), 2017, pp. 1–12.
[14] G. Li, S. K. S. Hari, M. Sullivan, T. Tsai, K. Pattabiraman, J. Emer, and S. W. Keckler, “Understanding error propagation in deep learning neural network (DNN) accelerators and applications,” in SC17: International Conference for High Performance Computing, Networking, Storage and Analysis, 2017, pp. 1–12.
[15] J. J. Zhang, K. Basu, and S. Garg, “Fault-tolerant systolic array based accelerators for deep neural network execution,” IEEE Design & Test, vol. 36, no. 5, pp. 44–53, 2019.
[16] C. Liu, C. Chu, D. Xu, Y. Wang, Q. Wang, H. Li, X. Li, and K.-T. Cheng, “HyCA: A hybrid computing architecture for fault-tolerant deep learning,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 41, no. 10, pp. 3400–3413, 2022.
[17] R. Baumann, “Radiation-induced soft errors in advanced semiconductor technologies,” IEEE Transactions on Device and Materials Reliability, vol. 5, no. 3, pp. 305–316, 2005.
[18] K. Kang, S. Gangwal, S. P. Park, and K. Roy, “NBTI induced performance degradation in logic and memory circuits: How effectively can we approach a reliability solution?” in 2008 Asia and South Pacific Design Automation Conference, 2008, pp. 726–731.
[19] J. J. Zhang, T. Gu, K. Basu, and S. Garg, “Analyzing and mitigating the impact of permanent faults on a systolic array based neural network accelerator,” in 2018 IEEE 36th VLSI Test Symposium (VTS), 2018, pp. 1–6.
[20] M. A. Hanif and M. Shafique, “SalvageDNN: Salvaging deep neural network accelerators with permanent faults through saliency-driven fault-aware mapping,” Philosophical Transactions of the Royal Society A, vol. 378, 2019.
[21] L. H. Hoang, M. A. Hanif, and M. Shafique, “FT-ClipAct: Resilience analysis of deep neural networks and improving their fault tolerance using clipped activation,” CoRR, vol. abs/1912.00941, 2019.
[22] G. B. Hacene, F. Leduc-Primeau, A. B. Soussia, V. Gripon, and F. Gagnon, “Training modern deep neural networks for memory-fault robustness,” CoRR, vol. abs/1911.10287, 2019.
[23] R. E. Lyons and W. Vanderkulk, “The use of triple-modular redundancy to improve computer reliability,” IBM Journal of Research and Development, vol. 6, no. 2, pp. 200–209, 1962.
[24] T. G. Bertoa, G. Gambardella, N. J. Fraser, M. Blott, and J. McAllister, “Fault-tolerant neural network accelerators with selective TMR,” IEEE Design & Test, vol. 40, no. 2, pp. 67–74, 2023.
[25] J. Zhang, K. Rangineni, Z. Ghodsi, and S. Garg, “ThUnderVolt: Enabling aggressive voltage underscaling and timing error resilience for energy efficient deep learning accelerators,” in Proceedings of the 55th Annual Design Automation Conference, ser. DAC ’18. New York, NY, USA: Association for Computing Machinery, 2018.
[26] P. N. Whatmough, S. K. Lee, D. Brooks, and G.-Y. Wei, “DNN engine: A 28-nm timing-error tolerant sparse deep neural network processor for IoT applications,” IEEE Journal of Solid-State Circuits, vol. 53, no. 9, pp. 2722–2731, 2018.
[27] R. W. Hamming, “Error detecting and error correcting codes,” The Bell System Technical Journal, vol. 29, no. 2, pp. 147–160, 1950.
[28] M. Y. Hsiao, “A class of optimal minimum odd-weight-column SEC-DED codes,” IBM Journal of Research and Development, vol. 14, no. 4, pp. 395–401, 1970.
[29] M. Patel, J. S. Kim, H. Hassan, and O. Mutlu, “Understanding and modeling on-die error correction in modern DRAM: An experimental study using real devices,” in 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2019, pp. 13–25.
[30] K.-H. Huang and J. A. Abraham, “Algorithm-based fault tolerance for matrix operations,” IEEE Transactions on Computers, vol. C-33, no. 6, pp. 518–528, 1984.
[31] K. Cho, I. Lee, H. Lim, and S. Kang, “Efficient systolic-array redundancy architecture for offline/online repair,” Electronics, vol. 9, no. 2, 2020.
[32] I. Takanami and T. Horita, “A built-in circuit for self-repairing mesh-connected processor arrays by direct spare replacement,” in 2012 IEEE 18th Pacific Rim International Symposium on Dependable Computing, 2012, pp. 96–104.
[33] K. Zhao, S. Di, S. Li, X. Liang, Y. Zhai, J. Chen, K. Ouyang, F. Cappello, and Z. Chen, “FT-CNN: Algorithm-based fault tolerance for convolutional neural networks,” IEEE Transactions on Parallel and Distributed Systems, vol. 32, no. 7, pp. 1677–1689, 2021.
[34] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: A large-scale hierarchical image database,” in 2009 IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 248–255.
[35] Z. Xu and J. Abraham, “Safety design of a convolutional neural network accelerator with error localization and correction,” in 2019 IEEE International Test Conference (ITC), 2019, pp. 1–10.
[36] E. Ozen and A. Orailoglu, “Low-cost error detection in deep neural network accelerators with linear algorithmic checksums,” Journal of Electronic Testing, vol. 36, no. 6, pp. 703–718, 2020.
[37] M. Safarpour, R. Inanlou, and O. Silvén, “Algorithm level error detection in low voltage systolic array,” IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 69, no. 2, pp. 569–573, 2022.
[38] T. Marty, T. Yuki, and S. Derrien, “Safe overclocking for CNN accelerators through algorithm-level error detection,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 12, pp. 4777–4790, 2020.
[39] D. Filippas, N. Margomenos, N. Mitianoudis, C. Nicopoulos, and G. Dimitrakopoulos, “Low-cost online convolution checksum checker,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 30, no. 2, pp. 201–212, 2022.
[40] Z. Xu and J. Abraham, “Design of a safe convolutional neural network accelerator,” in 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2019, pp. 247–252.
[41] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[42] L. Deng, “The MNIST database of handwritten digit images for machine learning research [Best of the Web],” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141–142, 2012.
[43] Y. Zhao, K. Wang, and A. Louri, “FSA: An efficient fault-tolerant systolic array-based DNN accelerator architecture,” in 2022 IEEE 40th International Conference on Computer Design (ICCD), 2022, pp. 545–552.
[44] A. Saleh, J. Serrano, and J. Patel, “Reliability of scrubbing recovery-techniques for memory systems,” IEEE Transactions on Reliability, vol. 39, no. 1, pp. 114–122, 1990.