References
[1] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[2] D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, et al., “Mastering the game of Go without human knowledge,” Nature, vol. 550, no. 7676, pp. 354–359, 2017.
[3] K. Bong, S. Choi, C. Kim, S. Kang, Y. Kim, and H.-J. Yoo, “14.6 A 0.62mW ultra-low-power convolutional-neural-network face-recognition processor and a CIS integrated with always-on Haar-like face detector,” in Proceedings of IEEE International Solid-State Circuits Conference (ISSCC), 2017, pp. 248–249.
[4] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.
[5] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2016, pp. 770–778.
[6] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1–9.
[7] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “MobileNets: Efficient convolutional neural networks for mobile vision applications,” Computing Research Repository (CoRR), 2017. [Online]. Available: http://arxiv.org/abs/1704.04861
[8] S. Han, H. Mao, and W. J. Dally, “Deep Compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding,” Computing Research Repository (CoRR), 2015. [Online]. Available: http://arxiv.org/abs/1510.00149
[9] S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, and W. J. Dally, “EIE: Efficient inference engine on compressed deep neural network,” ACM SIGARCH Computer Architecture News, vol. 44, no. 3, pp. 243–254, 2016.
[10] S. Zhang, Z. Du, L. Zhang, H. Lan, S. Liu, L. Li, Q. Guo, T. Chen, and Y. Chen, “Cambricon-X: An accelerator for sparse neural networks,” in Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2016, pp. 1–12.
[11] D. Kim, J. Ahn, and S. Yoo, “ZeNA: Zero-aware neural network accelerator,” IEEE Design & Test, vol. 35, no. 1, pp. 39–46, 2018.
[12] S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran, B. Catanzaro, and E. Shelhamer, “cuDNN: Efficient primitives for deep learning,” Computing Research Repository (CoRR), 2014. [Online]. Available: http://arxiv.org/abs/1410.0759
[13] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al., “TensorFlow: A system for large-scale machine learning,” in Proceedings of USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2016, pp. 265–283.
[14] F. Chollet et al., “Keras,” 2015. [Online]. Available: https://keras.io
[15] C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao, and J. Cong, “Optimizing FPGA-based accelerator design for deep convolutional neural networks,” in Proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2015, pp. 161–170.
[16] M. Motamedi, P. Gysel, V. Akella, and S. Ghiasi, “Design space exploration of FPGA-based deep convolutional neural networks,” in Proceedings of Asia and South Pacific Design Automation Conference (ASP-DAC), Jan. 2016, pp. 575–580.
[17] J. Qiu, J. Wang, S. Yao, K. Guo, B. Li, E. Zhou, J. Yu, T. Tang, N. Xu, S. Song, et al., “Going deeper with embedded FPGA platform for convolutional neural network,” in Proceedings of ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA), 2016, pp. 26–35.
[18] T. Chen, Z. Du, N. Sun, J. Wang, C. Wu, Y. Chen, and O. Temam, “DianNao: A small-footprint high-throughput accelerator for ubiquitous machine-learning,” in Proceedings of International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2014, pp. 269–284.
[19] Z. Du, R. Fasthuber, T. Chen, P. Ienne, L. Li, T. Luo, X. Feng, Y. Chen, and O. Temam, “ShiDianNao: Shifting vision processing closer to the sensor,” ACM SIGARCH Computer Architecture News, vol. 43, no. 3, pp. 92–104, Jun. 2015.
[20] Y.-H. Chen, T. Krishna, J. S. Emer, and V. Sze, “Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks,” IEEE Journal of Solid-State Circuits, vol. 52, no. 1, pp. 127–138, Jan. 2017.
[21] B. Moons, R. Uytterhoeven, W. Dehaene, and M. Verhelst, “14.5 Envision: A 0.26-to-10 TOPS/W subword-parallel dynamic-voltage-accuracy-frequency-scalable convolutional neural network processor in 28nm FDSOI,” in Proceedings of IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2017, pp. 246–247.
[22] N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, A. Borchers, et al., “In-datacenter performance analysis of a tensor processing unit,” in Proceedings of ACM/IEEE International Symposium on Computer Architecture (ISCA), 2017, pp. 1–12.
[23] D. M. Loroch, F.-J. Pfreundt, N. Wehn, and J. Keuper, “TensorQuant: A simulation toolbox for deep neural network quantization,” in Proceedings of the Machine Learning on HPC Environments, 2017, pp. 1–8.
[24] B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko, “Quantization and training of neural networks for efficient integer-arithmetic-only inference,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 2704–2713.
[25] B. Moons, K. Goetschalckx, N. Van Berckelaer, and M. Verhelst, “Minimum energy quantized neural networks,” in Proceedings of Asilomar Conference on Signals, Systems, and Computers, Oct. 2017, pp. 1921–1925.
[26] D. Lin, S. Talathi, and S. Annapureddy, “Fixed point quantization of deep convolutional networks,” in Proceedings of the International Conference on Machine Learning (ICML), 2016, pp. 2849–2858.
[27] L. Yang and B. Murmann, “SRAM voltage scaling for energy-efficient convolutional neural networks,” in Proceedings of IEEE International Symposium on Quality Electronic Design (ISQED), 2017, pp. 7–12.
[28] L. Yang and B. Murmann, “Approximate SRAM for energy-efficient, privacy-preserving convolutional neural networks,” in Proceedings of IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2017, pp. 689–694.
[29] S. Venkataramani, A. Ranjan, K. Roy, and A. Raghunathan, “AxNN: Energy-efficient neuromorphic systems using approximate computing,” in Proceedings of IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED), 2014, pp. 27–32.
[30] Q. Zhang, T. Wang, Y. Tian, F. Yuan, and Q. Xu, “ApproxANN: An approximate computing framework for artificial neural network,” in Proceedings of Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015, pp. 701–706.
[31] V. Mrazek, Z. Vasicek, L. Sekanina, M. A. Hanif, and M. Shafique, “ALWANN: Automatic layer-wise approximation of deep neural network accelerators without retraining,” in Proceedings of IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2019, pp. 1–8.
[32] G. Bolt, “Investigating fault tolerance in artificial neural networks,” Department of Computer Science, University of York, Heslington, York, England, Tech. Rep. YCS 154, 1991.
[33] C.-T. Chin, K. Mehrotra, C. K. Mohan, and S. Ranka, “Training techniques to obtain fault-tolerant neural networks,” in Proceedings of IEEE International Symposium on Fault-Tolerant Computing, 1994, pp. 360–369.
[34] B. Reagen, P. Whatmough, R. Adolf, S. Rama, H. Lee, S. K. Lee, J. M. Hernández-Lobato, G.-Y. Wei, and D. Brooks, “Minerva: Enabling low-power, highly-accurate deep neural network accelerators,” in Proceedings of ACM/IEEE Annual International Symposium on Computer Architecture (ISCA), 2016, pp. 267–278.
[35] B. Reagen, U. Gupta, L. Pentecost, P. Whatmough, S. K. Lee, N. Mulholland, D. Brooks, and G.-Y. Wei, “Ares: A framework for quantifying the resilience of deep neural networks,” in Proceedings of ACM/ESDA/IEEE Design Automation Conference (DAC), 2018, pp. 1–6.
[36] B. Salami, O. S. Unsal, and A. C. Kestelman, “On the resilience of RTL NN accelerators: Fault characterization and mitigation,” in Proceedings of International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2018, pp. 322–329.
[37] L.-H. Hoang, M. A. Hanif, and M. Shafique, “FT-ClipAct: Resilience analysis of deep neural networks and improving their fault tolerance using clipped activation,” in Proceedings of Design, Automation & Test in Europe Conference & Exhibition (DATE), 2020, pp. 1241–1246.
[38] G. Li, S. K. S. Hari, M. Sullivan, T. Tsai, K. Pattabiraman, J. Emer, and S. W. Keckler, “Understanding error propagation in deep learning neural network (DNN) accelerators and applications,” in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017, p. 8.
[39] J. J. Zhang, T. Gu, K. Basu, and S. Garg, “Analyzing and mitigating the impact of permanent faults on a systolic array based neural network accelerator,” in Proceedings of IEEE 36th VLSI Test Symposium (VTS), 2018, pp. 1–6.
[40] M. A. Hanif and M. Shafique, “SalvageDNN: Salvaging deep neural network accelerators with permanent faults through saliency-driven fault-aware mapping,” Philosophical Transactions of the Royal Society A, vol. 378, no. 2164, p. 20190164, 2020.
[41] S. Kim, P. Howe, T. Moreau, A. Alaghi, L. Ceze, and V. S. Sathe, “Energy-efficient neural network acceleration in the presence of bit-level memory errors,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 65, no. 12, pp. 4285–4298, 2018.
[42] C. De Sio, S. Azimi, and L. Sterpone, “An emulation platform for evaluating the reliability of deep neural networks,” in Proceedings of IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), 2020, pp. 1–4.
[43] C. Torres-Huitzil and B. Girau, “Fault and error tolerance in neural networks: A review,” IEEE Access, vol. 5, pp. 17322–17341, 2017.
[44] S. Ioffe and C. Szegedy, “Batch Normalization: Accelerating deep network training by reducing internal covariate shift,” in Proceedings of the International Conference on Machine Learning (ICML), 2015, pp. 448–456.
[45] J.-C. Vialatte and F. Leduc-Primeau, “A study of deep learning robustness against computation failures,” Computing Research Repository (CoRR), 2017. [Online]. Available: http://arxiv.org/abs/1704.05396
[46] A. Bosio, P. Bernardi, A. Ruospo, and E. Sanchez, “A reliability analysis of a deep neural network,” in Proceedings of IEEE Latin American Test Symposium (LATS), 2019, pp. 1–6.
[47] H. Kwon, P. Chatarasi, M. Pellauer, A. Parashar, V. Sarkar, and T. Krishna, “Understanding reuse, performance, and hardware cost of DNN dataflow: A data-centric approach,” in Proceedings of IEEE/ACM International Symposium on Microarchitecture (MICRO), 2019, pp. 754–768.
[48] F. Chollet, “Deep learning models,” https://github.com/fchollet/deep-learning-models/releases, 2018.