References
[1] S. S. Farfade, M. Saberian, and L. J. Li, “Multi-view face detection using deep convolutional neural networks,” in Proc. ACM Int. Conf. Multimedia Retrieval (ICMR), Jun. 2015, pp. 643–650, doi: 10.1145/2671188.2749408.
[2] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jun. 2016, pp. 779–788, doi: 10.1109/CVPR.2016.91.
[3] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2012, pp. 1097–1105.
[4] “NEC face recognition and temperature sensing solution.” [Online]. Available: https://www.ankecare.com/article/797-20279.
[5] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. Int. Conf. Learn. Represent. (ICLR), San Diego, CA, USA, 2015.
[6] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” arXiv preprint arXiv:1512.03385, 2015.
[7] Y.-H. Chen, T. Krishna, J. Emer, and V. Sze, “Eyeriss: An energy-efficient reconfigurable accelerator for deep convolutional neural networks,” in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, Feb. 2016, pp. 262–263, doi: 10.1109/ISSCC.2016.7418007.
[8] S. Han et al., “EIE: Efficient inference engine on compressed deep neural network,” in Proc. ACM/IEEE 43rd Annu. Int. Symp. Comput. Archit. (ISCA), 2016, pp. 243–254.
[9] Z. Yuan et al., “Sticker: A 0.41–62.1 TOPS/W 8-bit neural network processor with multi-sparsity compatible convolution arrays and online tuning acceleration for fully connected layers,” in Proc. IEEE Symp. VLSI Circuits, Jun. 2018, pp. 33–34.
[10] D. Masters and C. Luschi, “Revisiting small batch training for deep neural networks,” arXiv preprint arXiv:1804.07612, 2018.
[11] M. A. Hussain and T. H. Tsai, “Memory access optimization for on-chip transfer learning,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 68, no. 4, pp. 1507–1519, Apr. 2021, doi: 10.1109/TCSI.2021.3055281.
[12] M. Long, Y. Cao, Z. Cao, J. Wang, and M. I. Jordan, “Transferable representation learning with deep adaptation networks,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 41, no. 12, pp. 3071–3085, Dec. 2019, doi: 10.1109/TPAMI.2018.2868685.
[13] A. Gepperth and S. A. Gondal, “Incremental learning with deep neural networks using a test-time oracle,” in Proc. Eur. Symp. Artif. Neural Netw., Comput. Intell. Mach. Learn. (ESANN), Apr. 2018, pp. 37–42.
[14] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[15] “Why is so much memory needed for deep neural networks?” [Online]. Available: https://www.graphcore.ai/posts/why-is-so-much-memory-needed-for-deep-neural-networks. [Accessed: 13-Jan-2020].
[16] “TensorFlow.” [Online]. Available: https://www.tensorflow.org/. [Accessed: 13-Jan-2020].
[17] “PyTorch.” [Online]. Available: https://pytorch.org/. [Accessed: 13-Jan-2020].
[18] A. Aimar et al., “NullHop: A flexible convolutional neural network accelerator based on sparse representations of feature maps,” IEEE Trans. Neural Netw. Learn. Syst., vol. 30, no. 3, pp. 644–656, Mar. 2019, doi: 10.1109/TNNLS.2018.2852335.
[19] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in Proc. Int. Conf. Learn. Represent. (ICLR), 2015.
[20] S. Choi, J. Sim, M. Kang, Y. Choi, H. Kim, and L. S. Kim, “An energy-efficient deep convolutional neural network training accelerator for in situ personalization on smart devices,” IEEE J. Solid-State Circuits, vol. 55, no. 10, pp. 2691–2702, Oct. 2020.
[21] D. Han, J. Lee, J. Lee, and H. J. Yoo, “A low-power deep neural network online learning processor for real-time object tracking application,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 66, no. 5, pp. 1794–1804, May 2019, doi: 10.1109/TCSI.2018.2880363.
[22] X. Chen, C. Gao, T. Delbruck, and S.-C. Liu, “EILE: Efficient incremental learning on the edge,” in Proc. IEEE 3rd Int. Conf. Artif. Intell. Circuits Syst. (AICAS), Jun. 2021, pp. 1–4, doi: 10.1109/AICAS51828.2021.9458554.
[23] “IEEE 754,” Wikipedia. [Online]. Available: https://zh.wikipedia.org/wiki/IEEE_754.
[24] A. Agrawal et al., “DLFloat: A 16-b floating point format designed for deep learning training and inference,” in Proc. IEEE 26th Symp. Comput. Arithmetic (ARITH), 2019, pp. 92–95, doi: 10.1109/ARITH.2019.00023.
[25] U. Köster et al., “Flexpoint: An adaptive numerical format for efficient training of deep neural networks,” in Proc. Adv. Neural Inf. Process. Syst. (NIPS), 2017.
[26] J. L. Gustafson and I. Yonemoto, “Beating floating point at its own game: Posit arithmetic,” Supercomput. Front. Innov., vol. 4, no. 2, pp. 71–86, 2017.
[27] S. Wang and P. Kanwar, “BFloat16: The secret to high performance on Cloud TPUs,” Google Cloud Blog, Aug. 2019. [Online]. Available: https://reurl.cc/43Z7qY.
[28] A. F. Agarap, “Deep learning using rectified linear units (ReLU),” arXiv preprint arXiv:1803.08375, 2018.
[29] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L.-C. Chen, “MobileNetV2: Inverted residuals and linear bottlenecks,” in Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), 2018, pp. 4510–4520, doi: 10.1109/CVPR.2018.00474.
[30] C. Chen, H. Ding, H. Peng, H. Zhu, Y. Wang, and C. J. R. Shi, “OCEAN: An on-chip incremental-learning enhanced artificial neural network processor with multiple gated-recurrent-unit accelerators,” IEEE J. Emerg. Sel. Topics Circuits Syst., vol. 8, no. 3, pp. 519–530, Sep. 2018, doi: 10.1109/JETCAS.2018.2852780.
[31] C.-H. Lu, Y.-C. Wu, and C.-H. Yang, “A 2.25 TOPS/W fully-integrated deep CNN learning processor with on-chip training,” in Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC), Nov. 2019, pp. 65–68, doi: 10.1109/A-SSCC47793.2019.9056967.
[32] N. N. Schraudolph, “A fast, compact approximation of the exponential function,” Neural Comput., vol. 11, no. 4, pp. 853–862, 1999.
[33] D. Kim, J. Kung, and S. Mukhopadhyay, “A power-aware digital multilayer perceptron accelerator with on-chip training based on approximate computing,” IEEE Trans. Emerg. Top. Comput., vol. 5, no. 2, pp. 164–178, Apr. 2017, doi: 10.1109/TETC.2017.2673548.
[34] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proc. IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998, doi: 10.1109/5.726791.
[35] D. Kalamkar et al., “A study of bfloat16 for deep learning training,” arXiv preprint arXiv:1905.12322, 2019.
[36] C. S. Turner, “A fast binary logarithm algorithm,” IEEE Signal Process. Mag., vol. 27, no. 5, Sep. 2010, doi: 10.1109/MSP.2010.937503.