References
[1] L. Lin, K. Wang, W. Zuo, M. Wang, J. Luo, and L. Zhang, “A deep structured model with radius–margin bound for 3D human activity recognition,” International Journal of Computer Vision, pp. 1-18, 2015.
[2] S. Ji, W. Xu, M. Yang and K. Yu, “3D Convolutional Neural Networks for Human Action Recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221-231, Jan, 2013.
[3] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar and L. Fei-Fei, “Large-Scale Video Classification with Convolutional Neural Networks,” IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, pp. 1725-1732, 2014.
[4] L. Pigou, A. Oord, S. Dieleman, M. Herreweghe, and J. Dambre, “Beyond Temporal Pooling: Recurrence and Temporal Convolutions for Gesture Recognition in Video,” arXiv preprint arXiv:1506.01911, 2015.
[5] W. McCulloch and W. Pitts, “A logical calculus of the ideas immanent in nervous activity,” The Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115-133, 1943.
[6] D. Hebb, “The Organization of Behavior: A Neuropsychological Theory,” New York: Wiley, 1949.
[7] F. Rosenblatt, “The perceptron: A probabilistic model for information storage and organization in the brain,” Psychological Review, vol. 65, no. 6, pp. 386-408, Nov 1958.
[8] M. Minsky and S. Papert, “Perceptrons,” MIT Press, 1969.
[9] D. Rumelhart, G. Hinton, and R. Williams, “Learning representations by back-propagating errors,” Neurocomputing: Foundations of Research, James A. Anderson and Edward Rosenfeld (Eds.), MIT Press, Cambridge, MA, USA, pp. 696-699, 1988.
[10] M. Minsky and S. Papert, “Perceptrons: Expanded Edition,” MIT Press, Cambridge, MA, USA, 1988.
[11] D. Rumelhart, G. Hinton, and R. Williams, “Learning Internal Representations by Error Propagation,” Technical Report, Mar-Sep 1985.
[12] G. Hinton, S. Osindero, and Y. Teh, “A Fast Learning Algorithm for Deep Belief Nets,” Neural Computation, vol. 18, no. 7, pp. 1527-1554, 2006.
[13] G. Hinton and R. Salakhutdinov, “Reducing the Dimensionality of Data with Neural Networks,” Science, vol. 313, no. 5786, pp. 504-507, 2006.
[14] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov 1998.
[15] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, pp. 1-9, 2015.
[16] P. Werbos, “Backpropagation through time: what it does and how to do it,” Proceedings of the IEEE, vol. 78, no. 10, pp. 1550-1560, Oct 1990.
[17] I. Sutskever, O. Vinyals, and Q. Le, “Sequence to sequence learning with neural networks,” Advances in Neural Information Processing Systems, 2014.
[18] K. Cho, B. Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.
[19] R. O’Reilly, “Biologically Plausible Error-driven Learning using Local Activation Differences: The Generalized Recirculation Algorithm,” Neural Computation, vol. 8, no. 5, pp. 895-938, 1996.
[20] D. Ciresan, A. Giusti, L. Gambardella, and J. Schmidhuber, “Deep neural networks segment neuronal membranes in electron microscopy images,” Advances in neural information processing systems, 2012.
[21] A. Karpathy and L. Fei-Fei, “Deep visual-semantic alignments for generating image descriptions,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[22] W. Byeon, T. Breuel, F. Raue, and M. Liwicki, “Scene labeling with LSTM recurrent neural networks,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[23] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[24] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A neural image caption generator,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[25] J. Donahue, L. Hendricks, S. Guadarrama, M. Rohrbach, S. Venugopalan, K. Saenko, and T. Darrell, “Long-term recurrent convolutional networks for visual recognition and description,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[26] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” arXiv preprint arXiv:1409.0473, 2014.
[27] R. Girshick, “Fast R-CNN,” Proceedings of the IEEE International Conference on Computer Vision, 2015.
[28] S. Ren, K. He, R. Girshick, and J. Sun, “Faster R-CNN: Towards real-time object detection with region proposal networks,” Advances in Neural Information Processing Systems, 2015.
[29] S. Sharma, R. Kiros, and R. Salakhutdinov, “Action recognition using visual attention,” arXiv preprint arXiv:1511.04119, 2015.
[30] K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, R. Zemel, and Y. Bengio, “Show, attend and tell: Neural image caption generation with visual attention,” arXiv preprint arXiv:1502.03044, 2015.
[31] T. Brox, A. Bruhn, N. Papenberg, J. Weickert, “High accuracy optical flow estimation based on a theory for warping,” Computer Vision-ECCV 2004, Springer Berlin Heidelberg, pp. 25-36, 2004.
[32] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[33] M. Schuster and K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Transactions on Signal Processing, vol. 45, no. 11, pp. 2673-2681, 1997.
[34] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” arXiv preprint arXiv:1412.3555, 2014.
[35] J. Deng, W. Dong, R. Socher, L. Li, K. Li, and L. Fei-Fei, “Imagenet: A large-scale hierarchical image database,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.
[36] C. Ding and D. Tao, “Robust face recognition via multimodal deep face representation,” IEEE Transactions on Multimedia, vol. 17, no. 11, pp. 2049-2058, 2015.
[37] L. Pigou, S. Dieleman, P. Kindermans, and B. Schrauwen, “Sign language recognition using convolutional neural networks,” Workshop at the European Conference on Computer Vision, Springer International Publishing, 2014.
[38] S. Sukittanon, A. Surendran, J. Platt, and C. Burges, “Convolutional networks for speech detection,” Interspeech, 2004.
[39] O. Abdel-Hamid, A. Mohamed, H. Jiang, L. Deng, G. Penn, and D. Yu, “Convolutional neural networks for speech recognition,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 22, no. 10, pp. 1533-1545, 2014.
[40] Y. Wang and D. Wang, “Cocktail party processing via structured prediction,” Advances in Neural Information Processing Systems, 2012.
[41] Y. Wang and D. Wang, “Towards scaling up classification-based speech separation,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, no. 7, pp. 1381-1390, 2013.
[42] D. Ciresan, U. Meier, and J. Schmidhuber, “Multi-column deep neural networks for image classification,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
[43] Y. Taigman, M. Yang, M. Ranzato, and L. Wolf, “Deepface: Closing the gap to human-level performance in face verification,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014.
[44] D. Ciresan, A. Giusti, L. Gambardella, J. Schmidhuber, “Mitosis detection in breast cancer histology images with deep neural networks,” International Conference on Medical Image Computing and Computer-assisted Intervention, Springer Berlin Heidelberg, 2013.
[45] G. Hinton, L. Deng, D. Yu, G. Dahl, A. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. Sainath, and B. Kingsbury, “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82-97, 2012.
[46] J. Li, Y. Wei, X. Liang, J. Dong, T. Xu, J. Feng, and S. Yan, “Attentive Contexts for Object Detection,” arXiv preprint arXiv:1603.07415, 2016.
[47] J. Johnson, A. Karpathy, L. Fei-Fei, “Densecap: Fully convolutional localization networks for dense captioning,” arXiv preprint arXiv:1511.07571, 2015.
[48] K. He, X. Zhang, S. Ren, and J. Sun, “Spatial pyramid pooling in deep convolutional networks for visual recognition,” European Conference on Computer Vision, Springer International Publishing, 2014.
[49] P. Wang, Y. Cao, C. Shen, L. Liu, H. Shen, “Temporal pyramid pooling based convolutional neural networks for action recognition,” arXiv preprint arXiv:1503.01224, 2015.
[50] J. Ng, M. Hausknecht, S. Vijayanarasimhan, O. Vinyals, R. Monga, G. Toderici, “Beyond short snippets: Deep networks for video classification,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[51] H. Lee and H. Kwon, “Contextual Deep CNN Based Hyperspectral Classification,” arXiv preprint arXiv:1604.03519, 2016.
[52] P. Scovanner, S. Ali, and M. Shah, “A 3-dimensional sift descriptor and its application to action recognition,” Proceedings of the 15th ACM international conference on Multimedia, ACM, 2007.
[53] A. Klaser, M. Marszalek, and C. Schmid, “A spatio-temporal descriptor based on 3d-gradients,” BMVC 2008 - 19th British Machine Vision Conference, British Machine Vision Association, 2008.
[54] B. Nair and V. Asari, “Regression Based Learning of Human Actions from Video Using HOF-LBP Flow Patterns,” IEEE International Conference on Systems, Man, and Cybernetics, Manchester, pp. 4342-4347, 2013.
[55] C. Chen, R. Jafari, and N. Kehtarnavaz, “Action Recognition from Depth Sequences Using Depth Motion Maps-Based Local Binary Patterns,” IEEE Winter Conference on Applications of Computer Vision, Waikoloa, HI, pp. 1092-1099, 2015.
[56] N. Ikizler-Cinbis and S. Sclaroff, “Object, scene and actions: Combining multiple features for human action recognition,” European Conference on Computer Vision, Springer Berlin Heidelberg, 2010.
[57] J. Cho, M. Lee, and S. Oh, “Robust action recognition using local motion and group sparsity,” Pattern Recognition, vol. 47, no. 5, pp. 1813-1825, 2014.