References
[1] D. Wang and G. J. Brown, “Computational Auditory Scene Analysis: Principles, Algorithms, and Applications,” Piscataway, NJ, USA: IEEE Press, 2006.
[2] A. S. Bregman, “Auditory Scene Analysis,” MIT Press, Cambridge, MA, 1990.
[3] M. Slaney, “The History and Future of CASA,” in Speech Separation by Humans and Machines, pp. 199–211, Springer US, 2005.
[4] N. Sawhney, “Situational Awareness from Environmental Sounds,” Technical Report, Massachusetts Institute of Technology, 1997.
[5] D. Barchiesi, D. Giannoulis, D. Stowell, and M. D. Plumbley, “Acoustic Scene Classification,” IEEE Signal Processing Magazine, vol. 32, no. 3, pp. 16–34, May 2015.
[6] S. Sabour, N. Frosst, and G. E. Hinton, “Dynamic routing between capsules,” in Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS), pp. 3859–3869, 2017.
[7] Y. C. Wu, P. C. Chang, C. Y. Wang, and J. C. Wang, “Asymmetric Kernel Convolutional Neural Network for acoustic scenes classification,” in 2017 IEEE International Symposium on Consumer Electronics (ISCE), Kuala Lumpur, Malaysia, Nov. 2017.
[8] R. Stiefelhagen and J. Garofolo, Eds., “Multimodal Technologies for Perception of Humans: First International Evaluation Workshop on Classification of Events, Activities and Relationships, CLEAR 2006, Southampton, UK, April 6–7, 2006, Revised Selected Papers,” vol. 4122, Springer, 2007.
[9] D. Giannoulis, E. Benetos, D. Stowell, and M. D. Plumbley, “IEEE AASP CASA Challenge - Public Dataset for Scene Classification Task,” retrieved Jun. 29, 2017.
[10] D. Giannoulis, E. Benetos, D. Stowell, and M. D. Plumbley, “IEEE AASP CASA Challenge - Private Dataset for Scene Classification Task,” retrieved Jun. 29, 2017.
[11] D. Stowell et al., “Detection and classification of acoustic scenes and events,” IEEE Transactions on Multimedia, vol. 17, no. 10, pp. 1733–1746, 2015.
[12] A. Mesaros, T. Heittola, and T. Virtanen, “TUT database for acoustic scene classification and sound event detection,” in 2016 24th European Signal Processing Conference (EUSIPCO), pp. 1128–1132, 2016.
[13] A. Mesaros et al., “Detection and classification of acoustic scenes and events: Outcome of the DCASE 2016 challenge,” IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), vol. 26, no. 2, pp. 379–393, 2018.
[14] A. Mesaros et al., “DCASE 2017 challenge setup: Tasks, datasets and baseline system,” in DCASE 2017 Workshop on Detection and Classification of Acoustic Scenes and Events, 2017.
[15] A. Mesaros, T. Heittola, and T. Virtanen, “A multi-device dataset for urban acoustic scene classification,” in IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE), 2018.
[16] ETSI Standard Doc., “Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Front-End Feature Extraction Algorithm; Compression Algorithms,” ES 201 108, v1.1.3, Sep. 2003.
[17] ETSI Standard Doc., “Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Front-End Feature Extraction Algorithm; Compression Algorithms,” ES 202 050, v1.1.5, Jan. 2007.
[18] Librosa: an open source Python package for music and audio analysis, https://github.com/librosa, retrieved Dec. 1, 2016.
[19] Librosa: an open source Python package for music and audio analysis, https://github.com/librosa, retrieved Dec. 1, 2016.
[20] S. J. Russell and P. Norvig, “Artificial Intelligence: A Modern Approach,” Pearson Education Limited, 2016.
[21] W. S. McCulloch and W. Pitts, “A Logical Calculus of the Ideas Immanent in Nervous Activity,” Bulletin of Mathematical Biophysics, vol. 5, no. 4, pp. 115–133, Dec. 1943.
[22] D. O. Hebb, “The Organization of Behavior,” New York: Wiley & Sons, 1949.
[23] F. Rosenblatt, “The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain,” Cornell Aeronautical Laboratory, Psychological Review, vol. 65, no. 6, pp. 386–408, 1958.
[24] M. Minsky and S. Papert, “Perceptrons,” Cambridge, MA: MIT Press, 1969.
[25] N. Srivastava, G. E. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A Simple Way to Prevent Neural Networks from Overfitting,” Journal of Machine Learning Research, vol. 15, pp. 1929–1958, Jun. 2014.
[26] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-Based Learning Applied to Document Recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, Nov. 1998.
[27] I. Mrazova and M. Kukacka, “Hybrid convolutional neural networks,” in 6th IEEE International Conference on Industrial Informatics (INDIN), 2008.
[28] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” in International Conference on Learning Representations (ICLR), 2015.
[29] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9, 2015.
[30] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778, 2016.
[31] L. Deng, “The MNIST database of handwritten digit images for machine learning research [best of the web],” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 141–142, 2012.
[32] T. Tieleman, “affNIST,” https://www.cs.toronto.edu/~tijmen/affNIST/, 2013, retrieved May 8, 2018.
[33] F. Vesperini et al., “Polyphonic Sound Event Detection by Using Capsule Neural Networks,” IEEE Journal of Selected Topics in Signal Processing, 2019.
[34] TensorFlow: an open source Python package for machine intelligence, https://www.tensorflow.org, retrieved Dec. 1, 2016.
[35] J. Dean et al., “Large-Scale Deep Learning for Building Intelligent Computer Systems,” in Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pp. 1–1, Feb. 2016.
[36] A. Mesaros, T. Heittola, and T. Virtanen, “Metrics for polyphonic sound event detection,” Applied Sciences, vol. 6, no. 6, p. 162, 2016.
[37] S. Adavanne and T. Virtanen, “A report on sound event detection with different binaural features,” arXiv preprint arXiv:1710.02997, 2017.