References
[1] J. C. Wang, Y. H. Chin, B. W. Chen, C. H. Lin, and C. H. Wu, “Speech Emotion Verification Using Emotion Variance Modeling and Discriminant Scale-Frequency Maps,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 23, no. 10, pp. 1552–1562, 2015.
[2] M. J. Gangeh, P. Fewzee, A. Ghodsi, M. S. Kamel, and F. Karray, “Multiview Supervised Dictionary Learning in Speech Emotion Recognition,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 6, pp. 1056–1068, Jun. 2014.
[3] S. Lazebnik and M. Raginsky, “Supervised Learning of Quantizer Codebooks by Information Loss Minimization,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 7, pp. 1294–1309, Jul. 2009.
[4] X. C. Lian, Z. Li, C. Wang, B. L. Lu, and L. Zhang, “Probabilistic Models for Supervised Dictionary Learning,” in IEEE Conf. Comput. Vision Pattern Recognition (CVPR), 2010, pp. 2305–2312.
[5] J. Yang, K. Yu, and T. Huang, “Supervised Translation-Invariant Sparse Coding,” in IEEE Conf. Comput. Vision Pattern Recognition (CVPR), 2010, pp. 3517–3524.
[6] H. Zhang, Y. Zhang, and T. S. Huang, “Simultaneous Discriminative Projection and Dictionary Learning for Sparse Representation Based Classification,” Pattern Recognit., vol. 46, no. 1, pp. 346–354, 2013.
[7] Q. Zhang and B. Li, “Discriminative K-SVD for Dictionary Learning in Face Recognition,” in IEEE Conf. Comput. Vision Pattern Recognition (CVPR), 2010, pp. 2691–2698.
[8] Z. Jiang, Z. Lin, and L. S. Davis, “Learning a Discriminative Dictionary for Sparse Coding via Label Consistent K-SVD,” in IEEE Conf. Comput. Vision Pattern Recognition (CVPR), 2011, pp. 1697–1704.
[9] W. Liu, Z. Yu, M. Yang, L. Lu, and Y. Zou, “Joint kernel dictionary and classifier learning for sparse coding via locality preserving K-SVD,” in Proc. IEEE Int. Conf. Multimed. Expo, vol. 2015-August, 2015.
[10] X. He and P. Niyogi, “Locality Preserving Projections,” in Proc. Conf. Advances Neural Inform. Process. Syst., 2003, pp. 153–160.
[11] Y. Zhou, J. Gao, and K. E. Barner, “Locality Preserving KSVD for Nonlinear Manifold Learning,” in IEEE Int. Conf. Acoust., Speech, and Signal Process. (ICASSP), 2013, pp. 3372–3376.
[12] T. Komatsu, Y. Senda, and R. Kondo, “Acoustic Event Detection Based on Non-negative Matrix Factorization with Mixtures of Local Dictionaries and Activation Aggregation,” in IEEE Int. Conf. Acoust., Speech, and Signal Process. (ICASSP), 2016, pp. 2259–2263.
[13] A. Mesaros, T. Heittola, O. Dikmen, and T. Virtanen, “Sound Event Detection in Real Life Recordings Using Coupled Matrix Factorization of Spectral Representations and Class Activity Annotations,” in IEEE Int. Conf. Acoust., Speech, and Signal Process. (ICASSP), 2015, pp. 151–155.
[14] Z. Wu, E. S. Chng, and H. Li, “Joint nonnegative matrix factorization for exemplar-based voice conversion.”
[15] S. Fu, P. Li, Y. Lai, C. Yang, L. Hsieh, and Y. Tsao, “Joint Dictionary Learning-Based Non-Negative Matrix Factorization for Voice Conversion to,” vol. 64, no. 11, pp. 2584–2594, 2017.
[16] H. Lee, A. Battle, R. Raina, and A. Y. Ng, “Efficient sparse coding algorithms,” Adv. Neural Inf. Process. Syst., pp. 801–808, 2006.
[17] K. Gregor and Y. LeCun, “Learning Fast Approximations of Sparse Coding,” Vision, Image Signal Process. IEE Proc. -, vol. 152, no. 3, pp. 318–326, 2005.
[18] L. Zhang, M. Yang, and X. Feng, “Sparse representation or collaborative representation: Which helps face recognition?,” Proc. IEEE Int. Conf. Comput. Vis., pp. 471–478, 2011.
[19] Z. Zhang, Y. Xu, J. Yang, X. Li, and D. Zhang, “A Survey of Sparse Representation: Algorithms and Applications,” IEEE Access, vol. 3, pp. 490–530, 2015.
[20] I. S. Dhillon and S. Sra, “Generalized nonnegative matrix approximations with Bregman divergences,” in Advances in neural information processing systems 18, 2005.
[21] R. Tandon and S. Sra, “Sparse nonnegative matrix approximation: new formulations and algorithms,” Tech Report No. 193, Max-Planck, 2010.
[22] K. Jeong, J. Song, and H. Jeong, “NMF Features for Speech Emotion Recognition,” in Proceedings of the 2009 International Conference on Hybrid Information Technology, 2009, pp. 368–374.
[23] K. Jeong, J. Song, and H. Jeong, “Spectral Analysis for Emotion Recognition by NMF Features,” in 2009 Fifth International Conference on Natural Computation, 2009, vol. 5, pp. 121–125.
[24] S.-Y. Lee, H.-A. Song, and S. Amari, “A new discriminant NMF algorithm and its application to the extraction of subtle emotional differences in speech,” Cogn. Neurodyn., vol. 6, no. 6, pp. 525–535, Dec. 2012.
[25] D. Kim, S. Y. Lee, and S. I. Amari, “Representative and discriminant feature extraction based on NMF for emotion recognition in speech,” in Lecture Notes in Computer Science, vol. 5863, part 1, C. S. Leung, M. Lee, and J. H. Chan, Eds. Berlin, Heidelberg: Springer, 2009, pp. 649–656.
[26] P. Song, S. Ou, W. Zheng, Y. Jin, and L. Zhao, “Speech emotion recognition using transfer non-negative matrix factorization,” in 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016, pp. 5180–5184.
[27] Z. Wu, E. Chng, and H. Li, “Joint nonnegative matrix factorization for exemplar-based voice conversion,” Multimed. Tools Appl., vol. 74, 2014.
[28] L. Zhang, G. Bao, Y. Luo, and Z. Ye, “Monaural Speech Enhancement Using Joint Dictionary Learning with Cross-Coherence Penalties,” in Proc. 2015 8th Int. Symp. Comput. Intell. Des. (ISCID), vol. 2, pp. 518–522, 2016.
[29] J. Sadasivan, S. Mukherjee, and C. S. Seelamantula, “Joint dictionary training for bandwidth extension of speech signals,” in IEEE Int. Conf. Acoust., Speech, and Signal Process. (ICASSP), 2016, pp. 5925–5929.
[30] Y. K. Yılmaz and A. T. Cemgil, “Generalised Coupled Tensor Factorisation,” Adv. Neural Inf. Process. Syst., pp. 2151–2159, 2011.
[31] D. Cai, X. He, J. Han, and T. S. Huang, “Graph Regularized Nonnegative Matrix Factorization for Data Representation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 8, pp. 1548–1560, 2011.
[32] J. Wang, J. Yang, K. Yu, F. Lv, and T. Huang, “Locality-constrained Linear Coding for Image Classification,” in IEEE Conf. Comput. Vision Pattern Recognition (CVPR), 2010, pp. 3360–3367.
[33] L. Fei-Fei, R. Fergus, and P. Perona, “Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories,” vol. 106, pp. 59–70, 2007.
[34] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, “Robust Face Recognition via Sparse Representation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210–227, Feb. 2009.
[35] S. Lazebnik and C. Schmid, “Beyond bags of features: spatial pyramid matching for recognizing natural scene categories,” in Proc. IEEE Conf. Comput. Vision and Pattern Recognition (CVPR), 2006, pp. 2169–2178.
[36] J. C. Lin, C. H. Wu, and W. L. Wei, “Error Weighted Semi-Coupled Hidden Markov Model for Audio-Visual Emotion Recognition,” IEEE Trans. Multimed., vol. 14, no. 1, pp. 142–156, Feb. 2012.
[37] M. R. Schädler and B. Kollmeier, “Separable Spectro-temporal Gabor Filter Bank Features: Reducing the Complexity of Robust Features for Automatic Speech Recognition,” J. Acoust. Soc. Am., vol. 137, no. 4, pp. 2047–2059, 2015.
[38] Z. Jiang, Z. Lin, and L. S. Davis, “Label consistent K-SVD: Learning a discriminative dictionary for recognition,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 35, no. 11, pp. 2651–2664, 2013.
[39] Y. S. Lee, C. Y. Wang, S. Mathulaprangsan, J. H. Zhao, and J. C. Wang, “Locality-preserving K-SVD Based Joint Dictionary and Classifier Learning for Object Recognition,” in Proc. ACM Multimedia Conf., 2016, pp. 481–485.
[40] R. Rubinstein, M. Zibulevsky, and M. Elad, “Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit,” CS Tech., pp. 1–15, 2008.
[41] R. Hennequin, “NMF-matlab,” 2015. [Online]. Available: https://github.com/romi1502/NMF-matlab
[42] I. Luengo, E. Navas, I. Hernáez, and J. Sánchez, “Automatic Emotion Recognition using Prosodic Parameters,” in Proc. of INTERSPEECH, 2005, pp. 493–496.