參考文獻 |
[1] D. A. Reynolds and R. C. Rose, “Robust text-independent speaker identification using Gaussian mixture models,” IEEE Trans. Speech Audio Process., vol. 3, no. 1, pp. 72–83, Jan. 1995.
[2] B. L. Pellom and J. H. L. Hansen, “An efficient scoring algorithm for Gaussian mixture model based speaker identification,” IEEE Signal Process. Lett., vol. 5, no. 11, pp. 281–284, Nov. 1998.
[3] W. M. Campbell, J. P. Campbell, D. A. Reynolds, E. Singer, and P. A. Torres-Carrasquillo, “Support vector machines for speaker and language recognition,” Comput. Speech Lang., vol. 20, pp. 210–229, 2006.
[4] J. L. Gauvain and C. H. Lee, “Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains,” IEEE Trans. Speech Audio Process., vol. 2, no. 2, pp. 291–298, 1994.
[5] M. R. Hasan, M. Jamil, M. G. Rabbani, and M. S. Rahman, “Speaker identification using Mel frequency cepstral coefficients,” 3rd international Conference on Electrical & Computer Engineering ICECE 2004, 28-30 December 2004, Dhaka, Bangladesh.
[6] H. Hermansky. “Perceptual Linear Predictive (PLP) Analysis of Speech,” Journal of the Acoust. Society ofAmer., 87: 1738- 1752, April, 1990.
[7] T. Kinnunen, V. Hautamäki, and P. Fränti, “Fusion of spectral feature sets for accurate speaker identification,” in Proc. 9th Conf. Speech Comput., St. Petersburg, Russia, 2004, pp. 361–365.
[8] W. Campbell, D. Sturim, D. Reynolds, and A. Solomonoff, “SVM based speaker verification using a GMM supervector kernel and nap variability compensation,” in Proc. ICASSP, Toulouse, France, 2006, pp. 97–100.
[9] W. Campbell, D. Sturim, and D. Reynolds, “Support vector machines using GMM supervectors for speaker verification,” IEEE Signal Process. Lett., vol. 13, no. 5, pp. 308–311, May 2006.
[10] T. Kinnunen and H. Li, “An overview of text-independent speaker recognition: From features to supervectors,” Speech Commun., vol. 52, no. 1, pp. 12–40, 2010.
[11] B. G. B. Fauve, D. Matrouf, N. Scheffer, J.-F. Bonastre, and J. S. D. Mason, “State-of-the-art performance in text-independent speaker verification through open-source software,” IEEE Trans. Audio, Speech, Lang. Process., vol. 15, no. 7, pp. 1960–1968, 2007.
[12] C. H. You, K. A. Lee, and H. Li, “An SVM kernel with GMM-supervector based on the Bhattacharyya distance for speaker recognition,” IEEE Signal Process. Lett., vol. 16, no. 1, pp. 49–52, Jan. 2009.
[13] T. Kailath. The divergence and bhattacharyya distance measures in signal selection. IEEE Transactions on Communications Technology, 15(1):52–60, 1967.
[14] Chang Huai You, Kong Aik Lee, and Haizhou Li, “A gmm supervector kernel with the bhattacharyya distance for svm based speaker recognition,” in Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on, april 2009, pp. 4221 –4224.
[15] A. Solomonoff, W. M. Campbell, and C. Quillen, “Channel compensation for SVM speaker recognition,” in Proc. Odyssey04, 2004, pp. 57–62.
[16] O. Glembek, L. Burget, N. Brummer, and P. Kenny, “Comparison of scoring methods used in speaker recognition with joint factor analysis,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Taipei, Taiwan, Apr. 2009, pp. 4057–4060.
[17] P. Kenny, P. Ouellet, N. Dehak, V. Gupta, and P. Dumouchel, “A study of interspeaker variability in speaker verification,” IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 5, pp. 980–988, Jul. 2008.
[18] A. Kanagasundaram, R. Vogt, D. Dean, S. Sridharan, and M. Mason, “i-vector based Speaker Recognition on Short Utterances,” in Interspeech, 2011.
[19] P. Matejka, O. Glembek, F. Castaldo, O. Plchot, P. Kenny, L. Burget, and J. Cernocky, “Full-covariance ubm and heavy-tailed plda in i-vector speaker verification,” Proc. ICASSP ’11, pp. 4828–4831, 2011.
[20] J.M.K. Kua, J. Epps, E. Ambikairajah, “i-vector with sparse representation classification for speaker verification, ” Speech Commun, 2013.
[21] K. Huang and S. Aviyente, “Sparse Representation for Signal Classification,” Neural Information Processing Systems, 2006.
[22] J. M. K. Kua, E. Ambikairajah, J. Epps, and R. Togneri, “Speaker verification using sparse representation classification,” in Proc. ICASSP, May 2011, pp. 4548–4551.
[23] R. Saeidi, A. Hurmalainen, T. Virtanen, and D. A. van Leeuwen, “Exemplar-based Sparse Representation and Sparse Discrimination for Noise Robust Speaker Identification,” in Odyssey speaker and language recognition workshop, Singapore, 2012.
[24] N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, “Front-end factor analysis for speaker verification,” Audio, Speech, and Language Processing, IEEE Transactions on, vol. PP, no. 99, 2010.
[25] “The NIST year 2005 speaker recognition evaluation plan,” 2008. [Online]. Available: http://www.nist.gov
[26] D. A. Reynolds, T. F. Quatieri, and R. Dunn, “Speaker verification using adapted Gaussian mixture models,” Dig. Signal Process., vol. 10, no. 1-3, pp. 19–41, 2000.
[27] D. A. Reynolds, T. F. Quatieri, and R. Dunn, “Speaker verification using adapted Gaussian mixture models,” Dig. Signal Process., vol. 10, no. 1-3, pp. 19–41, 2000.
[28] T. Hasan and J. H. L. Hansen, “Factor analysis of acoustic features using a mixture of probabilistic principal component analyzers for robust speaker verification,” in Proc. Odyssey, Singapore, Jun. 2012.
[29] M. Tipping and C. Bishop, “Mixtures of probabilistic principal component analyzers,” Neural Computation, vol. 11, no. 2, pp. 443–482, 1999.
[30] 李孟穎,「感知因素分析法應用於語音強化」,成功大學資訊工程學系博士論文,2004年。
[31] M.S. Bartlett, “Tests of significance in factor analysis,” British Journal of Psychology, Statistical Section 3, 77–85, 1950
[32] J. Wright, A. Ganesh, S. Rao, and Y. Ma. Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization. Submitted to the Journal of the ACM, 2009.
[33] C. F. Chen, C. P. Wei, and Y.C. F. Wang, “Low-rank matrix recovery with structural incoherence for robust face recognition,” in Proc. IEEE Conf. Comput. Vis. Patt. Recogn. (CVPR), Providence, RI, USA, Jun. 2012, pp. 2618–2625.
[34] L. Zhang, W. D. Zhou, P. C. Chang, J. Liu, Z. Yan, T. Wang, and F. Z. Li, “Kernel sparse representation-based classifier,” IEEE Trans. Signal Processing, vol. 60, no. 4, pp. 1684–1695, Apr. 2012.
[35] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol.52, no. 4, pp. 1289–1306, Apr. 2004.
[36] A. Carmi, P. Gurfil, D. Kanevsky, and B. Ramabhadran, “ABCS: Approximate Bayesian Compressed Sensing,” Tech. Rep., Human Language Technologies, IBM, 2009.
[37] T. N. Sainath, A. Carmi, D. Kanevsky, and B. Ramabhadran, “Bayesian compressive sensing for phonetic classification,” in Proc. Int. Conf. Audio, Speech, Signal Process., 2010, pp. 4370–4373.
[38] A. Kanagasundaram, D. Dean, R. Vogt, M. McLaren, S. Sridharan, M. Mason, “Weighted LDA techniques for i-vector based speaker verification,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 4781–4784, 2012.
[39] S. Mikat, G. Fitscht, J. Weston!, B. Scholkopft, and K.-R. Mullert, “Fisher discriminant analysis with kernels,” in Proc. 1999 IEEE Signal Processing Society Workshop on Neural Networks for Signal Processing, Madison, Wisconsin, United States, 1999, Aug. 23–25, pp. 41–48.
[40] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition,” in Proc. 27th Annu. Asilomar Conf. Signals Syst. Comput., Nov. 1993, vol. 1, pp. 40–44.
[41] S. Mallat, Z. Zhang, “Adaptive time-frequency decomposition with matching pursuits”. IEEE-SP International Symposium on Time- Frequency and Time-Scale Analysis, pp.7–10, 1992.
[42] S. Ji, Y. Xue, and L. Carin, “Bayesian compressive sensing,” IEEE Trans. Signal Process., vol. 56, pp. 2346–2356, 2008. |