參考文獻 |
[1] S. Wegmann, P. Zhan, and L. Gillick, “Progress in broadcast news transcription at dragon systems,” IEEE International Conference on Acoustics, Speech, Signal Processing, vol. 1, pp. 33-36, Mar 1999.
[2] Z. Zhang , S. Furui , and K. Ohtsuki, “On-line incremental speaker adaptation for broadcast news transcription,” Speech Communication, vol. 37, no. 3-4, pp. 271-281, July 2002.
[3] J. Gauvain, L. Lamel, and G. Adda, “The LIMSI broadcast news transcription system,” Speech Communication, vol. 37, no. 1-2, pp. 89-108, 2002.
[4] K. Mori and S. Nakagawa, “Speaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition,” IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 413-416, May 2001.
[5] R. Huang and J. H. L. Hansen, “Advances in unsupervised audio classification and segmentation for the broadcast news and NGSW corpora,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 3, pp. 907-919, May 2006.
[6] K. Park, Jeong-sik Park, and Y. H. Oh, “GMM adaptation based online speaker segmentation for spoken document retrieval,” IEEE Transactions on Consumer Electronics, vol.56, no.2, pp.1123-1129, May 2010.
[7] L. Couvreur and J.M. Boite, “Speaker tracking in broadcast audio material in the framework of the THISL project,” Workshop Accessing Information in Spoken Audio, pp. 84-89, 1999.
[8] L. Lu and H. J. Zhang, “Speaker change detection and tracking in real-time news broadcasting analysis,” 10th ACM International Conference on Multimedia, pp. 602-610, Dec. 2002.
[9] L. Lu and H. J. Zhang “Unsupervised speaker segmentation and tracking in real-time audio content analysis,” Multimedia Systems , vol. 10, no. 4, pp. 332-343, April 2005
[10] B. W. Zhou, and John H. L. Hansen, “Unsupervised audio stream segmentation and clustering via the Bayesian Information criterion,“ International Conference Spoken Language Processing , vol.1, pp. 714-717, 2000.
[11] A. Tritschler and R. Gopinath, “Improved speaker segmentation and segments clustering using the Bayesian Information Criterion,” European Conference Speech Communication Technology, pp.679-682, 1999.
[12] M. Siegler, U. Jain, B.Raj, and R. Stern, “Automatic segmentation, classification and clustering of broadcast news audio,” DARPA Speech Recognition Workshop, pp. 97-99, Feb 1997.
[13] M. Cettolo, “Segmentation, classification and clustering of an Italian broadcast news corpus,“ Sixth RIAO-Content-Based Multimedia Information Access Conference, pp. 281-372, 2000.
[14] T. Kemp, M. Schmidt, M. Westphal, and A. Waibel, “Acoustics, strategies for automatic segmentation of audio data,” IEEE International Conference Acoustics, Speech, Signal Process, vol. 3, pp. 1423-1426, June 2000.
[15] S. Meignier, J.-F. Bonastre, and S. Igounet, “E-HMM approach for learning and adapting sound models for speaker indexing,” Speaker Odyssey—The Speaker Recognition Workshop, pp. 175-180, 2001.
[16] D. Moraru, S. Meignier, C. Fredouille, L. Besacier, and J.F. Bonastre, “The ELISA consortium approaches in broadcast news speaker segmentation during the NIST 2003 rich transcription evaluation,” IEEE International Conference Acoustics, Speech, and Signal Processing, vol. 1, pp. I-373-I-376, May 2004.
[17] S. Chen and P. Gopalakrishnan, “Speaker, environment and channel change detection and clustering via the Bayesian information criterion,” DARPA Broadcast News Transcription Understanding Workshop, pp. 127-132, 1998.
[18] S.S. Cheng and H.M. Wang, “A sequential metric-based audio segmentation method via the Bayesian information criterion,” European Conference Speech Communication and Technology , pp. 945-948, 2003.
[19] G. Schwarz, “Estimating the dimension of a model,“ The Annals of Statistics, vol. 6, no. 2, pp. 461-464, 1978.
[20] J. W. Hung, H. M. Wang, and L. S. Lee, “Automatic metric-based speech segmentation for broadcast news via principal component analysis“, 2000 International Conference on Spoken Language Processing, 1998.
[21] J. F. Bonastre, P. Delacourt, C. Fredouille, T. Merlin, and C. Wellekens, “A speaker tracking system based on speaker turn detection for NIST evaluation,” IEEE International Conference Acoustics, Speech, Signal Process, vol. 2, pp. 1177-1180, 2000.
[22] D. Liu and F. Kubala, “Fast speaker change detection for broadcast news transcription and indexing,” European Conference Speech Communication and Technology, pp. 1031-1034, Sept. 1999.
[23] P. Delacourt and C. J. Wellekens, “DISTBIC: a speaker-based segmentation for audio data indexing,” Speech Communication, vol. 32, no. 1-2, pp. 111-126, Sept. 2000.
[24] Mohamed Kamal Omar, Upendra Chaudhari, Ganesh Ramaswamy, “Blind change detection for audio segmentation,” IEEE International Conference on Acoustics, Speech, and Signal Processing, vol.1, no., pp. 501- 504, March 18-23, 2005.
[25] Michele Basseville and Igor V. Nikiforov, Detection of Abrupt Changes: Theory and Application, Prentice-Hall, Inc. Upper Saddle River, NJ, USA 1993.
[26] V. N. Vapnik, “An overview of statistical learning theory,” IEEE Transactions on Neural Networks, vol. 10, pp 988-999, 1999.
[27] S. G, Mallat and Zhifeng Zhang, “Matching pursuits with time-frequency dictionaries”, IEEE Transactions on Signal Processing, vol. 41, no.12, pp.3397-3415,1993.
[28] 王小川,語音訊號處理,修訂版,全華圖書股份有限公司,台北縣,民國96年。
[29] 蘇峻慶,錄音資料中語者切割與分群方法之研究, 清華大學 , 碩士論文 , 民國94年。
[30] S. S. Cheng, H. M. Wang, and H. C.Fu, “BIC-based speaker segmentation using divide-and-conquer strategies with application to speaker diarization,” IEEE Transactions on Audio, Speech and Language Processing, vol. 18,pp. 141 - 157 , JAN. 2010.
|