References
[1] S. Lee, K. Song, and J. Choi, “Access to an automated security system using gesture-based passwords,” Proc. 2012 15th Int. Conf. Network-Based Inf. Syst. NBIS 2012, pp. 760–765, 2012.
[2] M. A. B. Sarijari, R. A. Rashid, M. R. A. Rahim, and N. H. Mahalin, “Wireless home security and automation system utilizing ZigBee based multi-hop communication,” Proc. IEEE 2008 6th Natl. Conf. Telecommun. Technol. IEEE 2008 2nd Malaysia Conf. Photonics, NCTT-MCP 2008, pp. 242–245, 2008.
[3] Z. Zhou, G. Zhao, X. Hong, and M. Pietikäinen, “A review of recent advances in visual speech decoding,” Image Vis. Comput., vol. 32, no. 9, pp. 590–605, Sep. 2014.
[4] G. Potamianos, C. Neti, and A. W. Senior, “Recent advances in the automatic recognition of audiovisual speech,” Proc. IEEE, vol. 91, no. 9, 2003.
[5] G. Potamianos, C. Neti, and I. Matthews, “Audio-visual automatic speech recognition: an overview,” Issues in Visual and Audio-Visual Speech Processing, 2004.
[6] G. Zhao, M. Barnard, and M. Pietikäinen, “Lipreading with local spatiotemporal descriptors,” IEEE Trans. Multimed., vol. 11, no. 7, pp. 1254–1265, 2009.
[7] E. Gomez, C. M. Travieso, J. C. Briceno, and M. A. Ferrer, “Biometric identification system by lip shape,” Proc. 36th Annu. Int. Carnahan Conf. Secur. Technol., pp. 39–42, 2002.
[8] Y. Lan, R. Harvey, B. Theobald, E. Ong, and R. Bowden, “Comparing visual features for lipreading,” in Auditory-Visual Speech Processing (AVSP), 2009.
[9] D. Bordencea, H. Valean, S. Folea, and A. Dobircau, “Agent based system for home automation, monitoring and security,” 2011 34th Int. Conf. Telecommun. Signal Process. TSP 2011 - Proc., pp. 165–169, 2011.
[10] W. L. Ng, C. K. Ng, N. K. Noordin, and B. Mohd. Ali, “Gesture based automating household appliances,” Lect. Notes Comput. Sci., vol. 6762, part 2, pp. 285–293, 2011.
[11] A. K. Gnanasekar, P. Jayavelu, and V. Nagarajan, “Speech recognition based wireless automation of home loads with fault identification for physically challenged,” 2012 Int. Conf. Commun. Signal Process. ICCSP-2012, pp. 128–132, 2012.
[12] S. Cox, R. Harvey, Y. Lan, J. Newman, and B. Theobald, “The challenge of multispeaker lip-reading,” Int. Conf. Audit. Vis. Speech Process., 2008.
[13] G. Zhao and M. Pietikäinen, “Local binary pattern descriptors for dynamic texture recognition,” 18th Int. Conf. Pattern Recognit., vol. 2, pp. 18–21, 2006.
[14] P. A. Crook, V. Kellokumpu, G. Zhao, and M. Pietikäinen, “Human activity recognition using a dynamic texture based method,” Proc. Br. Mach. Vis. Conf. 2008, pp. 88.1–88.10, 2008.
[15] I. T. Jolliffe, Principal Component Analysis, 2nd ed. New York: Springer-Verlag, 2002.
[16] H. Yu and J. Yang, “A direct LDA algorithm for high-dimensional data with application to face recognition,” Pattern Recognit., vol. 34, pp. 2067–2070, 2001.
[17] Z.-Q. Zhao, H. Glotin, Z. Xie, J. Gao, and X. Wu, “Cooperative sparse representation in two opposite directions for semi-supervised image annotation,” IEEE Trans. Image Process., vol. 21, no. 9, pp. 4218–4231, Sep. 2012.
[18] P. Comon, “Independent component analysis, a new concept?,” Signal Processing, vol. 36, no. 3, pp. 287–314, 1994.
[19] S. Tsuge, M. Shishibori, S. Kuroiwa, and K. Kita, “Dimensionality reduction using non-negative matrix factorization for information retrieval,” 2001 IEEE Int. Conf. Syst. Man Cybern. e-Systems e-Man Cybern. Cybersp., vol. 2, pp. 960–965, 2001.
[20] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 31, no. 2, pp. 210–227, 2009.
[21] L. Zhang, W.-D. Zhou, P.-C. Chang, J. Liu, Z. Yan, T. Wang, and F.-Z. Li, “Kernel sparse representation-based classifier,” IEEE Trans. Signal Process., vol. 60, no. 4, pp. 1684–1695, Apr. 2012.
[22] Y. Li and A. Ngom, “Sparse representation for the classification of high-dimensional biological data,” BMC Syst. Biol., vol. 07, pp. 306–311, 2013.
[23] M. Choraś, “Lips recognition for biometrics,” Proc. Int. Conf. Biometrics (ICB), pp. 1260–1269, 2009.
[24] H. A. Mahmoud, F. Bin Muhaya, and A. Hafez, “Lip reading based surveillance system,” 2010 5th Int. Conf. Futur. Inf. Technol. Futur. 2010 - Proc., 2010.
[25] S. Sengupta, A. Bhattacharya, P. Desai, and A. Gupta, “Automated lip reading technique for password authentication,” Int. J. Appl. Inf. Syst., vol. 4, no. 3, pp. 18–24, 2012.
[26] P. Singh, V. Laxmi, and M. S. Gaur, “Lip peripheral motion for visual surveillance,” Proc. Fifth Int. Conf. Secur. Inf. Networks, pp. 173–177, 2012.
[27] I. Matthews, T. F. Cootes, J. A. Bangham, S. Cox, and R. Harvey, “Extraction of visual features for lipreading,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 2, pp. 198–213, 2002.
[28] X. Liu and Y. M. Cheung, “Learning multi-boosted HMMs for lip-password based speaker verification,” IEEE Trans. Inf. Forensics Secur., vol. 9, no. 2, pp. 233–246, 2014.
[29] S. W. Foo, Y. Lian, and L. Dong, “Recognition of visual speech elements using adaptively boosted hidden Markov models,” IEEE Trans. Circuits Syst. Video Technol., vol. 14, no. 5, pp. 693–705, 2004.
[30] T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, “Active shape models - their training and application,” Comput. Vis. Image Underst., vol. 61, no. 1, pp. 38–59, 1995.
[31] A. Lanitis, C. J. Taylor, and T. F. Cootes, “Automatic interpretation and coding of face images using flexible models,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 19, no. 7, pp. 743–756, 1997.
[32] T. Cootes and C. Taylor, “Combining point distribution models with shape models based on finite element analysis,” Image Vis. Comput., vol. 13, no. 5, pp. 403–409, 1995.
[33] S. Gao, I. W.-H. Tsang, and L.-T. Chia, “Sparse representation with kernels,” IEEE Trans. Image Process., vol. 22, no. 2, pp. 423–434, Feb. 2013.
[34] T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active appearance models,” Proc. 5th Eur. Conf. Comput. Vis. (Computer Vis. - ECCV’98), vol. 23, no. 6, pp. 484–498, 1998.
[35] T. F. Cootes and C. J. Taylor, “A mixture model for representing shape variation,” Image Vis. Comput., vol. 17, no. 8, pp. 567–573, 1999.
[36] I. Matthews, T. Cootes, S. Cox, R. Harvey, and J. A. Bangham, “Lipreading using shape, shading and scale,” Auditory-Visual Speech Process. (AVSP), 1998.
[37] J. F. Guitarte Pérez, A. F. Frangi, E. L. Solano, and K. Lukas, “Lip reading for robust speech recognition on embedded devices,” ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc., vol. I, pp. 473–476, 2005.
[38] I. Shdaifat, R. Grigat, and D. Langmann, “A system for automatic lip reading,” in AVSP 2003, International Conference on Audio-Visual Speech Processing, 2003.
[39] T. F. Cootes, G. Edwards, and C. J. Taylor, “Comparing active shape models with active appearance models,” Proc. Br. Mach. Vis. Conf. 1999, pp. 18.1–18.10, 1999.
[40] I. Matthews, T. F. Cootes, J. A. Bangham, S. Cox, and R. Harvey, “Extraction of visual features for lipreading,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 2, pp. 198–213, 2002.
[41] T. Ojala, M. Pietikäinen, and T. Mäenpää, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971–987, 2002.
[42] T. Ojala, M. Pietikäinen, and T. Mäenpää, “A generalized local binary pattern operator for multiresolution gray scale and rotation invariant texture classification,” Adv. Pattern Recognit., vol. 2013, pp. 399–408, 2001.
[43] T. Kobayashi and J. Ye, “Acoustic feature extraction by statistics based local binary pattern for environmental sound classification,” in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3052–3056, 2014.
[44] F. Perronnin, J. Sánchez, and T. Mensink, “Improving the fisher kernel for large-scale image classification,” Lect. Notes Comput. Sci., vol. 6314, pp. 143–156, 2010.
[45] V. Kellokumpu, G. Zhao, and M. Pietikäinen, “Human activity recognition using a dynamic texture based method,” Br. Mach. Vis. Conf., pp. 1–10, 2008.
[46] C. H. Chan, B. Goswami, J. Kittler, and W. Christmas, “Local ordinal contrast pattern histograms for spatiotemporal, lip-based speaker authentication,” IEEE Trans. Inf. Forensics Secur., vol. 7, no. 2, pp. 602–612, 2012.
[47] K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, “XM2VTSDB: The extended M2VTS database,” in Second International Conference on Audio and Video-based Biometric Person Authentication (AVBPA’99), pp. 72–77, 1999.
[48] M. S. Bartlett, J. R. Movellan, and T. J. Sejnowski, “Face recognition by independent component analysis,” IEEE Trans. Neural Networks, vol. 13, no. 6, pp. 1450–1464, 2002.
[49] A. J. Bell and T. J. Sejnowski, “An information-maximization approach to blind separation and blind deconvolution,” Neural Comput., vol. 7, no. 6, pp. 1129–1159, 1995.
[50] P. Paatero, “Least squares formulation of robust non-negative factor analysis,” Chemom. Intell. Lab. Syst., vol. 37, no. 1, pp. 23–35, 1997.
[51] D. D. Lee and H. S. Seung, “Learning the parts of objects by non-negative matrix factorization,” Nature, vol. 401, no. 6755, pp. 788–791, 1999.
[52] C. Ding, T. Li, and M. I. Jordan, “Convex and semi-nonnegative matrix factorizations,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 1, pp. 45–55, 2010.
[53] W.-C. Hsieh, C.-W. Ho, V.-H. Duong, Y.-S. Lee, and J.-C. Wang, “2D semi-NMF of scale-frequency map for environmental sound classification,” Signal Inf. Process. Assoc. Annu. Summit Conf. (APSIPA), 2014 Asia-Pacific, pp. 1–4, Dec. 2014.
[54] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer, 1995.
[55] B. E. Boser, I. M. Guyon, and V. N. Vapnik, “A training algorithm for optimal margin classifiers,” Proc. 5th Annu. ACM Work. Comput. Learn. Theory, pp. 144–152, 1992.
[56] M. Gurban and J. P. Thiran, “Audio-visual speech recognition with a hybrid SVM-HMM system,” in 13th European Signal Processing Conference (EUSIPCO), pp. 728–731, 2005.
[57] J. He and Z. Hua, “Lipreading recognition based on SVM and DTAK,” 2010 4th Int. Conf. Bioinforma. Biomed. Eng. iCBBE 2010, no. 2, pp. 1–3, 2010.
[58] A. A. Shaikh, D. K. Kumar, W. C. Yau, and J. Gubbi, “Lip reading using optical flow and support vector machines,” 3rd Int. Congr. Image Signal Process., pp. 327–330, 2010.
[59] M. Gordan, C. Kotropoulos, and I. Pitas, “Visual speech recognition using support vector machines,” 2002 14th Int. Conf. Digit. Signal Process. Proceedings. DSP 2002 (Cat. No.02TH8628), vol. 2, pp. 1093–1096, 2002.
[60] L. R. Rabiner, “A tutorial on hidden Markov models and selected applications in speech recognition,” Proc. IEEE, vol. 77, pp. 257–286, 1989.
[61] N. Morgan and H. Bourlard, “An introduction to hybrid HMM/connectionist continuous speech recognition,” IEEE Signal Process. Mag., vol. 12, no. 3, pp. 25–42, 1995.
[62] S. Gao, I. W. Tsang, and L. Chia, “Kernel sparse representation for image classification and face recognition,” ECCV, pp. 1–14, 2010.
[63] S. Siatras, N. Nikolaidis, M. Krinidis, and I. Pitas, “Visual lip activity detection and speaker detection using mouth region intensities,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 1, pp. 133–137, Jan. 2009.
[64] B. Rivet, L. Girin, and C. Jutten, “Visual voice activity detection as a help for speech source separation from convolutive mixtures,” Speech Commun., vol. 49, no. 7–8, pp. 667–677, 2007.
[65] Q. Liu, W. Wang, and P. Jackson, “A visual voice activity detection method with adaboosting,” in Sensor Signal Processing for Defence, pp. 1–5, 2011.
[66] V. Libal, J. Connell, G. Potamianos, and E. Marcheret, “An embedded system for in-vehicle visual speech activity detection,” 2007 IEEE 9th Int. Work. Multimed. Signal Process. MMSP 2007 - Proc., pp. 255–258, 2007.
[67] T. Huang, G. Yang, and G. Tang, “A fast two-dimensional median filtering algorithm,” IEEE Trans. Acoust., vol. 27, no. 1, 1979.
[68] M. Pietikäinen, A. Hadid, G. Zhao, and T. Ahonen, Computer Vision Using Local Binary Patterns, vol. 40. London: Springer London, 2011.
[69] D. Zhang, S. Chen, and Z. Zhou, “Two-dimensional non-negative matrix factorization for face representation and recognition,” ICCV 2005 Work. Anal. Model. Faces Gestures, pp. 350–363, 2005.
[70] C. J. C. Burges, “A tutorial on support vector machines for pattern recognition,” Data Min. Knowl. Discov., vol. 2, no. 2, pp. 121–167, 1998.
[71] J. Ngiam, A. Khosla, M. Kim, J. Nam, H. Lee, and A. Y. Ng, “Multimodal deep learning,” Proc. 28th Int. Conf. Mach. Learn., pp. 689–696, 2011.