References
[1] ISO/IEC 11172-3 (F) (1999) Information technology - Coding of moving pictures and associated audio for digital storage media at up to about 1.5 Mbit/s, Part 3: Audio.
[2] ISO/IEC 13818-7 (1997) Information technology - Generic coding of moving pictures and associated audio, Part 7: Advanced Audio Coding.
[3] R. B. Dannenberg, W. P. Birmingham, B. Pardo, N. Hu, C. Meek, and G. Tzanetakis, “A comparative evaluation of search techniques for query-by-humming using the MUSART testbed,” Journal of the American Society for Information Science and Technology, vol. 58, no. 5, pp. 687-701, 2007.
[4] J. Serrà, E. Gómez, and P. Herrera, “Audio cover song identification and similarity: background, approaches, evaluation and beyond,” in Advances in Music Information Retrieval, Springer, Germany, 2010.
[5] T. Fujishima, “Realtime chord recognition of musical sound: A system using Common Lisp Music,” in Proc. Int. Comput. Music Conf., pp. 464-467, 1999.
[6] M. Müller and S. Ewert, “Towards timbre-invariant audio features for harmony-based music,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 3, pp. 649-662, 2010.
[7] J. P. Bello and J. Pickens, “A robust mid-level representation for harmonic content in music signals,” in Proc. Int. Conf. Music Inf. Retrieval, pp. 304-311, 2005.
[8] D. Gusfield, Algorithms on strings, trees and sequences: computer science and computational biology, Cambridge University Press, 1997.
[9] L. R. Rabiner and B. H. Juang, Fundamentals of speech recognition, Prentice Hall, Englewood Cliffs, NJ, 1993.
[10] V. I. Levenshtein, “Binary codes capable of correcting deletions, insertions, and reversals,” Soviet Physics-Doklady, vol. 10, no. 8, pp. 707-710, 1966.
[11] S. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of Molecular Biology, vol. 48, no. 3, pp. 443-453, 1970.
[12] P. H. Sellers, “On the theory and computation of evolutionary distances,” SIAM Journal on Applied Mathematics, vol. 26, no. 4, pp. 787-793, 1974.
[13] T. F. Smith and M. S. Waterman, “Identification of common molecular subsequences,” Journal of Molecular Biology, vol. 147, no. 1, pp. 195-197, 1981.
[14] D. P. W. Ellis and G. E. Poliner, “Identifying cover songs with chroma features and dynamic programming beat tracking,” MIREX extended abstract, 2006.
[15] D. P. W. Ellis and G. E. Poliner, “Identifying cover songs with chroma features and dynamic programming beat tracking,” in Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), vol. 4, pp. 1429-1432, April 2007.
[16] E. Gómez, Tonal description of music audio signals, Ph.D. dissertation, Music Technol. Group, Univ. Pompeu Fabra, Barcelona, Spain, 2006.
[17] E. Gómez and P. Herrera, “Estimating the tonality of polyphonic audio files: Cognitive versus machine learning modelling strategies,” in Proc. Int. Symp. Music Inf. Retrieval (ISMIR), pp. 92-95, 2004.
[18] M. Riley, E. Heinen, and J. Ghosh, “A text retrieval approach to content-based audio retrieval,” in Proc. Int. Symp. on Music Information Retrieval (ISMIR), pp. 295-300, Sep. 2008.
[19] C. Todd, “A Digital Audio System for Broadcast and Prerecorded Media,” in Proc. 75th Conv. Aud. Eng. Soc., Mar. 1984.
[20] E. F. Schroder and W. Voessing, “High Quality Digital Audio Encoding With 3.0 Bits/Sample Using Adaptive Transform Coding,” in Proc. 80th Conv. Aud. Eng. Soc., Mar. 1986.
[21] G. Theile, M. Link, and G. Stoll, “Low-Bit Rate Coding of High Quality Audio Signals,” in Proc. 82nd Conv. Aud. Eng. Soc., Mar. 1987.
[22] K. Brandenburg, “OCF – A New Coding Algorithm for High Quality Sound Signals,” Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), vol. 12, pp. 141-144, Apr. 1987.
[23] J. Johnston, “Transform Coding of Audio Signals Using Perceptual Noise Criteria,” IEEE J. Sel. Areas in Comm., vol. 6, no. 2, pp. 314-323, Feb. 1988.
[24] W. Y. Chan and A. Gersho, “High Fidelity Audio Transform Coding With Vector Quantization,” Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), vol. 2, pp. 1109-1112, Apr. 1990.
[25] K. Brandenburg and J. D. Johnston, “Second Generation Perceptual Audio Coding: The Hybrid Coder,” in Proc. 88th Conv. Aud. Eng. Soc., Mar. 1990.
[26] K. Brandenburg et al., “ASPEC: Adaptive Spectral Entropy Coding of High Quality Music Signals,” in Proc. 90th Conv. Aud. Eng. Soc., Feb. 1991.
[27] Y. F. Dehery, M. Lever, and P. Urcum, “A MUSICAM Source Codec for Digital Audio Broadcasting and Storage,” Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), vol. 5, pp. 3605-3608, Apr. 1991.
[28] M. Iwadare et al., “A 128 kb/s Hi-Fi Audio Codec Based on Adaptive Transform Coding with Adaptive Block Size MDCT,” IEEE J. Sel. Areas in Comm., vol. 10, no. 1, pp. 138-144, Jan. 1992.
[29] T. Painter and A. Spanias, “Perceptual coding of digital audio,” Proceedings of the IEEE, vol. 88, no. 4, pp. 451-513, Apr. 2000.
[30] S. Vernon, “Design and implementation of AC-3 Coders,” IEEE Transactions on Consumer Electronics, vol. 41, no. 3, pp. 754-759, Aug. 1995.
[31] H. Sakamoto, Y. Shibuya, H. Takano, and O. Kitabatake, “A Dolby AC-3/MPEG1 Audio Decoder Core suitable for Audio/Visual System Integration,” IEEE Custom Integrated Circuits Conference, pp. 241-248, Nov. 1997.
[32] D. Pan, “A Tutorial on MPEG/Audio Compression,” IEEE Multimedia, vol. 2, no. 2, pp. 60-71, 1995.
[33] E. Zwicker and H. Fastl, Psychoacoustics - Facts and Models, Springer, Berlin, Heidelberg, 1990.
[34] J. D. Johnston and A. J. Ferreira, “Sum-difference stereo transform coding,” in Proc. IEEE Int. Conf. Acoustics, Speech, Signal Processing (ICASSP), vol. 2, pp. 569-572, San Francisco, USA, March 1992.
[35] K. H. Huang and J. F. Yang, Low Data Rate MPEG-1 Layer III Audio Coder Enhancement, Thesis for Master of Science, Department of Electrical Engineering, National Cheng Kung University, 2002.
[36] N. V. Patel and I. K. Sethi, “Audio characterization for video indexing,” in Proc. SPIE, vol. 2670, pp. 373-384, 1996.
[37] Y. Nakajima, Y. Lu, M. Sugano, A. Yoneyama, H. Yamagihara, and A. Kurematsu, “A fast audio classification from MPEG coded data,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 6, pp. 3005-3008, 1999.
[38] X. Shao, C. Xu, Y. Wang, and M. Kankanhalli, “Automatic music summarization in compressed domain,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP), vol. 4, pp. 261-264, 2004.
[39] T. M. Chang, E. T. Chen, C. B. Hsieh, and P. C. Chang, “Cover song identification with direct chroma feature extraction from AAC files,” IEEE 2nd Global Conference on Consumer Electronics, pp. 55-56, 2013.
[40] E. Ravelli, G. Richard, and L. Daudet, “Audio signal representations for indexing in the transform domain,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 3, pp. 434-446, 2010.
[41] C. H. Yu and S. D. You, “On the possibility of only using long windows in MPEG-2 AAC coding,” IEEE Pacific Rim Conference on Multimedia, pp. 663-670, 2002.
[42] T. H. Tsai and C. Liu, “A configurable common filterbank processor for multi-standard audio decoder,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 90, no. 9, pp. 1913-1923, 2007.
[43] J. W. Cooley and J. W. Tukey, “An algorithm for the machine calculation of complex Fourier series,” Mathematics of Computation, vol. 19, pp. 297-301, 1965.
[44] S. Chen, N. Xiong, J. Park, M. Chen, and R. Hu, “Spatial parameters for audio coding: MDCT domain analysis and synthesis,” Multimedia Tools and Applications, vol. 48, no. 2, pp. 225-246, 2010.
[45] H. Malvar, Signal processing with lapped transforms, Artech House, Inc., 1992.
[46] J. Fan and Q. Yao, Nonlinear time series: nonparametric and parametric methods, Springer, 2005.
[47] G. Hinsen and D. Klösters, “The sampling series as a limiting case of Lagrange interpolation,” Applicable Analysis, vol. 49, no. 1-2, pp. 49-60, 1993.
[48] Programs for Digital Signal Processing, IEEE Press, 1979.
[49] G. Oetken, T. W. Parks, and H. W. Schussler, “New results in the design of digital interpolators,” IEEE Trans. Acoust., Speech, Signal Processing, vol. 23, no. 3, pp. 301-309, 1975.
[50] J. Serrà, E. Gómez, and P. Herrera, Advances in music information retrieval, Springer-Verlag, Berlin Heidelberg, 2010.
[51] J. Serrà, E. Gómez, P. Herrera, and X. Serra, “Chroma binary similarity and local alignment applied to cover song identification,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 16, no. 6, pp. 1138-1151, 2008.
[52] S. Ravuri and D. P. W. Ellis, “The hydra system of unstructured cover song detection,” Ext. Abstract for the MIREX Audio Cover Song Identification task submission, Kobe, Japan, 2009.
[53] T. Bertin-Mahieux and D. P. W. Ellis, “Large-scale cover song recognition using hashed chroma landmarks,” IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, pp. 117-120, 2011.
[54] T. Bertin-Mahieux, D. P. W. Ellis, B. Whitman, and P. Lamere, “The million song dataset,” in Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011.
[55] S. Chakrabarti, R. Khanna, U. Sawant, and C. Bhattacharyya, “Structured learning for non-smooth ranking losses,” Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 88-96, 2008.
[56] M. H. Lee, S. Rho, and E. I. Choi, “Ontology based user query interpretation for semantic multimedia contents retrieval,” Multimedia Tools and Applications, doi:10.1007/s11042-013-1383-2, 2013.