References
[1] Y. Rubner, C. Tomasi, and L. J. Guibas, “The earth mover's distance as a metric for image retrieval,” Int. J. Computer Vision, vol. 40, no. 2, pp. 99–121, 2000.
[2] M. Caetano and F. Wiering, “Theoretical framework of a computational model of auditory memory for music emotion recognition,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2014, pp. 331–336.
[3] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., vol. 20, pp. 33–61, 1998.
[4] G. Collier, “Beyond valence and activity in the emotional connotations of music,” Psychology of Music, vol. 35, no. 1, pp. 110–131, 2007.
[5] J. J. Deng, C. H. C. Leung, A. Milani, and L. Chen, “Emotional states associated with music: Classification, prediction of changes, and consideration in recommendation,” ACM Trans. Intel. Systems & Technology, vol. 5, no. 1, pp. 4:1–4:36, 2015.
[6] D. L. Donoho, “For most large underdetermined systems of linear equations the minimal l1-norm solution is also the sparsest solution,” Comm. Pure Appl. Math., vol. 59, pp. 797–829, 2006.
[7] T. Eerola and J. K. Vuoskoski, “A review of music and emotion studies: Approaches, emotion models, and stimuli,” Music Perception, vol. 30, no. 3, pp. 307–340, 2013.
[8] T. Eerola, “Modelling emotions in music: Advances in conceptual, contextual and validity issues,” in AES International Conference, 2014.
[9] Y. H. Yang and H. H. Chen, “Prediction of the distribution of perceived music emotions using discrete samples,” IEEE Trans. Audio, Speech & Language Processing, vol. 19, no. 7, pp. 2184–2196, 2011.
[10] P. N. Juslin and J. A. Sloboda, Handbook of Music and Emotion: Theory, Research, Applications. New York: Oxford University Press, 2010.
[11] L. B. Meyer, Emotion and Meaning in Music. Chicago: University of Chicago Press, 1956.
[12] A. Gabrielsson, “Emotion perceived and emotion felt: Same or different?” Musicae Scientiae, pp. 123–147, 2002, special issue.
[13] P. Saari, G. Fazekas, T. Eerola, M. Barthet, O. Lartillot, and M. Sandler, “Genre-adaptive semantic computing and audio-based modeling for music mood annotation,” IEEE Trans. Affective Computing, vol. 7, no. 2, pp. 122–135, 2016.
[14] J. D. Gibbons and S. Chakraborti, Nonparametric Statistical Inference. Springer, 2011.
[15] P. O. Hoyer, “Non-negative sparse coding,” in Proc. IEEE Workshop on Neural Networks for Signal Processing, 2002, pp. 557–565.
[16] X. Hu and Y.-H. Yang, “A study on cross-cultural and cross-dataset generalizability of music mood regression models,” in Proc. Sound and Music Computing Conf., 2014, pp. 1149–1155.
[17] A. Huq, J. P. Bello, and R. Rowe, “Automated music emotion recognition: A systematic evaluation,” J. New Music Research, vol. 39, no. 3, pp. 227–244, 2010.
[18] D. Huron, “Perceptual and cognitive applications in music information retrieval,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2000.
[19] K. MacDorman, S. Ough, and C.-C. Ho, “Automatic emotion prediction of song excerpts: Index construction, algorithm design, and empirical comparison,” J. New Music Research, vol. 36, no. 4, pp. 281–299, 2007.
[20] A. W. Bowman and A. Azzalini, Applied Smoothing Techniques for Data Analysis. New York: Oxford University Press, 1997.
[21] Y. E. Kim, E. M. Schmidt, R. Migneco, B. G. Morton, P. Richardson, J. Scott, J. Speck, and D. Turnbull, “Music emotion recognition: A state of the art review,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2010, pp. 255–266.
[22] S. Koelstra, C. Mühl, M. Soleymani, J.-S. Lee, A. Yazdani, T. Ebrahimi, T. Pun, A. Nijholt, and I. Patras, “DEAP: A database for emotion analysis; using physiological signals,” IEEE Trans. Affective Computing, vol. 3, no. 1, pp. 18–31, 2012.
[23] K. Trohidis, G. Tsoumakas, G. Kalliris, and I. Vlahavas, “Multi-label classification of music into emotions,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2008, pp. 325–330.
[24] K. Krippendorff, Content Analysis: An Introduction to Its Methodology. Thousand Oaks, CA: Sage, 2013.
[25] C. Laurier, J. Grivolla, and P. Herrera, “Multimodal music mood classification using audio and lyrics,” in Proc. Int. Conf. Machine Learning and Applications, 2008, pp. 105–111.
[26] J. H. Lee and J. S. Downie, “Survey of music information needs, uses, and seeking behaviours: Preliminary findings,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2004, pp. 441–446.
[27] M. Leman, V. Vermeulen, L. De Voogdt, D. Moelants, and M. Lesaffre, “Prediction of musical affect using a combination of acoustic structural cues,” J. New Music Research, vol. 34, no. 1, pp. 39–67, 2005.
[28] T. Li and M. Ogihara, “Detecting emotion in music,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2003, pp. 239–240.
[29] L. Lu, D. Liu, and H.-J. Zhang, “Automatic mood detection and tracking of music audio signals,” IEEE Trans. Audio, Speech & Language Processing, vol. 14, no. 1, pp. 5–18, 2006.
[30] X. Hu, J. S. Downie, C. Laurier, M. Bay, and A. F. Ehmann, “The 2007 MIREX audio mood classification task: Lessons learned,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2008, pp. 462–467.
[31] Y. H. Yang, Y. C. Lin, Y. F. Su, and H. H. Chen, “A regression approach to music emotion recognition,” IEEE Trans. Audio, Speech & Language Processing, vol. 16, no. 2, pp. 448–457, 2008.
[32] T. Lidy and A. Rauber, “Evaluation of feature extractors and psycho-acoustic transformations for music genre classification,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2005, pp. 34–41, [Online] http://www.ifs.tuwien.ac.at/mir/audiofeatureextraction.html.
[33] T. Hofmann, “Probabilistic latent semantic indexing,” in Proc. ACM SIGIR Conf. Research and Development in Information Retrieval, 1999, pp. 50–57.
[34] M. Soleymani, M. N. Caro, E. M. Schmidt, C. Y. Sha, and Y. H. Yang, “1000 songs for emotional analysis of music,” in Proc. ACM Int. Workshop on Crowdsourcing for Multimedia, 2013, pp. 1–6.
[35] F. Eyben, F. Weninger, F. Gross, and B. Schuller, “Recent developments in openSMILE, the Munich open-source multimedia feature extractor,” in Proc. ACM Int. Conf. Multimedia, 2013, pp. 835–838, [Online] http://www.audeering.com/research/opensmile.
[36] T. Hill and P. Lewicki, Statistics: Methods and Applications. StatSoft, 2005.
[37] L. Lu, D. Liu, and H.-J. Zhang, “Automatic mood detection and tracking of music audio signals,” IEEE Trans. Audio, Speech & Language Processing, vol. 14, no. 1, pp. 5–18, 2006.
[38] R. Panda, R. Malheiro, B. Rocha, A. Oliveira, and R. Paiva, “Multi-modal music emotion recognition: A new dataset, methodology and comparative analysis,” in Proc. Int. Soc. Computer Music Modelling & Retrieval, 2013, pp. 1–13.
[39] A. Rodà, S. Canazza, and G. De Poli, “Clustering affective qualities of classical music: Beyond the valence-arousal plane,” IEEE Trans. Affective Computing, vol. 5, no. 4, pp. 364–376, 2014.
[40] J. A. Russell, “A circumplex model of affect,” J. Personality & Social Psychology, vol. 39, no. 6, pp. 1161–1178, 1980.
[41] E. M. Schmidt and Y. E. Kim, “Prediction of time-varying musical mood distributions from audio,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2010.
[42] E. M. Schmidt and Y. E. Kim, “Modeling musical emotion dynamics with conditional random fields,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2011.
[43] B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Comput., vol. 10, no. 5, pp. 1299–1319, 1998. [Online]. Available: http://dx.doi.org/10.1162/089976698300017467
[44] E. Schubert, “Modeling perceived emotion with continuous musical features,” Music Perception, vol. 21, no. 4, pp. 561–585, 2004.
[45] J. F. Sturm, “Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones,” Optim. Meth. Softw., vol. 11, no. 1–4, pp. 625–653, 1999.
[46] A. Singhi and D. Brown, “On cultural, textual and experiential aspects of music mood,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2014, pp. 3–8.
[47] M. Soleymani, A. Aljanaki, Y.-H. Yang, M. N. Caro, F. Eyben, K. Markov, B. W. Schuller, R. Veltkamp, F. Weninger, and F. Wiering, “Emotional analysis of music: A comparison of methods,” in Proc. ACM Multimedia, 2014, pp. 1161–1164.
[48] J. A. Speck, E. M. Schmidt, B. G. Morton, and Y. E. Kim, “A comparative study of collaborative vs. traditional musical mood annotation,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2011.
[49] O. Lartillot and P. Toiviainen, “MIR in Matlab (II): A toolbox for musical feature extraction from audio,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2007, pp. 127–130, [Online] http://users.jyu.fi/~lartillo/mirtoolbox/.
[50] J. C. Wang, Y. H. Yang, H. M. Wang, and S. K. Jeng, “Modeling the affective content of music with a Gaussian mixture model,” IEEE Trans. Affective Computing, vol. 6, no. 1, pp. 56–68, 2015.
[51] J. K. Vuoskoski and T. Eerola, “Measuring music-induced emotion: A comparison of emotion models, personality biases, and intensity of experiences,” Musicae Scientiae, vol. 15, no. 2, pp. 159–173, 2011.
[52] F. Weninger, F. Eyben, and B. Schuller, “On-line continuous-time music mood regression with deep recurrent neural networks,” in Proc. IEEE Int. Conf. Acoustics, Speech, & Signal Processing, 2014, pp. 5449–5453.
[53] Y.-H. Yang, Y.-F. Su, Y.-C. Lin, and H. H. Chen, “Music emotion recognition: The role of individuality,” in Proc. ACM Int. Workshop on Human-centered Multimedia, 2007, pp. 13–21.
[54] Y. H. Yang, Y. C. Lin, Y. F. Su, and H. H. Chen, “A regression approach to music emotion recognition,” IEEE Trans. Audio, Speech & Language Processing, vol. 16, no. 2, pp. 448–457, 2008.
[55] Y.-H. Yang and H. H. Chen, “Ranking-based emotion recognition for music organization and retrieval,” IEEE Trans. Audio, Speech & Language Processing, vol. 19, no. 4, pp. 762–774, 2011.
[56] Y.-H. Yang and H.-H. Chen, “Machine recognition of music emotion: A review,” ACM Trans. Intel. Systems & Technology, vol. 3, no. 4, 2012.
[57] Y.-H. Yang and J.-Y. Liu, “Quantitative study of music listening behavior in a social and affective context,” IEEE Trans. Multimedia, vol. 15, no. 6, pp. 1304–1315, 2013.
[58] “Development of a global mental health action plan 2013-2020,” World Health Organization, Nov. 2012.
[59] S. Lyubomirsky, L. King, and E. Diener, “The benefits of frequent positive affect: does happiness lead to success?,” Psychological Bulletin, vol. 131, no. 6, pp. 803–855, Nov. 2005.
[60] M. E. P. Seligman and M. Csikszentmihalyi, “Positive psychology: An introduction,” American Psychologist, vol. 55, no. 1, pp. 5–14, Jan. 2000.
[61] J. Helliwell, R. Layard, and J. Sachs, “World Happiness Report,” The Earth Institute, Columbia University, New York, United States, Apr. 2012.
[62] S. B. F. Hargens, “Integral development — Taking the middle path towards gross national happiness,” Journal of Bhutan Studies, vol. 6, pp. 24–87, 2002.
[63] D. McDuff, A. Karlson, A. Kapoor, A. Roseway, and M. Czerwinski, “AffectAura: Emotional wellbeing reflection system,” in Proc. 2012 6th Int. Conf. Pervasive Computing Technologies for Healthcare, San Diego, California, United States, 2012, May 21–24, pp. 199–200.
[64] A. Tawari and M. M. Trivedi, “Speech emotion analysis: Exploring the role of context,” IEEE Trans. Multimedia, vol. 12, no. 6, pp. 502–509, Oct. 2010.
[65] A. Madan, M. Cebrian, S. Moturu, K. Farrahi, and A. S. Pentland, “Sensing the ‘Health State’ of a community,” IEEE Pervasive Computing, vol. 11, no. 4, pp. 36–45, Oct.–Dec. 2012.
[66] N. K. Suryadevara and S. C. Mukhopadhyay, “Wireless sensor network based home monitoring system for wellness determination of elderly,” IEEE Sensors Journal, vol. 12, no. 6, pp. 1965–1972, Jun. 2012.
[67] C. A. Frantzidis, C. Bratsas, M. A. Klados, E. Konstantinidis, C. D. Lithari, A. B. Vivas, C. L. Papadelis, E. Kaldoudi, C. Pappas, and P. D. Bamidis, “On the classification of emotional biosignals evoked while viewing affective pictures: An integrated data-mining-based approach for healthcare applications,” IEEE Trans. Information Technology in Biomedicine, vol. 14, no. 2, pp. 309–318, Mar. 2010.
[68] T. Taleb, D. Bottazzi, and N. Nasser, “A novel middleware solution to improve ubiquitous healthcare systems aided by affective information,” IEEE Trans. Information Technology in Biomedicine, vol. 14, no. 2, pp. 335–349, Mar. 2010.
[69] I. Luengo, E. Navas, and I. Hernáez, “Feature analysis and evaluation for automatic emotion identification in speech,” IEEE Trans. Multimedia, vol. 12, no. 6, pp. 490–501, Oct. 2010.
[70] A. Ortony, G. L. Clore, and A. Collins, The Cognitive Structure of Emotions. New York, NY: Cambridge University Press, May 1990.
[71] R. Plutchik, The Psychology and Biology of Emotion. New York, NY: Harper Collins College, Jan. 1994.
[72] L. Vidrascu and L. Devillers, “Annotation and detection of blended emotions in real human-human dialogs recorded in a call center,” in Proc. 2005 IEEE Int. Conf. Multimedia and Expo, Amsterdam, Netherlands, 2005, Jul. 06–09, pp. 944–947.
[73] P. C. Bagshaw, M. Jack, and J. Laver, “Automatic prosodic analysis for computer aided pronunciation teaching,” Ph.D. dissertation, Center for Speech Technology Research, University of Edinburgh, Edinburgh, Scotland, United Kingdom, 1994.
[74] C. Busso, S. Lee, and S. Narayanan, “Analysis of emotionally salient aspects of fundamental frequency for emotion detection,” IEEE Trans. Audio, Speech, and Language Processing, vol. 17, no. 4, pp. 582–596, May 2009.
[75] C. E. Williams and K. N. Stevens, “Emotions and speech: Some acoustical correlates,” Journal of the Acoustical Society of America, vol. 52, no. 4B, pp. 1238–1250, 1972.
[76] I. R. Murray and J. L. Arnott, “Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion,” Journal of the Acoustical Society of America, vol. 93, no. 2, pp. 1097–1108, Feb. 1993.
[77] R. Banse and K. R. Scherer, “Acoustic profiles in vocal emotion expression,” Journal of Personality and Social Psychology, vol. 70, no. 3, pp. 614–636, Mar. 1996.
[78] S. McGilloway, R. Cowie, and E. Douglas-Cowie, “Prosodic signs of emotion in speech: Preliminary results from a new technique for automatic statistical analysis,” in Proc. 13th Int. Congr. Phonetic Sciences, Stockholm, Sweden, 1995, Aug. 13–19, pp. 250–253.
[79] S. McGilloway, R. Cowie, E. Douglas-Cowie, S. Gielen, M. Westerdijk, and S. Stroeve, “Approaching automatic recognition of emotion from voice: A rough benchmark,” in Proc. ISCA Tutorial and Research Workshop on Speech and Emotion, Newcastle, Northern Ireland, United Kingdom, 2000, Sep. 05–07, pp. 207–212.
[80] R. Cowie and E. Douglas-Cowie, “Automatic statistical analysis of the signal and prosodic signs of emotion in speech,” in Proc. 4th Int. Conf. Spoken Language Processing, Philadelphia, Pennsylvania, United States, 1996, Oct. 03–06, pp. 1989–1992.
[81] C. M. Lee and S. S. Narayanan, “Toward detecting emotions in spoken dialogs,” IEEE Trans. Speech and Audio Processing, vol. 13, no. 2, pp. 293–303, Mar. 2005.
[82] E. Mower, M. J. Mataric, and S. Narayanan, “A framework for automatic human emotion classification using emotion profiles,” IEEE Trans. Audio, Speech, and Language Processing, vol. 19, no. 5, pp. 1057–1070, Jul. 2011.
[83] D. Wu, T. D. Parsons, E. Mower, and S. Narayanan, “Speech emotion estimation in 3D space,” in Proc. 2010 IEEE Int. Conf. Multimedia and Expo, Singapore, 2010, Jul. 19–23, pp. 737–742.
[84] K. R. Scherer, “Vocal communication of emotion: A review of research paradigms,” Speech Communication, vol. 40, no. 1–2, pp. 227–256, Apr. 2003.
[85] R. Cowie, E. Douglas-Cowie, N. Tsapatsoulis, G. Votsis, S. Kollias, W. Fellenz, and J. G. Taylor, “Emotion recognition in human-computer interaction,” IEEE Signal Processing Magazine, vol. 18, no. 1, pp. 32–80, Jan. 2001.
[86] T. L. Nwe, S. W. Foo, and L. C. De Silva, “Speech emotion recognition using hidden Markov models,” Speech Communication, vol. 41, no. 4, pp. 603–623, Nov. 2003.
[87] P. Dunker, S. Nowak, A. Begau, and C. Lanz, “Content-based mood classification for photos and music: A generic multi-modal classification framework and evaluation approach,” in Proc. 1st ACM Int. Conf. Multimedia Information Retrieval, Vancouver, British Columbia, Canada, 2008, Oct. 30–31, pp. 97–104.
[88] B. Schuller, G. Rigoll, and M. Lang, “Hidden Markov model-based speech emotion recognition,” in Proc. 2003 IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Hong Kong, China, 2003, Apr. 06–10, pp. II-1–II-4.
[89] J. V. Tu, “Advantages and disadvantages of using artificial neural networks versus logistic regression for predicting medical outcomes,” Journal of Clinical Epidemiology, vol. 49, no. 11, pp. 1225–1231, Nov. 1996.
[90] M. W. Kadous, “Temporal classification: Extending the classification paradigm to multivariate time series,” Ph.D. dissertation, School of Computer Science and Engineering, University of New South Wales, Sydney, New South Wales, Australia, Oct. 2002.
[91] N. E. Gillian, “Gesture recognition for musician computer interaction,” Ph.D. dissertation, Faculty of Arts, Humanities and Social Sciences, School of Music and Sonic Arts, Queen's University Belfast, Belfast, County Antrim, Northern Ireland, United Kingdom, Mar. 2011.
[92] J. M. K. Kua, E. Ambikairajah, J. Epps, and R. Togneri, “Speaker verification using sparse representation classification,” in Proc. 2011 IEEE Int. Conf. Acoustics, Speech, and Signal Processing, Prague, Czech Republic, 2011, May 22–27, pp. 4548–4551.
[93] K. Huang and S. Aviyente, “Sparse representation for signal classification,” in Proc. 20th Annual Conf. Neural Information Processing Systems, Vancouver, British Columbia, Canada, 2006, Dec. 04–07, pp. 609–616.
[94] J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210–227, Mar. 2009.
[95] N. Cho and C.-C. J. Kuo, “Sparse representation of musical signals using source-specific dictionaries,” IEEE Signal Processing Letters, vol. 17, no. 11, pp. 913–916, Nov. 2010.
[96] N. Cho and C.-C. J. Kuo, “Sparse music representation with source-specific dictionaries and its application to signal separation,” IEEE Trans. Audio, Speech, and Language Processing, vol. 19, no. 2, pp. 326–337, Feb. 2011.
[97] S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dictionaries,” IEEE Trans. Signal Processing, vol. 41, no. 12, pp. 3397–3415, Dec. 1993.
[98] S. P. Ebenezer, A. Papandreou-Suppappola, and S. B. Suppappola, “Classification of acoustic emissions using modified matching pursuit,” EURASIP Journal on Applied Signal Processing, vol. 2004, no. 3, pp. 347–357, 2004.
[99] J. C. Wang, C. H. Lin, B. W. Chen, and M. K. Tsai, “Gabor-based nonuniform scale-frequency map for environmental sound classification in home automation,” IEEE Trans. Automation Science and Engineering, vol. 11, no. 2, pp. 607–613, Apr. 2014.
[100] S. Chu, S. Narayanan, and C.-C. J. Kuo, “Environmental sound recognition with time-frequency audio features,” IEEE Trans. Audio, Speech, and Language Processing, vol. 17, no. 6, pp. 1142–1158, Aug. 2009.
[101] K. Umapathy and S. Krishnan, “Time-width versus frequency band mapping of energy distributions,” IEEE Trans. Signal Processing, vol. 55, no. 3, pp. 978–989, Mar. 2007.
[102] S. Wang, A. Sekey, and A. Gersho, “An objective measure for predicting subjective quality of speech coders,” IEEE Journal on Selected Areas in Communications, vol. 10, no. 5, pp. 819–829, 1992.
[103] L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition. Upper Saddle River, NJ: Prentice-Hall, 1993.
[104] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models, 2nd ed. New York, NY: Springer-Verlag, Apr. 1999.
[105] J. D. Durrant and J. H. Lovrinic, Bases of Hearing Science, 3rd ed. Baltimore, MD: Lippincott Williams and Wilkins, Jan. 1995.
[106] B. Moore, An Introduction to the Psychology of Hearing, 5th ed. Bingley, United Kingdom: Emerald Group Publishing Ltd., Jan. 2003.
[107] W. A. Yost and R. R. Fay, Auditory Perception of Sound Sources. New York, NY: Springer-Verlag, Nov. 2007.
[108] W. Brent, “Perceptually based pitch scales in cepstral techniques for percussive timbre identification,” in Proc. International Computer Music Conference, Montreal, Québec, Canada, 2009, Aug. 16–21, pp. 121–124.
[109] I. Luengo, E. Navas, I. Hernáez, and J. Sánchez, “Automatic emotion recognition using prosodic parameters,” in Proc. 9th European Conference on Speech Communication and Technology (Interspeech 2005), Lisbon, Portugal, 2005, Sep. 04–08, pp. 493–496.
[110] C.-W. Hsu and C.-J. Lin, “A comparison of methods for multiclass support vector machines,” IEEE Trans. Neural Networks, vol. 13, no. 2, pp. 415–425, Mar. 2002.
[111] M. Elad and A. M. Bruckstein, “A generalized uncertainty principle and sparse representation in pairs of bases,” IEEE Trans. Information Theory, vol. 48, no. 9, pp. 2558–2567, Sep. 2002.
[112] R. Rubinstein, M. Zibulevsky, and M. Elad, “Double sparsity: Learning sparse dictionaries for sparse signal approximation,” IEEE Trans. Signal Processing, vol. 58, no. 3, pp. 1553–1564, Mar. 2010.
[113] K. T. Vo and A. Sowmya, “Multiscale sparse representation of high-resolution computed tomography (HRCT) lung images for diffuse lung disease classification,” in Proc. 2011 18th IEEE Int. Conf. Image Processing, Brussels, Belgium, 2011, Sep. 11–14, pp. 441–444.
[114] D. L. Donoho and M. Elad, “Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization,” Proceedings of the National Academy of Sciences, vol. 100, no. 5, pp. 2197–2202, Mar. 2003.
[115] E. J. Candes and T. Tao, “Decoding by linear programming,” IEEE Trans. Information Theory, vol. 51, no. 12, pp. 4203–4215, Dec. 2005.
[116] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM Journal on Scientific Computing, vol. 20, no. 1, pp. 33–61, 1998.
[117] E. H. Kim, K. H. Hyun, S. H. Kim, and Y. K. Kwak, “Improved emotion recognition with a novel speaker-independent feature,” IEEE/ASME Trans. Mechatronics, vol. 14, no. 3, pp. 317–325, Jun. 2009.
[118] FoCal toolbox, [Online] https://sites.google.com/site/nikobrummer/focal.
[119] E. B. Gouvea, “Acoustic-feature-based frequency warping for speaker normalization,” Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania, Dec. 1998.
[120] E. M. Schmidt and Y. E. Kim, “Prediction of time-varying musical mood distributions using Kalman filtering,” in Proc. Int. Conf. Machine Learning and Applications, 2010, pp. 655–660.
[121] Y. Imbrasaite, T. Baltrusaitis, and P. Robinson, “CCNF for continuous emotion tracking in music: Comparison with CCRF and relative feature representation,” in Proc. IEEE Int. Conf. Multimedia and Expo, 2014.
[122] Y. H. Yang, Y. C. Lin, H. T. Cheng, I. B. Liao, Y. C. Ho, and H. H. Chen, “Toward multi-modal music emotion classification,” Advances in Multimedia Information Processing, pp. 70–79, 2008.
[123] D. Su, P. Fung, and N. Auguin, “Multimodal music emotion classification using AdaBoost with decision stumps,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, 2013, pp. 3447–3451.
[124] X. Hu, K. Choi, and J. S. Downie, “A framework for evaluating multimodal music mood classification,” Journal of the Association for Information Science and Technology, 2016.
[125] S. O. Ali and Z. F. Peynircioglu, “Songs and emotions: Are lyrics and melodies equal partners?,” Psychology of Music, vol. 34, no. 4, pp. 511–534, Oct. 2006.
[126] K. Mori and M. Iwanaga, “Pleasure generated by sadness: Effect of sad lyrics on the emotions induced by happy music,” Psychology of Music, vol. 42, no. 5, pp. 643–652, Sep. 2014.
[127] R. Pascanu, C. Gulcehre, K. Cho, and Y. Bengio, “How to construct deep recurrent neural networks,” in Proc. Int. Conf. Learning Representations, 2014.
[128] P.-S. Huang, M. Kim, M. Hasegawa-Johnson, and P. Smaragdis, “Singing-voice separation from monaural recordings using deep recurrent neural networks,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2014.
[129] R. Bittner, J. Salamon, M. Tierney, M. Mauch, C. Cannam, and J. P. Bello, “MedleyDB: A multitrack dataset for annotation-intensive MIR research,” in Proc. Int. Soc. Music Info. Retrieval Conf., 2014.
[130] E. Vincent, R. Gribonval, and C. Févotte, “Performance measurement in blind audio source separation,” IEEE Trans. Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1462–1469, Jul. 2006.
[131] J. H. Lee, T. Hill, and L. Work, “What does music mood mean for real users?,” in Proc. iConference, 2012, pp. 112–119.
[132] C. L. Hsu and J. S. R. Jang, “On the improvement of singing voice separation for monaural recordings using the MIR-1K dataset,” IEEE Trans. Audio, Speech, and Language Processing, vol. 18, no. 2, pp. 310–319, Feb. 2010.