References
[1] R. Hennequin, B. David, and R. Badeau, “Score informed audio source separation using a parametric model of non-negative spectrogram,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2011, pp. 45–48.
[2] Z. Duan and B. Pardo, “Soundprism: An online system for score-informed source separation of music audio,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 6, pp. 1205–1215, Dec. 2011.
[3] J. Ganseman, P. Scheunders, G. J. Mysore, and J. S. Abel, “Evaluation of a score-informed source separation system,” in Proc. Int. Soc. Music Inf. Retrieval (ISMIR), 2010, pp. 219–224.
[4] J. Woodruff, B. Pardo, and R. B. Dannenberg, “Remixing stereo music with score-informed source separation,” in Proc. Int. Conf. Music Inf. Retrieval (ISMIR), 2006, pp. 314–349.
[5] A. Klapuri and M. Davy, Eds., Signal Processing Methods for Music Transcription. New York, NY, USA: Springer, 2006.
[6] D. Campbell, K. Palomäki, and G. Brown, “A MATLAB simulation of shoebox room acoustics for use in research and teaching,” Comput. Inf. Syst. J., vol. 9, no. 3, pp. 48–51, Oct. 2005.
[7] N. Kroher and E. Gómez, “Automatic Transcription of Flamenco Singing From Polyphonic Music Recordings,” IEEE/ACM Trans. Audio, Speech, Lang. Process., pp. 901–913, 2016, doi: 10.1109/TASLP.2016.2531284.
[8] A. de Cheveigné and H. Kawahara, “YIN, a fundamental frequency estimator for speech and music,” J. Acoust. Soc. Amer., vol. 111, pp. 1917–1930, 2002.
[9] M. Mauch, C. Cannam, R. Bittner, G. Fazekas, J. Salamon, J. Dai, J. Bello, and S. Dixon, “Computer-aided melody note transcription using the Tony software: Accuracy and efficiency,” in Proc. 1st Int. Conf. Technologies for Music Notation and Representation, 2015.
[10] Z. Duan, J. Han, and B. Pardo, “Multi-pitch streaming of harmonic sound mixtures,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 1, Jan. 2014.
[11] P. Leveau, E. Vincent, G. Richard, and L. Daudet, “Instrument-specific harmonic atoms for mid-level music representation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 16, no. 1, pp. 116–128, Jan. 2008.
[12] V. Arora and L. Behera, “On-line melody extraction from polyphonic audio using harmonic cluster tracking,” IEEE Trans. Audio, Speech, Lang. Process., vol. 21, no. 3, pp. 520–530, Mar. 2013.
[13] J. Salamon, E. Gómez, D. P. W. Ellis, and G. Richard, “Melody extraction from polyphonic music signals: Approaches, applications, and challenges,” IEEE Signal Process. Mag., pp. 118–134, 2014.
[14] P. Mowlaee, R. Saeidi, M. G. Christensen, Z.-H. Tan, T. Kinnunen, P. Franti, and S. H. Jensen, “A joint approach for single-channel speaker identification and speech separation,” IEEE Trans. Audio, Speech, Lang. Process., vol. 20, no. 9, pp. 2586–2601, Nov. 2012.
[15] M. Cooke, J. R. Hershey, and S. Rennie, “Monaural speech separation and recognition challenge,” Comput. Speech Lang., vol. 24, pp. 1–15, 2010.
[16] Y.-K. Lee, I. S. Lee, and O.-W. Kwon, “Single channel speech separation using phase-based methods,” IEEE Trans. Consum. Electron., pp. 2453–2459, 2010.
[17] D.-N. Jiang, W. Zhang, L.-Q. Shen, and L.-H. Cai, “Prosody analysis and modeling for emotional speech synthesis,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2005, pp. 281–284.
[18] S. Sigtia, E. Benetos, and S. Dixon, “An end-to-end neural network for polyphonic piano music transcription,” IEEE/ACM Trans. Audio, Speech, Lang. Process., pp. 927–939, 2016, doi: 10.1109/TASLP.2016.2533858.
[19] V. Arora and L. Behera, “Musical source clustering and identification in polyphonic audio,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 6, pp. 1003–1012, Jun. 2014.
[20] T. Heittola, A. Klapuri, and T. Virtanen, “Musical instrument recognition in polyphonic audio using source-filter model for sound separation,” in Proc. Int. Soc. Music Inf. Retrieval (ISMIR), 2009.
[21] E. Benetos, M. Kotti, and C. Kotropoulos, “Musical instrument classification using non-negative matrix factorization algorithms and subset feature selection,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2006, pp. 221–224.
[22] R. Jaiswal, D. FitzGerald, D. Barry, E. Coyle, and S. Rickard, “Clustering NMF basis functions using shifted NMF for monaural sound source separation,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2011, pp. 245–248.
[23] F. Rigaud, A. Falaize, B. David, and L. Daudet, “Does inharmonicity improve an NMF-based piano transcription model?” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2013, pp. 11–15.
[24] P. Smaragdis, B. Raj, and M. Shashanka, “A probabilistic latent variable model for acoustic modeling,” in Adv. Models for Acoust. Process., NIPS, vol. 148, 2006.
[25] G. Grindlay and D. P. W. Ellis, “Transcribing multi-instrument polyphonic music with hierarchical eigeninstruments,” IEEE J. Sel. Topics Signal Process., vol. 5, no. 6, pp. 1159–1169, Oct. 2011.
[26] V. Arora and L. Behera, “Semi-supervised polyphonic source identification using PLCA based graph clustering,” in Proc. Int. Soc. Music Inf. Retrieval (ISMIR), 2013.
[27] V. Arora and L. Behera, “Musical source clustering and identification in polyphonic audio,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 6, pp. 1003–1012, Jun. 2014.
[28] L. G. Martins, J. J. Burred, G. Tzanetakis, and M. Lagrange, “Polyphonic instrument recognition using spectral clustering,” in Proc. Int. Soc. Music Inf. Retrieval (ISMIR), 2007.
[29] M. Wohlmayr, M. Stark, and F. Pernkopf, “A probabilistic interaction model for multipitch tracking with factorial hidden Markov models,” IEEE Trans. Audio, Speech, Lang. Process., vol. 19, no. 4, pp. 799–810, May 2011.
[30] F. Bach and M. Jordan, “Discriminative training of hidden Markov models for multiple pitch tracking,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2005, pp. 489–492.
[31] M. Bay, A. F. Ehmann, J. W. Beauchamp, P. Smaragdis, and J. S. Downie, “Second fiddle is important too: Pitch tracking individual voices in polyphonic music,” in Proc. Int. Soc. Music Inf. Retrieval Conf. (ISMIR), 2012, pp. 319–324.
[32] S. Araki, H. Sawada, R. Mukai, and S. Makino, “Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors,” Signal Process., vol. 87, pp. 1833–1847, 2007.
[33] G. Guo and S. Z. Li, “Content-based audio classification and retrieval by support vector machines,” IEEE Trans. Neural Netw., vol. 14, no. 1, pp. 209–215, Jan. 2003.
[34] S. Li, X.-J. Wang, and Y. Zhang, “X-SPA: Spatial characteristic PSO clustering algorithm with efficient estimation of the number of cluster,” in Proc. 5th Int. Conf. Fuzzy Systems and Knowledge Discovery (FSKD), 2008.
[35] R. F. Abdel-Kader, “Genetically improved PSO algorithm for efficient data clustering,” in Proc. 2nd Int. Conf. Machine Learning and Computing (ICMLC), 2010.
[36] Z. Duan, B. Pardo, and C. Zhang, “Multiple fundamental frequency estimation by modeling spectral peaks and non-peak regions,” IEEE Trans. Audio, Speech, Lang. Process., vol. 18, no. 8, pp. 2121–2133, Nov. 2010.
[37] D. Campbell, K. Palomäki, and G. Brown, “A MATLAB simulation of shoebox room acoustics for use in research and teaching,” Comput. Inf. Syst. J., vol. 9, no. 3, pp. 48–51, Oct. 2005.
[38] R. Bittner, J. Salamon, M. Tierney, M. Mauch, C. Cannam, and J. P. Bello, “MedleyDB: A multitrack dataset for annotation-intensive MIR research,” in Proc. Int. Soc. Music Inf. Retrieval Conf. (ISMIR), 2014.
[39] MIREX Multi-F0 Development Dataset. [Online]. Available: http://www.music-ir.org/mirex/wiki/MIREX_HOME
[40] F. Cummins, M. Grimaldi, T. Leonard, and J. Simko, “The CHAINS corpus: CHAracterizing INdividual Speakers,” School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland.
[41] X. Wang, Z. Huang, and Y. Zhou, “Underdetermined DOA estimation and blind separation of non-disjoint sources in time-frequency domain based on sparse representation method,” J. Syst. Eng. Electron., vol. 25, no. 1, pp. 17–25, 2014.
[42] K. Ichige, Y. Ishikawa, and H. Arai, “High resolution 2-D DOA estimation using second-order partial-differential of MUSIC spectrum,” in Proc. IEEE Int. Symp. Circuits and Systems (ISCAS), 2008, pp. 1152–1155.
[43] A. Jourjine, S. Rickard, and Ö. Yılmaz, “Blind separation of disjoint orthogonal signals: Demixing N sources from 2 mixtures,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), vol. 12, 2000, pp. 2985–2988.
[44] S. Araki, S. Makino, A. Blin, R. Mukai, and H. Sawada, “Underdetermined blind separation for speech in real environments with sparseness and ICA,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), vol. III, 2004, pp. 881–884.
[45] J.-H. Choi and J.-H. Chang, “Dual-microphone voice activity detection technique based on two-step power level difference ratio,” IEEE/ACM Trans. Audio, Speech, Lang. Process., vol. 22, no. 6, pp. 1069–1081, 2014.