Content-Based Music Retrieval Based on Multiple Temporal Descriptors


Name

Chi-ting Day (戴齊廷)

Department

Department of Communication Engineering

Thesis Title

Content-Based Music Retrieval Based on Multiple Temporal Descriptors
(Temporal Multi-Descriptors)



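The author has agreed to make this electronic thesis openly available immediately.
Electronic full texts that have reached their open-access date are licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China; do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid violating the law.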

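Abstract (Chinese)

With the rapid development of multimedia compression technology, mobile devices, and mobile networks, sharing and downloading multimedia audio and video through streaming platforms or social networking sites has become part of daily life. For a song that one happens to hear and finds interesting, Content Based Music Retrieval (CBMR) can use the content of the song itself, such as melody and timbre, as the basis for retrieval, which avoids cases where the user cannot describe the song with keywords or where the song is labeled incorrectly.
To address the long matching time required by a large retrieval database, this study proposes using a Sparse Auto Encoder (SAE) to convert the Chroma features of short audio clips into descriptors with higher information content. Learning identifies the relatively critical features, which improves retrieval performance, and reducing the number of features to be compared shortens matching time. Experimental results show that the proposed method not only saves more than 50% of the matching time but also substantially raises the MRR, indicating that longer-duration features describe the retrieval-relevant information of a song better.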
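The descriptor idea in the abstract can be illustrated with a minimal sketch. It assumes 12-dimensional chroma frames, non-overlapping clips, a single sigmoid encoder layer, and the KL-divergence sparsity penalty commonly used when training sparse autoencoders. The clip length, hidden size, weights, and function names below are hypothetical and are not taken from the thesis; the weights are random stand-ins for a trained encoder.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_clips(chroma_frames, frames_per_clip):
    """Concatenate consecutive chroma frames into non-overlapping clip vectors."""
    clips = []
    last_start = len(chroma_frames) - frames_per_clip
    for start in range(0, last_start + 1, frames_per_clip):
        clip = []
        for frame in chroma_frames[start:start + frames_per_clip]:
            clip.extend(frame)
        clips.append(clip)
    return clips


def encode(clip, weights, biases):
    """Encoder half of an autoencoder: descriptor = sigmoid(W x + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, clip)) + b)
        for row, b in zip(weights, biases)
    ]


def kl_sparsity_penalty(descriptors, rho=0.05):
    """Sum over hidden units of KL(rho || mean activation of that unit)."""
    penalty = 0.0
    for j in range(len(descriptors[0])):
        rho_hat = sum(d[j] for d in descriptors) / len(descriptors)
        penalty += rho * math.log(rho / rho_hat)
        penalty += (1 - rho) * math.log((1 - rho) / (1 - rho_hat))
    return penalty


# Toy input: 40 random chroma frames (assumed 12 pitch classes per frame).
rng = random.Random(0)
chroma = [[rng.random() for _ in range(12)] for _ in range(40)]

# Hypothetical sizes: 10 frames per clip (120 inputs), 8 hidden units.
clips = make_clips(chroma, frames_per_clip=10)
W = [[rng.uniform(-0.1, 0.1) for _ in range(120)] for _ in range(8)]
b = [0.0] * 8

# 40 frames x 12 values become 4 descriptors x 8 values for matching.
descriptors = [encode(c, W, b) for c in clips]
```

In this toy setting the matcher would compare 4 descriptors of 8 values each instead of 40 frames of 12 values each, which is the kind of reduction in feature count the abstract credits for the shorter matching time.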

Abstract (English)

Nowadays, sharing or downloading multimedia resources from the internet has become part of daily life. However, it is hard to find a particular piece of music among such a tremendous amount of data when only limited information about it is available. Content Based Music Retrieval (CBMR) can directly retrieve the desired music by using features extracted from the audio content as the search keys.
To cope with massive retrieval databases, we feed Chroma clips to a Sparse Auto Encoder (SAE), which transforms the features into descriptors before matching. This reduces the number of features to be compared and learns which parts of the input data are more important. The experimental results show that our method reduces matching time by more than 50% and achieves a higher MRR than the traditional approach.
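For readers unfamiliar with the MRR figure quoted above, the following is a minimal sketch of mean reciprocal rank in its common form, assuming each query has exactly one relevant item. The thesis may define the metric differently (for example, for queries with several cover versions); the function name and toy data are illustrative only.

```python
def mean_reciprocal_rank(ranked_lists, relevant_ids):
    """Average of 1/rank of the relevant item over all queries.

    A query whose relevant item is absent from its ranking contributes 0.
    """
    total = 0.0
    for ranking, relevant in zip(ranked_lists, relevant_ids):
        for position, candidate in enumerate(ranking, start=1):
            if candidate == relevant:
                total += 1.0 / position
                break
    return total / len(ranked_lists)


# Toy example: the relevant song "a" is ranked 1st, 2nd, and 3rd.
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
mrr = mean_reciprocal_rank(rankings, ["a", "a", "a"])
# (1 + 1/2 + 1/3) / 3 = 11/18
```

A higher MRR means the correct song tends to appear nearer the top of the returned list, so the metric rewards ranking quality rather than only whether the song was found.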

Keywords (Chinese)

★ Music retrieval (音樂檢索)
★ Cover song (翻唱歌曲)
★ Neural network (類神經網路)
★ Deep learning (深度學習)

Keywords (English)

★ Music Retrieval
★ Cover Song
★ Neural Network
★ Deep Learning

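Table of Contents

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Overview of Content-Based Music Retrieval
2.1 Introduction to Music Retrieval and Its Characteristic Elements
2.1.1 Musical Elements
2.1.2 Cover Song Identification and Related Work
2.2 Retrieval Features
2.3 Matching Methods
2.3.1 Optimal Transposition Index
2.3.2 Dynamic Time Warping
Chapter 3 Neural Networks
3.1 Neural Networks
3.2 Deep Neural Networks
3.2.1 Convolutional Neural Networks
3.2.2 Sparse Autoencoder
Chapter 4 Analysis of Multiple Temporal Descriptors
4.1 System Architecture
4.2 Experimental Data and Parameter Analysis
4.2.1 Feature Transformation
4.2.2 Retrieval System Performance
4.2.3 Multi-Temporal Retrieval System
Chapter 5 Conclusions and Future Work
References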


Advisor

Pao-chi Chang (張寶基)

Approval Date

2014-7-31
