Classical Music Cover Song Retrieval System Utilizing AAC Domain Features

Name

Yung-ting Chuang (莊詠婷)

Department

Department of Communication Engineering

Thesis Title

Classical Music Cover Song Retrieval System Utilizing AAC Domain Features
(Original title: 利用AAC壓縮域特徵之古典樂翻奏曲檢索系統)

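This electronic thesis is authorized for immediate open access.
Full text that has reached its open-access date is licensed to users solely for personal, non-profit searching, reading, and printing for academic research purposes.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, so as to avoid violating the law.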

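Abstract (Chinese)

With Internet and multimedia compression technologies now mature and demand for network services growing rapidly, downloading and sharing audio and video over the network has become part of everyday life. Large music databases are commonplace, so quickly retrieving the items a user wants from such a database is an important problem. Most search engines take text as input, but mislabeled or ambiguous tags lead to incorrect retrieval results, and this happens more often with classical music than with pop music.
This thesis targets classical music databases and uses features from the AAC compressed domain. Partially decoding the bitstream to obtain the modified discrete cosine transform (MDCT) coefficients saves about 70% of the decoding complexity. The coefficient energies are preprocessed to improve accuracy, and the coefficients are remapped onto the twelve pitch names of twelve-tone equal temperament. A similarity matrix between pieces is computed with inner products, and the weighted average of the similarity scores along the best accumulated-similarity path gives the final retrieval result. Experiments show that the proposed method achieves an MRR of 0.96 and an accuracy of up to 97%, and saves more than 90% of the matching time compared with conventional retrieval in the raw (waveform) domain.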
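The MRR (mean reciprocal rank) figure quoted above can be read with the standard definition: for each query, take the reciprocal of the rank at which the first correct piece appears, then average over all queries. The sketch below assumes that definition; the function name, the dictionary layout, and the sample IDs are hypothetical, and the exact evaluation protocol of the thesis is not given in this record.

```python
def mean_reciprocal_rank(rankings, relevant):
    """Average of 1/rank of the first correct hit over all queries.

    rankings: dict mapping a query ID to its ranked list of returned IDs.
    relevant: dict mapping a query ID to the set of correct IDs.
    A query with no correct hit in its list contributes 0.
    """
    total = 0.0
    for query, ranked in rankings.items():
        for rank, item in enumerate(ranked, start=1):
            if item in relevant[query]:
                total += 1.0 / rank
                break  # only the first correct hit counts
    return total / len(rankings)


# Hypothetical example: the first query hits at rank 1, the second at rank 2.
rankings = {"q1": ["a", "b"], "q2": ["x", "y"]}
relevant = {"q1": {"a"}, "q2": {"y"}}
score = mean_reciprocal_rank(rankings, relevant)
```

Under this definition, an MRR close to 1 means the correct piece is usually returned in first place.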

Abstract (English)

With the rapid development of the Internet and multimedia compression techniques, people can easily download or share multimedia data over networks. Efficient retrieval from huge multimedia databases has therefore become an important issue. Most search engines rely on textual labels, but labels created by people may be ambiguous or even wrong. This problem occurs more often when retrieving classical music than pop music.
The proposed system focuses on classical music cover song retrieval in the AAC compression domain. The modified discrete cosine transform (MDCT) coefficients are used directly to form a 12-dimensional chroma feature without full decoding, which saves about 70% of the decoding complexity. We truncate low-magnitude MDCT coefficients, adjust the frequency boundary dynamically, and use dot products to compute a chroma similarity matrix. We then find the optimal accumulated-similarity path, compute the weighted arithmetic mean of the similarity along it for each pair of songs, and produce the final ranking.
The experimental results show that the proposed method reaches a precision of 97% and saves over 90% of the matching time compared with the traditional approach in the waveform domain.
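The matching pipeline described in the abstract (MDCT coefficients folded into a 12-dimensional chroma vector, dot-product similarity, and an accumulated path through the similarity matrix) might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the 44.1 kHz sampling rate, 1024 coefficients per frame, the 440 Hz tuning reference, the fixed 10% magnitude floor, the fixed 65 to 2000 Hz band (the thesis adjusts the boundary dynamically), and the plain three-direction dynamic time warping with an unweighted mean (the thesis uses a weighted mean). It is not the author's implementation.

```python
import math

SAMPLE_RATE = 44100.0   # assumed sampling rate in Hz
NUM_BINS = 1024         # assumed MDCT coefficients per frame
A4_HZ = 440.0           # assumed reference tuning


def bin_frequency(k):
    """Centre frequency in Hz of MDCT bin k."""
    return (k + 0.5) * SAMPLE_RATE / (2 * NUM_BINS)


def chroma_from_mdct(coeffs, floor_ratio=0.1, f_lo=65.0, f_hi=2000.0):
    """Fold one frame of MDCT coefficients into a unit-norm chroma vector.

    floor_ratio, f_lo and f_hi are hypothetical stand-ins for the
    magnitude truncation and the frequency boundary.
    """
    peak = max(abs(c) for c in coeffs)
    chroma = [0.0] * 12
    for k, c in enumerate(coeffs):
        freq = bin_frequency(k)
        if not (f_lo <= freq <= f_hi) or abs(c) < floor_ratio * peak:
            continue  # outside the analysed band, or too weak to keep
        midi = 69 + 12 * math.log2(freq / A4_HZ)  # equal-temperament pitch
        chroma[int(round(midi)) % 12] += c * c    # pitch class 0 = C, 9 = A
    norm = math.sqrt(sum(v * v for v in chroma))
    return [v / norm for v in chroma] if norm > 0 else chroma


def similarity_matrix(song_a, song_b):
    """Dot product between every pair of chroma frames of two songs."""
    return [[sum(x * y for x, y in zip(u, v)) for v in song_b] for u in song_a]


def path_score(sim):
    """Mean similarity along the minimum-cost warping path through sim."""
    n, m = len(sim), len(sim[0])
    acc = [[None] * m for _ in range(n)]  # (accumulated cost, path length)
    for i in range(n):
        for j in range(m):
            prev = []
            if i > 0:
                prev.append(acc[i - 1][j])
            if j > 0:
                prev.append(acc[i][j - 1])
            if i > 0 and j > 0:
                prev.append(acc[i - 1][j - 1])
            cost, length = min(prev) if prev else (0.0, 0)
            acc[i][j] = (cost + 1.0 - sim[i][j], length + 1)
    cost, length = acc[-1][-1]
    return 1.0 - cost / length
```

To rank a database for a query, one would compute `path_score(similarity_matrix(query, candidate))` for every candidate and sort in descending order. Using the cost `1 - similarity` keeps the path search a standard minimization, which favors short diagonal paths when the two performances line up.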

Keywords (Chinese)

★ classical music
★ AAC
★ compressed domain
★ music retrieval

Keywords (English)

★ classical music cover song
★ AAC
★ compression domain
★ content-based music retrieval

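Table of Contents

Abstract (Chinese)
Abstract
Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Overview of Music Retrieval and Audio Compression
2.1 Introduction to Music Retrieval
2.1.1 Overview and Progress of Content-Based Music Retrieval Features
2.1.2 Definition and Variability of Cover Songs
2.1.3 Development and Characteristics of Classical Music
2.2 Related Work on Content-Based Music Retrieval
2.2.1 Music Retrieval in the Raw Domain
2.2.2 Music Retrieval in the Compressed Domain
2.3 Overview of Audio Compression
Chapter 3 Proposed Compressed-Domain Classical Music Cover Song Retrieval Method
3.1 Partial Decoding
3.2 Preprocessing
3.2.1 MDCT Coefficient Energy Truncation
3.2.2 Dynamic Frequency Range Truncation
3.3 Audio Feature Extraction
3.3.1 Feature Extraction
3.3.2 Segmentation and Normalization
3.4 Similarity Matching
3.4.1 Inner-Product Similarity Computation
3.4.2 Dynamic Time Warping Accumulation
3.5 Post-Processing
Chapter 4 Experimental Results and Discussion
4.1 Experimental Environment and Computational Complexity Evaluation
4.2 System Performance Evaluation Methods
4.3 Performance Analysis of the Proposed Retrieval System
4.3.1 Analysis of System Parameters
4.3.2 Overall System Performance
Chapter 5 Conclusions and Future Work
References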

Advisor

Pao-chi Chang (張寶基)

Approval Date

2013-7-22
