Name: Li-Fang Huang (黃麗芳)
Department: Department of Communication Engineering
Thesis title: AMR to G.729A speech transcoding with fast codebook search (利用快速碼簿搜尋之AMR至G.729A語音轉碼)
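- This electronic thesis is authorized for immediate open access.
- Once open access takes effect, the electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
- Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid breaking the law.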
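Abstract (Chinese, translated) With the growth of networking, the Internet carries more than data: users of mobile communication systems can also connect to IP telephones over it. Because mobile communication systems and VoIP do not use the same speech coding techniques, speech transcoding is an indispensable mechanism in networked voice systems. The technique can also be applied to entertainment uses such as online games and voice chat rooms.
Traditionally, the best speech transcoding method is full decoding, which must decompress and then recompress the speech and therefore suffers from high computational complexity and long delay. This thesis instead proposes a partial decoding transcoding method that uses a fast codebook search based on pulse replacement. Exploiting the characteristics of speech signals, it works frame by frame, analyzes the speech parameters needed to represent each speech segment, and achieves transcoding by converting those parameters. The resulting set of target parameters also conforms to the compressed format of the original compression standard. The method can be applied to the AMR and G.729A speech coding standards and effectively reduces computational complexity: the clockticks required per frame are about 7.2% of those of full decoding, while the speech quality is close to that of full decoding.
Abstract (English) With the development of Internet technology, we can not only transmit data but also connect 3GPP systems with VoIP over the Internet. Because the coding schemes of 3GPP are not the same as those of VoIP, a speech transcoding scheme is needed in voice systems over the Internet. Speech transcoding makes connections between users possible, and it can also be used in entertainment applications such as audio chat rooms and online games.
Full decoding is an intuitive and traditional speech transcoding method, but it requires high computational complexity and long processing time. In this work, we propose a partial decoding technique with fast codebook search, based on the pulse replacement method, for the ACELP coding architecture. There is no need to redo all of the decoding and encoding processes. The partial decoding method can be applied directly to ACELP-based speech coders such as the AMR and G.729A standards. It achieves voice quality close to that of the full decoding method while requiring only 7.2% of its computational load, measured in clockticks per frame.
Keywords (Chinese) ★ G.729A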
★ AMR
★ Speech transcoding
Keywords (English) ★ AMR
★ G.729A
★ Speech transcoding
Table of contents
Chinese abstract
Abstract
Contents
List of figures
List of tables
Chapter 1 Introduction
1.1 Introduction to speech transcoding
1.2 Motivation and objectives
1.3 Thesis organization
Chapter 2 Speech coding techniques
2.1 Characteristics of speech
2.2 Linear prediction coding
2.3 Pitch prediction
2.4 CELP coding theory
2.5 Characteristics of speech coding systems
2.5.1 Bit rate
2.5.2 Delay
2.5.3 Computational complexity
2.5.4 Speech quality
2.6 Quality assessment methods for speech transcoding
2.6.1 Objective quality assessment
2.6.2 Subjective quality assessment
Chapter 3 Differences between AMR and G.729A
3.1 Linear prediction analysis and quantization
3.2 Perceptual weighting filter
3.3 Open-loop pitch analysis
3.4 Adaptive-codebook search
3.5 Structure and search of the fixed codebook
3.6 Gain quantization
Chapter 4 Discussion of speech transcoding methods
4.1 Full decoding method
4.2 Partial decoding method
4.2.1 LSP coefficients
4.2.2 Pitch coefficients
4.2.3 Fixed codebook search
4.2.3.1 Focused search
4.2.3.2 Depth-first tree search
4.2.3.3 Depth-first tree search
4.2.4 Gain coefficients
Chapter 5 Simulation experiments and quality evaluation
5.1 Simulation environment and speech data
5.2 Objective speech quality evaluation
5.2.1 Transcoding speech quality
5.2.2 Discussion
5.3 Computational complexity analysis
Chapter 6 Conclusions and future work
References
Advisor: Pao-Chi Chang (張寶基)
Approval date: 2007-7-23
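The abstract describes converting parameters frame by frame and reports a per-frame cost of about 7.2% of full decoding. The short Python sketch below makes those two points concrete. It is not the thesis's algorithm. The frame and subframe durations are assumptions taken from the published AMR and G.729A specifications rather than from this record, and the function names `subframe_map` and `speedup` are hypothetical.

```python
# Illustrative sketch only; not the transcoder described in the thesis.
# Assumed timing (from the public codec specifications, not from this record):
AMR_FRAME_MS = 20      # one AMR frame = four 5 ms subframes
G729A_FRAME_MS = 10    # one G.729A frame = two 5 ms subframes
SUBFRAME_MS = 5


def subframe_map(source_frame_ms=AMR_FRAME_MS,
                 target_frame_ms=G729A_FRAME_MS,
                 subframe_ms=SUBFRAME_MS):
    """Map each source subframe to (target frame index, target subframe index)."""
    per_target = target_frame_ms // subframe_ms
    n_subframes = source_frame_ms // subframe_ms
    return [(i // per_target, i % per_target) for i in range(n_subframes)]


def speedup(relative_cost):
    """Speed-up implied by a cost given as a fraction of the baseline cost."""
    return 1.0 / relative_cost


if __name__ == "__main__":
    # One AMR frame covers two G.729A frames, subframe by subframe.
    print(subframe_map())             # [(0, 0), (0, 1), (1, 0), (1, 1)]
    # 7.2% of the full-decoding cost corresponds to roughly a 13.9x reduction.
    print(round(speedup(0.072), 1))   # 13.9
```

Under these assumptions the subframes of the two codecs line up one to one, which is what makes per-subframe parameter conversion (pitch, codebook, gain) plausible without a full decode and re-encode.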