AMR to G.729A Speech Transcoding with Fast Codebook Search

Author

黃麗芳 (Li-Fang Huang)

Department

Department of Communication Engineering

Thesis Title

利用快速碼簿搜尋之AMR至G.729A語音轉碼
(AMR to G.729A speech transcoding with fast codebook search)

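This electronic thesis is authorized for immediate open access.
The open-access electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.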

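Abstract (Chinese)

With the growth of networking, the Internet carries more than data: people can also use mobile communication systems to connect to IP telephony over the Internet. Because mobile communication systems and VoIP use different speech coding techniques, speech transcoding is an indispensable mechanism in networked voice systems. The technique can also be applied to entertainment uses such as online games and voice chat rooms.
Conventionally, the speech transcoding method with the best quality is full decoding, which must decompress and then recompress the speech and therefore suffers from high computational complexity and long delay. This thesis uses a fast codebook search based on pulse replacement to propose a partial decoding speech transcoding method. Exploiting the characteristics of the speech signal, it works frame by frame, analyzes the speech parameters needed to represent each frame, and achieves transcoding by converting those parameters. The resulting set of target parameters also conforms to the compression format of the original coding method. The method applies to the AMR and G.729A speech coding standards and effectively reduces computational complexity: the clockticks required per frame are about 7.2% of those of full decoding, while the speech quality is close to that of full decoding.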

Abstract (English)

With the development of Internet technology, we can not only transmit data but also connect 3GPP mobile systems with VoIP over the Internet. Because the coding schemes of 3GPP differ from those of VoIP, a speech transcoding scheme is needed in voice systems over the Internet. Speech transcoding makes connections between users of the two systems possible, and it can also be used in entertainment applications such as audio chat rooms and online games.
Full decoding is the intuitive, traditional speech transcoding method, but it requires high computational complexity and long processing time. In this work, we propose a partial decoding technique on the ACELP coding architecture with a fast codebook search that uses the pulse replacement method. There is no need to redo the complete decoding and encoding processes. The partial decoding method can be applied directly to ACELP-based speech codecs such as the AMR and G.729A standards. It achieves voice quality close to that of the full decoding method while requiring only 7.2% of its computational load, measured in clockticks per frame.
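The abstract names a pulse-replacement fixed-codebook search but does not spell it out. The following is a minimal sketch of the general idea under stated assumptions, not the thesis's procedure. It uses the G.729A track layout (four pulses, one per interleaved track, in a 40-sample subframe) and the usual ACELP criterion C²/E. The function names, the seeding rule (strongest correlation in each track), and the stopping rule are illustrative assumptions. In a transcoder the seed could instead come from the source codec's pulse positions, but the abstract does not say how the search is initialized.

```python
SUBFRAME = 40  # samples per subframe

# G.729A fixed-codebook tracks: one signed pulse per track.
TRACKS = [
    list(range(0, SUBFRAME, 5)),
    list(range(1, SUBFRAME, 5)),
    list(range(2, SUBFRAME, 5)),
    sorted(list(range(3, SUBFRAME, 5)) + list(range(4, SUBFRAME, 5))),
]


def correlations(x, h):
    """Return d (target/impulse-response correlation) and phi (autocorrelation matrix of h)."""
    d = [sum(x[k] * h[k - n] for k in range(n, SUBFRAME)) for n in range(SUBFRAME)]
    phi = [
        [sum(h[k - i] * h[k - j] for k in range(max(i, j), SUBFRAME))
         for j in range(SUBFRAME)]
        for i in range(SUBFRAME)
    ]
    return d, phi


def criterion(pos, d, phi):
    """C^2 / E for pulses at `pos`, each pulse taking the sign of d at its position."""
    sign = [1.0 if d[n] >= 0 else -1.0 for n in pos]
    c = sum(abs(d[n]) for n in pos)
    e = sum(si * sj * phi[m][n]
            for si, m in zip(sign, pos)
            for sj, n in zip(sign, pos))
    return c * c / e


def pulse_replacement_search(d, phi, max_iters=8):
    """Seed one pulse per track, then repeatedly apply the best single-pulse replacement."""
    # Assumed seeding rule: the position with the largest |d| in each track.
    pos = [max(track, key=lambda n: abs(d[n])) for track in TRACKS]
    best = criterion(pos, d, phi)
    for _ in range(max_iters):
        move = None
        for t, track in enumerate(TRACKS):
            for n in track:
                if n == pos[t]:
                    continue
                trial = pos[:t] + [n] + pos[t + 1:]
                q = criterion(trial, d, phi)
                if q > best:
                    best, move = q, (t, n)
        if move is None:  # no single replacement improves the criterion
            break
        pos[move[0]] = move[1]
    signs = [1 if d[n] >= 0 else -1 for n in pos]
    return pos, signs, best
```

Each pass evaluates at most 36 candidate codevectors (7 + 7 + 7 + 15 alternative positions), compared with 8192 position combinations for an exhaustive search, so a few passes stay well below the exhaustive cost. The result is a local optimum and is not guaranteed to match the exhaustive search.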

Keywords (Chinese)

★ G.729A
★ AMR
★ Speech transcoding

Keywords (English)

★ AMR
★ G.729A
★ Speech transcoding

Table of Contents

Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Overview of Speech Transcoding
1.2 Motivation and Objectives
1.3 Thesis Organization
Chapter 2 Speech Coding Techniques
2.1 Characteristics of Speech
2.2 Linear Prediction Coding
2.3 Pitch Prediction
2.4 CELP Coding Theory
2.5 Characteristics of Speech Coding Systems
2.5.1 Bit Rate
2.5.2 Delay
2.5.3 Computational Complexity
2.5.4 Speech Quality
2.6 Quality Assessment Methods for Speech Transcoding
2.6.1 Objective Quality Assessment
2.6.2 Subjective Quality Assessment
Chapter 3 Differences Between AMR and G.729A
3.1 Linear Prediction Analysis and Quantization
3.2 Perceptual Weighting Filter
3.3 Open-Loop Pitch Analysis
3.4 Adaptive-Codebook Search
3.5 Fixed-Codebook Structure and Search
3.6 Gain Quantization
Chapter 4 Speech Transcoding Methods
4.1 Full Decoding
4.2 Partial Decoding
4.2.1 LSP Coefficients
4.2.2 Pitch Coefficients
4.2.3 Fixed-Codebook Search
4.2.3.1 Focused Search
4.2.3.2 Depth-First Tree Search
4.2.3.3 Depth-First Tree Search
4.2.4 Gain Coefficients
Chapter 5 Simulation Experiments and Quality Evaluation
5.1 Simulation Environment and Speech Data
5.2 Objective Speech Quality Evaluation
5.2.1 Transcoding Quality
5.2.2 Discussion
5.3 Computational Complexity Analysis
Chapter 6 Conclusions and Future Work
References

Advisor

張寶基 (Pao-Chi Chang)

Review Date

2007-7-23
