Temporal and Spectral Analysis of Children Song Perception with Different Simulated Cochlear Implant Coding Strategies

Name

Epri Wahyu Pratiwi (普蒂薇)

Department

Department of Electrical Engineering

Thesis title

Temporal and Spectral Analysis of Children Song Perception with Different Simulated Cochlear Implant Coding Strategies
(不同人工電子耳編碼策略之兒歌感知時頻分析)

Full text: browsable in the system (open after 2024-9-1)

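Abstract (translated from Chinese)

Music carries a range of spectral and temporal cues, and these cues play an important role in music perception. Cochlear implant (CI) coding strategies are mainly intended for transmitting speech, and they still distort audio signals. This study examined the relative contributions of pitch and rhythm to melody recognition and evaluated how three CI coding strategies affect music quality: Advanced Combination Encoder (ACE), Fundamental Frequency Modulation (F0mod), and Envelope Enhancement (EE).

The melody database contains 30 children's songs popular in Taiwan. Each song has two music cues, a temporal cue (pitch) and a spectral cue (rhythm), and every melody was processed by the National Central University CI simulator with the three CI coding strategies. Melody perception was then measured in a subjective listening test, the Familiar Melody Identification (FMI) test, using the processed stimuli, and accuracy and response time were collected. Five normal-hearing participants took the FMI test. At the start of the test, 15 of the 30 songs were selected; each participant was tested on these 15 songs, all processed with the different CI coding strategies and carrying different music features (pitch and rhythm). Each participant heard 90 stimuli (15 songs × 2 music features × 3 CI coding strategies), and the participants selected 23 children's song melodies in total. The FMI results show that performance was significantly better (p < 0.05) when rhythm cues were retained, with higher accuracy and shorter response times. Melodies with rhythm cues processed by the ACE strategy scored best, at 86.80%.

Using these 23 melodies, three objective measures, the Envelope Difference Index (EDI), the intensity mismatch pattern, and the Log Spectral Distance (LSD), were used to compare the original music signals with the processed signals in terms of temporal and spectral features. The average intensity mismatch pattern between the original signal and the signals processed by ACE, F0mod, and EE was 5.9, 6.4, and 6.0, respectively; the lower the mismatch, the better the amplitude pattern of the melody is preserved. The EDI between the original and processed signals was 0.11, 0.11, and 0.15 for ACE, F0mod, and EE, respectively; the lower the EDI, the better the temporal envelope is preserved. The LSD between the original and processed signals was 2.10, 2.16, and 2.19 for the three strategies, respectively; the lower the LSD, the better the spectral quality.

Combining the subjective and objective analyses, the ACE strategy preserved spectral and temporal quality best, and ACE combined with rhythm cues gave the highest melody recognition accuracy.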

Abstract (English)

Acoustic music features include various spectral and temporal cues, which play a critical role in music perception. Cochlear implant (CI) coding strategies are designed primarily to convey speech, so music is still distorted when processed by them. This study examined the relative contributions of pitch and rhythm to melody recognition, as well as the music quality delivered by three CI coding strategies: Advanced Combination Encoder (ACE), Fundamental Frequency Modulation (F0mod), and Envelope Enhancement (EE).

The melody database consisted of 30 children's songs popular in Taiwan. Each melody was prepared with two music features, temporal (pitch) and spectral (rhythm), and processed with the three CI coding strategies using NCU-CI, a cochlear implant simulation software. A pilot subjective listening test then measured melody perception of the processed stimuli with the familiar melody identification (FMI) test, recording percent correct and response time. Five normal-hearing (NH) participants took the FMI test. Each participant first selected 15 of the 30 songs and was then tested on those 15 songs, presented with each music feature (pitch and rhythm) and each of the three CI strategies, for 90 stimuli per participant (15 songs × 2 music features × 3 CI coding strategies). In total, the participants chose 23 of the children's song melodies. When rhythm cues were preserved, FMI performance was significantly better (p < 0.05) than with pitch cues, with a higher percent correct and faster response times. Melodies with rhythm cues processed by the ACE strategy achieved the best score, 86.80%.
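
The stimulus count and the percent-correct score described above can be illustrated with a minimal sketch. The song labels and the `percent_correct` helper are hypothetical names introduced only for this example; the thesis's actual file names and scoring procedure are not given in this record.

```python
from itertools import product

# Hypothetical labels: the record does not list the actual song titles.
songs = [f"song_{i:02d}" for i in range(1, 16)]  # 15 songs chosen by one participant
features = ["pitch", "rhythm"]                    # 2 music features
strategies = ["ACE", "F0mod", "EE"]               # 3 CI coding strategies

# Full factorial design: every song x feature x strategy combination.
stimuli = list(product(songs, features, strategies))


def percent_correct(responses):
    """Return the percentage of trials whose answer matches the target song.

    `responses` is a list of (target_song, answered_song) pairs.
    """
    if not responses:
        raise ValueError("no responses to score")
    hits = sum(1 for target, answer in responses if target == answer)
    return 100.0 * hits / len(responses)


print(len(stimuli))  # 90
```

Grouping the scored trials by feature and strategy before calling `percent_correct` would give one score per condition, which is the kind of figure the 86.80% result reports.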

The 23 chosen melodies were then examined with objective analysis. The envelope difference index (EDI), the intensity mismatch pattern, and the log spectral distance (LSD) were used to assess the temporal and spectral quality of the processed music relative to the original. The average intensity mismatch pattern between the original and the signals processed by the ACE, F0mod, and EE strategies was 5.9, 6.4, and 6.0, respectively; the lower the mismatch, the better the amplitude pattern of the melody was preserved. The EDI between the original and the signals processed by the ACE, F0mod, and EE strategies was 0.11, 0.11, and 0.15, respectively; the lower the EDI, the better the temporal envelope was preserved. The LSD between the original and the signals processed by the ACE, F0mod, and EE strategies was 2.10, 2.16, and 2.19, respectively; the lower the LSD, the better the spectral quality.
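
To make two of these objective measures concrete, here is a minimal sketch of an EDI and an LSD computation. It assumes common textbook-style definitions (mean-normalized envelopes for EDI, a root-mean-square difference of dB power spectra for LSD); the thesis may use different envelope extraction, framing, or averaging, and the function names are illustrative only. The intensity mismatch pattern is not sketched because its definition is not stated in this record.

```python
import math


def edi(env_a, env_b):
    """Envelope difference index of two equal-length amplitude envelopes.

    Assumed definition: scale each envelope to a mean of 1, then
    EDI = sum(|a - b|) / (2 * N). 0 means identical envelope shapes;
    values near 1 mean very different shapes.
    """
    if len(env_a) != len(env_b) or not env_a:
        raise ValueError("envelopes must be non-empty and equal in length")
    mean_a = sum(env_a) / len(env_a)
    mean_b = sum(env_b) / len(env_b)
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("envelopes must have a positive mean")
    diff = sum(abs(a / mean_a - b / mean_b) for a, b in zip(env_a, env_b))
    return diff / (2 * len(env_a))


def lsd(power_a, power_b, floor=1e-12):
    """Log spectral distance (dB) between two equal-length power spectra.

    Assumed definition: RMS over frequency bins of the dB difference.
    `floor` guards against taking the log of zero.
    """
    if len(power_a) != len(power_b) or not power_a:
        raise ValueError("spectra must be non-empty and equal in length")
    squared = [
        (10.0 * math.log10(max(p, floor) / max(q, floor))) ** 2
        for p, q in zip(power_a, power_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


# A uniformly louder copy has the same envelope shape, so its EDI is 0.
print(edi([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

Under these definitions both measures are distances from the original signal, so smaller values indicate that the processed signal stays closer to it.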

Taken together, the subjective and objective analyses indicate that the ACE strategy preserved spectral and temporal quality better than the other two CI coding strategies in this study. Rhythm cues combined with the ACE strategy also gave the highest melody recognition accuracy.

Keywords

★ familiar melody identification
★ temporal quality
★ spectral quality
★ music
★ rhythm
★ pitch
★ cochlear implant
★ cochlear implant simulation
★ vocoder

Table of Contents

Abstract (Chinese)
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1
1.1. Background and Motivation
1.2. Literature Review
1.3. Objectives
1.4. Thesis Outline
CHAPTER 2
2.1. Cochlear Implant
2.2. CI Coding Strategy
2.2.1. Advanced Combination Encoder (ACE) Strategy
2.2.2. Fundamental Frequency Modulation (F0mod)
2.2.3. Envelope Enhancement (EE) Strategy
2.3. CI Simulation
2.4. Music Perception in CI
2.4.1. Temporal Features of Music
2.4.2. Spectral Features in Music
2.5. Summary
CHAPTER 3
3.1. Melody of Children Song Database
3.2. Voice Coder (Vocoder) Processing
3.3. Experimental Subjective Test Design
3.3.1. Stimuli
3.3.2. Participants
3.3.3. Testing Environment
3.3.4. Familiar Melody Identification (FMI) Test
3.3.5. Response Time Measure
3.3.6. Data Analysis of Subjective Test
3.4. Experimental Objective Test Design
3.4.1. Melody Index (MI) and Rhythm Index (RI)
3.4.2. Envelope Difference Index (EDI)
3.4.3. Log Spectral Distance (LSD)
3.4.4. Intensity Pattern Mismatch
3.5. Correlation Analysis
CHAPTER 4
4.1. Results
4.1.1. Subjective Evaluation
4.1.2. Objective Evaluation
4.2. Discussion
CHAPTER 5
5.1. Summary
5.2. Future Works
REFERENCES
APPENDIX

Advisor

Chao-Min Wu (吳炤民)

Approval date

2021-8-30
