Feasibility Study of Diagnosis of Parkinson's Diseases Based on Machine Learning and Voice Analysis on Mobile Devices

Name

Zhe-Yuan Lian (連哲源)

Department

Department of Electrical Engineering

Thesis Title

Feasibility Study of Diagnosis of Parkinson's Diseases Based on Machine Learning and Voice Analysis on Mobile Devices

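Related Theses

★ A Study of Sound Signal Separation in Real Environments Using Independent Component Analysis
★ Segmentation and Three-Dimensional Gray-Level Interpolation of Oral Magnetic Resonance Images
★ Design of a Digital Asthma Peak Flow Monitoring System
★ Effects of Combining a Cochlear Implant with a Hearing Aid on Mandarin Speech Recognition
★ Simulation of Mandarin Speech Recognition Performance with the Advanced Combination Encoder Strategy for Cochlear Implants: Analysis of Combined Hearing Aid Use
★ A Functional Magnetic Resonance Imaging Study of the Neural Correlates of Mandarin Speech Production
★ Construction of a Three-Dimensional Tongue Mechanics Model Using the Finite Element Method
★ Construction of a Three-Dimensional Tongue Atlas Based on Magnetic Resonance Imaging
★ A Simulation Study of the Relationship Between Calcium Oxalate Concentration Changes in the Renal Tubule and Calcium Oxalate Stones
★ Automatic Segmentation of Tongue Structures in Oral Magnetic Resonance Images
★ A Study of Electrical Matching of Microwave Output Windows
★ Development of a Software-Based Hearing Aid Simulation Platform: Noise Reduction
★ Development of a Software-Based Hearing Aid Simulation Platform: Feedback Cancellation
★ Simulated Effects of Cochlear Implant Channel Number, Stimulation Rate, and Binaural Hearing on Mandarin Speech Recognition in Noise
★ A Neural Network Study of the Neural Correlates of Mandarin Tone Production
★ Construction of Computer-Simulated Physiological Systems for Teaching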

Files

Full text can be browsed in the system (open after 2025-9-1)

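Abstract (translated from the Chinese)

Recent studies have regarded voice analysis as an objective and effective way to diagnose Parkinson's disease (PD). However, most voice analysis tools depend on specific instruments or computers, which are not easy to carry or move. Mobile devices can effectively solve this portability problem, so we developed an Android voice analysis app and tested five classifiers to find one suitable for diagnosing PD.
The experiments used voice recordings from 74 PD patients and 50 healthy speakers, all of the sustained vowel /a/. We tested the correlation between acoustic parameters and PD, including 19 Multidimensional Voice Program (MDVP) parameters, Normalized Noise Energy (NNE), Cepstral Peak Prominence Smoothed (CPPS), Long-Term Average Spectrum (LTAS), Mel Frequency Cepstral Coefficients (MFCC), and the Tunable Q-Factor Wavelet Transform (TQWT).
Previous studies that used TQWT to diagnose PD involved 432 parameters, and an overly large parameter set easily leads to classifier overfitting, so the TQWT features must be reduced in dimensionality. We first tested the ability of Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Hellinger Linear Discriminant Analysis (HLDA) to reduce the dimensionality of TQWT. HLDA gave the best results and solved the problem that LDA's parameters cannot be adjusted.
For the classifiers, we chose K Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), and Multi-class Hellinger Linear Discriminant Decision Tree (MHLDT).
Five groups of parameters were compared. The parameters were divided into three groups, 1) time-domain measurements, 2) noise measurements, and 3) MFCC, and then supplemented with 4) all parameters and 5) the 10 parameters selected by Hellinger distance (HD) to test the effect of mixing parameters.
The results show that the noise-measurement and MFCC parameters each outperformed the time-domain measurements on different classifiers, which is consistent with the fact that the parameters selected by HD were all noise measurements or MFCC. Combining the characteristics of the selected parameters with the results of previous studies, we found that measuring the breathy voice caused by vocal cord damage can effectively diagnose PD.
In the comparison of classifiers and parameters, SVM with the HD-selected parameters achieved the highest accuracy, 97.5%. Finally, the selected classifier and parameters were built into an Android app that can record voice and diagnose PD.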

Abstract (English)

In recent years, voice analysis has been regarded as an objective and effective means of diagnosing Parkinson's disease (PD). However, most voice analysis tools still rely on specialized equipment or computers, which are inconvenient to carry or move. Mobile devices could effectively solve this portability problem.
In this study, we developed an Android app that performs voice analysis on mobile devices and tested five classifiers to find one suitable for diagnosing PD.
The experiments used voice samples from 74 PD patients and 50 healthy speakers; all samples were sustained vowels /a/. We tested the correlation between PD and various voice parameters, including 19 Multidimensional Voice Program (MDVP) parameters, Normalized Noise Energy (NNE), Cepstral Peak Prominence Smoothed (CPPS), Long-Term Average Spectrum (LTAS), Mel Frequency Cepstral Coefficients (MFCC), and the Tunable Q-Factor Wavelet Transform (TQWT).
Previous studies that used TQWT to diagnose PD involved 432 parameters. A large number of parameters can easily cause a classifier to overfit, so the dimensionality of the TQWT parameters has to be reduced. Two experiments were conducted in this study.
In the first experiment, we compared the dimensionality reduction performance of Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Hellinger Linear Discriminant Analysis (HLDA) on the TQWT parameters. HLDA performed best and resolved the problem that LDA's parameters cannot be adjusted.
Five classifiers were used to determine whether a voice belonged to a PD patient: K Nearest Neighbor (KNN), Multi-Layer Perceptron (MLP), Support Vector Machine (SVM), Gradient Boosting Decision Tree (GBDT), and Multi-class Hellinger Linear Discriminant Decision Tree (MHLDT).
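To make the classification step concrete, the simplest of the five classifiers, KNN, can be sketched in a few lines. This is only an illustrative sketch: the two-dimensional feature vectors, the labels, and the choice of k = 3 are hypothetical and are not taken from the thesis, whose actual features and settings are not stated in this record.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean)."""
    # Pair every training point's distance to x with that point's label.
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, made-up feature vectors (e.g. two normalized acoustic measures).
train_X = [(0.20, 0.10), (0.30, 0.20), (0.25, 0.15),
           (0.80, 0.90), (0.90, 0.80), (0.85, 0.95)]
train_y = ["healthy", "healthy", "healthy", "PD", "PD", "PD"]

print(knn_predict(train_X, train_y, (0.82, 0.88)))
```

The other classifiers named above differ in how they draw the decision boundary, but they consume the same kind of input: one fixed-length parameter vector per voice sample.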
Five groups of parameters were compared. The parameters were first divided into three groups, 1) time-domain measurements, 2) noise measurements, and 3) MFCC, to test the performance of different characteristics. In addition, 4) all parameters combined and 5) the 10 parameters selected by Hellinger distance (HD) were used to test the performance of mixed parameters.
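Selection by Hellinger distance can be illustrated as follows. The sketch assumes that each parameter is scored by the Hellinger distance between its histogram in the PD group and its histogram in the healthy group, and that the highest-scoring parameters are kept. The binning scheme, the function names, and the toy values are assumptions made for illustration; the record does not say how the thesis estimates the distance.

```python
import math


def histogram(values, lo, hi, bins):
    """Normalized histogram of values over [lo, hi] with equal-width bins."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = same, 1 = disjoint)."""
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s) / math.sqrt(2)


def feature_hd(pd_vals, hc_vals, bins=10):
    """Score one parameter by how differently it is distributed in the two groups."""
    lo = min(min(pd_vals), min(hc_vals))
    hi = max(max(pd_vals), max(hc_vals))
    if hi == lo:  # constant parameter carries no information
        return 0.0
    return hellinger(histogram(pd_vals, lo, hi, bins),
                     histogram(hc_vals, lo, hi, bins))


def rank_features(pd_rows, hc_rows, names, top_k):
    """Return the top_k parameter names ordered by decreasing Hellinger distance."""
    scores = {
        name: feature_hd([r[j] for r in pd_rows], [r[j] for r in hc_rows])
        for j, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical toy data: column 0 separates the groups, column 1 does not.
names = ["noise_like", "time_like"]
pd_rows = [(0.80, 0.5), (0.90, 0.4), (0.85, 0.6), (0.95, 0.5)]
hc_rows = [(0.10, 0.5), (0.20, 0.6), (0.15, 0.4), (0.05, 0.5)]

print(rank_features(pd_rows, hc_rows, names, top_k=1))
```

A parameter whose two group histograms do not overlap scores close to 1, while a parameter distributed identically in both groups scores 0, so ranking by this score favors parameters that separate PD from healthy voices.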
The results showed that the noise-measurement and MFCC parameters each outperformed the time-domain measurements on different classifiers. This is consistent with the fact that the parameters selected by HD were all noise measurements or MFCC.
Combining the characteristics of the selected parameters with the results of previous studies, we found that measuring the breathy voice caused by abnormal vocal cords can effectively diagnose PD.
In the comparison of parameters and classifiers, SVM with the 10 parameters selected by HD achieved the highest accuracy, 97.5%.
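For readers unfamiliar with how an accuracy figure of this kind is obtained, the sketch below computes accuracy from predicted and true labels, together with sensitivity and specificity, which are commonly reported alongside it. The labels are hypothetical toy values, and the sketch makes no claim about the evaluation protocol actually used in the thesis.

```python
def binary_metrics(y_true, y_pred, positive="PD"):
    """Return accuracy, sensitivity, and specificity for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        # Share of all samples that were labeled correctly.
        "accuracy": (tp + tn) / len(pairs),
        # Share of true PD samples that were detected.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Share of healthy samples that were correctly cleared.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Hypothetical toy labels: 4 PD and 4 healthy control (HC) samples.
y_true = ["PD", "PD", "PD", "PD", "HC", "HC", "HC", "HC"]
y_pred = ["PD", "PD", "PD", "HC", "HC", "HC", "HC", "PD"]

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With unbalanced groups such as 74 patients and 50 healthy speakers, sensitivity and specificity show whether a high accuracy is shared by both classes.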
Finally, the selected classifier and parameters were implemented in an Android app that can record voice and diagnose PD.

Keywords (Chinese)

★ Parkinson's disease (帕金森氏症)
★ Machine learning (機器學習)
★ Mobile devices (行動裝置)
★ Voice analysis (語音分析)

Keywords (English)

★ Parkinson's Disease
★ Machine Learning
★ Mobile Devices
★ Speech Analysis

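Table of Contents

Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Research Motivation
1.2 Literature Review
1.3 Research Objectives
1.4 Thesis Organization
Chapter 2: Voice Parameters
2.1 Multidimensional Voice Program (MDVP)
2.1.1 Fundamental Frequency Information Measurements
2.1.2 Long-Term and Short-Term Frequency Perturbation Measurements
2.1.3 Long-Term and Short-Term Amplitude Perturbation Measurements
2.1.4 Voice Break Measurements
2.1.5 Subharmonic Measurements
2.1.6 Voice Irregularity Measurements
2.1.7 Noise Measurements
2.1.8 Tremor Measurements
2.2 Normalized Noise Energy (NNE)
2.3 Cepstral Peak Prominence Smoothed (CPPS)
2.4 Long-Term Average Spectrum Slope (LTAS)
2.5 Mel Frequency Cepstral Coefficients (MFCC)
2.6 Tunable Q-Factor Wavelet Transform (TQWT)
2.6.1 Tunable Q-Factor Wavelet Transform (TQWT)
2.6.2 Dimensionality Reduction
2.6.2.1 Linear Discriminant Analysis (LDA)
2.6.2.2 Principal Component Analysis (PCA)
2.6.2.3 Hellinger Linear Discriminant Analysis (HLDA)
Chapter 3: Machine Learning
3.1 Multi-Layer Perceptron (MLP)
3.2 K Nearest Neighbor (KNN)
3.3 Support Vector Machine (SVM)
3.4 Gradient Boosting Decision Tree (GBDT)
3.5 Multi-class Hellinger Linear Discriminant Decision Tree (MHLDT)
Chapter 4: Experimental Methods
4.1 Databases Used in the Experiments
4.2 Parameter Grouping in the Experiments
4.3 Experiment Overview
4.3.1 Experiment 1: Dimensionality Reduction Test
4.3.2 Experiment 2: Comparison of Parameters and Classifiers
4.4 Evaluation Metrics
Chapter 5: Results and Discussion
5.1 Experimental Results
5.1.1 Experiment 1
5.1.2 Experiment 2
5.2 Introduction to the Mobile App
5.3 Discussion
Chapter 6: Conclusions and Future Work
6.1 Conclusions
6.2 Future Work
References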

Advisor

吳炤民

Approval Date

2022-8-18
