Thesis/Dissertation 107521602: Detailed Record




Name Mirna Danisa Tandjung (戴妮莎)   Department Electrical Engineering
Thesis title Voice Analysis of Patient with Parkinson’s Disease Based on Machine Learning Approach
(基於機器學習分析帕金森氏症患者之語音)
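Related theses
★ A Study of Sound Signal Separation in Real Environments Using Independent Component Analysis
★ Segmentation and 3D Gray-Level Interpolation of Oral Magnetic Resonance Images
★ Design of a Digital Asthma Peak Flow Monitoring System
★ Effect of Combining a Cochlear Implant with a Hearing Aid on Mandarin Speech Recognition
★ Simulation of Mandarin Speech Recognition Performance with an Advanced Combined Encoding Strategy for Cochlear Implants: Analysis with a Hearing Aid
★ A Functional MRI Study of the Neural Correlates of Mandarin Speech Production
★ Building a 3D Biomechanical Tongue Model with the Finite Element Method
★ Construction of an MRI-Based 3D Tongue Atlas
★ A Simulation Study of the Relationship between Calcium Oxalate Concentration Changes in the Renal Tubules and Calcium Oxalate Stones
★ Automatic Segmentation of Tongue Structures in Oral MR Images
★ A Study of Electrical Matching of Microwave Output Windows
★ Development of a Software-Based Hearing Aid Simulation Platform: Noise Cancellation
★ Development of a Software-Based Hearing Aid Simulation Platform: Feedback Cancellation
★ Simulated Effects of Cochlear Implant Channel Number, Stimulation Rate, and Binaural Hearing on Mandarin Speech Recognition in Noise
★ A Neural Network Study of the Neural Correlates of Mandarin Tone Production
★ Construction of Computer-Simulated Physiological Systems for Teaching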
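  1. The author has authorized this electronic thesis for immediate open access.
  2. The open-access electronic full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
  3. Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.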

Abstract Voice analysis of patients with Parkinson’s disease (PD) can be an important tool for early detection, differential diagnosis, and monitoring of disease progression. Patients with PD develop distinctive motor speech disorders. The Kay Multidimensional Voice Program™ (MDVP) is an automatic voice analysis system. It rapidly calculates up to 33 measures of vocal function and displays them on a graph that incorporates normative values, so that potentially important clinical differences can be identified. Discrepancies in MDVP analysis arise from two major sources: errors in the computations performed by the program, and errors made by the user when attempting to isolate a given portion of the signal. This study collected 55 healthy voice samples, 55 PD control samples, and 145 samples from PD patients. Each voice sample is a 3-second sustained vowel, /a/ or /u/. Among machine learning classification techniques, we used an Optimized Gradient Boosted model to verify the performance of the classification mechanism. The best performance, obtained with 5 parameters (APQ, PPQ, Jitta, ShdB, Jitt), was a precision of 94.6%, a recall (sensitivity) of 96.7%, an accuracy of 93%, a Receiver Operating Characteristic (ROC) curve score of 87.9%, and a Matthews Correlation Coefficient (MCC) of 83.1%. We observed that vowel /a/ from PD patients gave higher prediction performance than vowel /u/ from PD patients and than the PD control samples, whereas vowel /u/ from healthy speakers gave higher prediction performance than vowel /a/ from healthy speakers. Vowel /a/ for PD and PD control voices achieved a precision of 95.7%, a recall of 95.7%, an accuracy of 91.3%, and an ROC curve score of 100%. Vowel /u/ for healthy voices achieved a precision of 95.6%, a recall of 86%, an accuracy of 91.3%, and an ROC curve score of 100%. Based on these 5 parameters and the vowel results, healthy and PD voice samples can be differentiated efficiently.
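The abstract reports precision, recall, accuracy, and the Matthews Correlation Coefficient. As a minimal sketch of how these four metrics follow from a binary confusion matrix, the Python snippet below uses the standard textbook definitions. The function name `binary_metrics` and the example counts are hypothetical illustrations; they are not taken from the thesis and do not reproduce its results.

```python
import math


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute precision, recall, accuracy, and MCC from confusion-matrix counts.

    tp/fp/fn/tn are the true-positive, false-positive, false-negative,
    and true-negative counts of a two-class classifier.
    """
    total = tp + fp + fn + tn
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    # MCC uses all four cells, so it stays informative with unbalanced classes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts chosen only for illustration (not thesis data).
metrics = binary_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the sample groups in the study differ in size (145 PD samples against 55 in each of the other groups), a metric such as MCC, which accounts for all four confusion-matrix cells, is a useful complement to accuracy.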
Keywords ★ Parkinson’s disease (PD)
★ Multidimensional Voice Program (MDVP)
★ Machine Learning (ML)
★ Optimized Gradient Boosted
Table of contents Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of figures
List of tables

Chapter I: Introduction
1.1 Background and motivation
1.2 Literature review
1.2.1 Speech disorders in Parkinson’s Disease
1.2.2 Related works
1.3 Objectives of the thesis
1.4 Thesis outline

Chapter II: Overview of Machine Learning Approach
2.1 Machine learning: Linear model
2.2 Machine learning: Classification
2.3 Machine learning: Classification and regression trees (CART)
2.4 Machine learning: Gradient boosting classifier
2.4.1 Numerical optimization
2.4.2 Steepest-descent
2.4.3 Numerical optimization in function space
2.4.4 Finite data
2.4.5 Additive modeling
2.4.5.1 Least-squares regression
2.4.5.2 Least-absolute-deviation (LAD) regression
2.4.5.3 Regression tree
2.4.5.4 M-regression
2.5 Two-class logistic regression and classification
2.5.1 Multi-class logistic regression and classification
2.6 Evaluation of machine learning algorithms
2.6.1 Evaluation accuracy
2.6.2 Confusion matrix
2.6.3 Receiver operating characteristic (ROC) curve and Area Under Curve (AUC)
2.6.4 Recall and Precision metrics
2.6.5 Matthews Correlation Coefficient (MCC)

Chapter III: Methodology of Acoustic Parameters and Machine Learning
3.1 Acoustic parameters
3.1.1 Fundamental frequency information measurements
3.1.2 Short and long-term frequency perturbation measurements
3.1.3 Short and long-term amplitude perturbation measurements
3.1.4 Voice break related measurements
3.1.5 Sub-harmonic components related measurements
3.1.6 Voice irregularity related measurements
3.1.7 Noise related measurements
3.2 Machine learning approach

Chapter IV: Results and Discussion
4.1 Study subjects and devices
4.2 Experimental results
4.2.1 PD + PD control
4.2.2 PD + Healthy voice
4.2.3 PD + PD control + Healthy voice
4.2.4 Vowel evaluation and assessment
4.3 Validation of optimized gradient boosted
4.5 Discussion

Chapter V: Conclusion and future work
References
Advisor Chao-Min Wu (吳炤民)   Approval date 2020-8-20

For questions about this thesis, please contact the Promotion Services Section of the National Central University Library, TEL: (03) 422-7151 ext. 57407, or by e-mail.