Master's and Doctoral Theses: Detailed Record for 111423062




Name: 黃永璿 (Yun-shuan Huang)   Department: Department of Information Management
Thesis title: non
(Bipolar Disorder Prediction with Transfer Learning on Acoustic and Linguistic Embeddings)
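Related theses
★ A Transaction Prediction Model for Residential Property Sales in the Real Estate Brokerage Industry: The Case of Real Estate Brokerage Company S
★ Building a Machine Learning Model to Predict Customer Complaint Categories Using Text Mining
★ Building a Customer Repurchase Rate Prediction Model with Machine Learning: The Case of an E-commerce Website for Handmade Soap Ingredients
★ Building a Stock Price Prediction Model with Machine Learning: The Case of the Taiwan Stock Market
★ Building a Financial Distress Prediction Model with Machine Learning: The Case of Listed and OTC Companies in Taiwan
★ A Data Mining Prediction Model for Ex-Dividend Price Recovery: The Case of Companies Listed on the Taiwan Stock Market
★ A Study on Optimizing Next-Generation Firewall Rules Using Data Mining
★ Applying Data Mining to Identify Relevant New Information in Electronic Medical Record Text
★ Applying Deep Learning to Post-Market Drug Surveillance: A Twitter Text Classification Task
★ Building an Atrial Fibrillation Prediction Model for Stroke Patients Using Electronic Medical Records and Data Mining
★ A Missing-Value Imputation Technique Considering Feature Selection and Random Forest
★ Abbreviation Disambiguation in Electronic Medical Records and the One-to-Many Classification Task
★ Improving Personalized Outfit Recommendation with Meta-paths and Attention Mechanisms
★ Building an Underwriting Risk Prediction Model with Machine Learning: The Case of Company A
★ Fan Lifetime Prediction Using Big Data Analytics: The Case of Company X
★ Building a Prediction Model for Post-Stroke Pneumonia Using Text Mining and Deep Learning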
Files: full text available for viewing in the system after 2029-7-1
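Abstract (translated from Chinese): Bipolar disorder is a mental illness with a profound impact on a person's life. Accurate, early diagnosis is essential, yet misdiagnosis as depression can lead to inappropriate treatment. Tools that help clinicians diagnose precisely can reduce the chance of misdiagnosis, and machine learning offers such a solution. In recent years, machine learning research in the audio domain has grown, and many studies have explored using audio data to predict mental disorders. However, collecting enough audio data to build a classification model is very costly, which is a major limitation for studies that use audio as training data. Transfer learning offers a feasible way to deal with small datasets. This study comprehensively examines the effectiveness of audio and text features for diagnosing bipolar disorder, comparing traditional feature engineering techniques with learned features extracted by pre-trained models. The results show that learned features significantly outperform traditional feature engineering techniques. The study also examines multimodal approaches that combine audio and text data. Although these multimodal approaches did not surpass audio features alone, they improved markedly on text features alone. These findings provide a valuable methodological reference for future research in this area.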
Abstract (English): Bipolar disorder is a mental disorder that can seriously affect individuals' lives. Early and accurate diagnosis is crucial; however, misdiagnosing bipolar disorder as depression can lead to incorrect treatment. It is therefore important to develop tools that support clinicians in making accurate diagnoses, and machine learning approaches can provide such tools. Recently, audio has become an important research domain, with a growing number of studies exploring the use of audio data to predict mental disorders. However, collecting enough audio data to build a classifier is costly and often impractical, which limits the use of audio data. Transfer learning offers a viable solution to the problem of limited datasets. In this paper, we conduct a comprehensive study of the effectiveness of audio and textual features, comparing conventional hand-crafted features with learned features. Our results show that learned features outperform conventional hand-crafted features. We also explore multimodal approaches that combine audio and textual data and find that, while they do not surpass audio features alone, they do improve on textual features alone. These findings highlight the potential of learned features and multimodal approaches in supporting the accurate diagnosis of bipolar disorder and suggest a promising direction for future research.
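Keywords (translated from Chinese): ★ Bipolar Disorder
★ Audio Features
★ Text Features
★ Deep Learning Features
★ Multimodal Learning
Keywords (English): ★ Bipolar Disorder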
★ Acoustic Feature
★ Linguistic Feature
★ Learned Feature
★ Multi-modal Learning
Table of contents: Abstract (in Chinese)
Abstract
Acknowledgements
List of Figures
List of Tables
1. Introduction
2. Literature Review
2.1 The Investigative Features for Predicting Bipolar Disorder
2.1.1 Acoustic (Audio) Features
2.1.2 Linguistic (Text) Feature
2.2 Transfer Learning
3. Research Method
3.1 Dataset
3.2 Feature Extraction
3.2.1 Exploring Low-Level Features
3.2.2 Exploring High Level Features
3.3 Classification Methods
• XGBoost
• Random Forest
• Logistic Regression
• Support Vector Machine
• K-Nearest Neighbour
3.4 Experiment Design
3.5 Evaluation Metrics
4. Experiment
4.1 Experiment 1: Evaluating the Effectiveness of Acoustic Features in Bipolar Disorder
4.3 Experiment 2
4.4 Experiment 3
4.5 Discussion
5. Conclusion
Reference
Advisor: 胡雅涵 (Ya Han Hu)   Review date: 2024-7-22

For questions about this thesis, contact the Outreach Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.