Applying Symptoms to Sentiment Analysis of Patient-Authored Journals

DC field	Value	Language
DC.contributor	Department of Information Management	zh_TW
DC.creator	鄭新禹	zh_TW
DC.creator	Xin-Yu Zheng	en_US
dc.date.accessioned	2019-07-23T07:39:07Z
dc.date.available	2019-07-23T07:39:07Z
dc.date.issued	2019
dc.identifier.uri	http://ir.lib.ncu.edu.tw:444/thesis/view_etd.asp?URN=106423057
dc.contributor.department	Department of Information Management	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
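dc.description.abstract	Social media has grown rapidly in recent years. People routinely share their feelings on these platforms, and when they run into a problem, they often turn first to online communities to ask others with similar experiences for answers. Against this background, this study uses sentiment analysis to uncover hidden value in social media data and to derive related applications. Sentiment analysis has mostly been applied to product and movie reviews, and few studies address the medical domain. This study therefore takes patient journals from a medical forum as its corpus, with the aim of noticing whether an author is currently suffering from an illness so that the people around them can find ways to help, which should have a positive psychological effect on users during treatment. The dataset comes from DailyStrength, a UK medical forum. Its patient-authored journals contain many medical terms, such as drug names, symptoms, and diseases, and combining these terms with adverbs or adjectives of different degree shifts the sentiment toward Excellent, Good, Bad, or Horrible. Because a symptom is usually a term that directly expresses a physical sensation, this study investigates whether combining symptoms with sentiment analysis improves sentiment recognition in patient-authored journals. The goal is not only to separate positive from negative sentiment but to distinguish the degree of difference between Bad and Horrible, so that high-risk users with extremely negative sentiment can be identified and helped in time. The study consists of four experiments. (1) Baselines are established on patient journals for the traditional text representations, bag-of-words and word embedding; the best accuracy is only 57%, lower than in conventional domains, which shows that commonly used text representations have limited effect on patient-authored journals. (2) Three representations of symptom mentions are tested, and symptoms are found to raise prediction accuracy by 3% to 4%. (3) Semi-supervised learning and a hierarchical architecture are used to better separate Bad from Horrible; adding training samples through the semi-supervised method and applying them in the hierarchical architecture reaches 65% accuracy, but the improvement over conventional classification is not significant. (4) Human evaluation is used to examine the difference between patients' subjective feelings and a third party's objective assessment in long and short texts. For short texts, human evaluation agrees closely with the machine learning results, which indicates a large gap between objective analysis and patients' subjective feelings. For long texts, human and machine evaluations agree less, presumably because human evaluators more readily understand context and shifts in tone and can therefore judge sentiment more easily than machine learning.	zh_TW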
dc.description.abstract	Social media has developed rapidly in recent years. People are used to sharing their own journals with online communities, and when they have a problem, they often turn first to social media to seek answers. Our study therefore uses sentiment analysis to find hidden value in this data and to derive related applications. Past studies have applied sentiment analysis mainly to movie reviews, product reviews, and the like; less research targets the medical field. This study therefore uses patient-authored text as the dataset for sentiment analysis, in order to find out whether a user is currently suffering from a disease and to find ways to help them. The dataset comes from DailyStrength, a UK medical forum. We found that patient-authored text contains many medical terms, such as drug names, symptoms, and diseases, and that these words, combined with adverbs of degree or adjectives, make the emotion excellent, good, bad, or horrible. Symptoms are often used to express physical condition. The purpose of this study is therefore to apply symptoms to the sentiment analysis of patient-authored text. The aim is not only to separate positive from negative emotions but also to distinguish bad from horrible, in order to identify high-risk groups and give timely help. The research method is divided into four parts. First, we establish baselines for the bag-of-words and word embedding representations on patient-authored text. The best accuracy is only 57%, which shows that the most common text representations have limited effect on patient-authored text. The second part compares three symptom-mention representations with the baseline and finds that they improve prediction accuracy by 3% to 4%, confirming that symptoms can improve prediction accuracy.
The third part uses semi-supervised learning and a hierarchical structure to help distinguish between bad and horrible emotions. The semi-supervised method is used to increase the training samples, which achieves 65% accuracy in the hierarchical structure, but the improvement over traditional classification is not significant. Finally, we use manual evaluation to explore the reasons, dividing the texts into long and short texts. In short texts, there is a great gap between objective analysis and patients' subjective feelings. In long texts, human assessment and machine assessment are less consistent.	en_US
DC.subject	社群媒體	zh_TW
DC.subject	自然語言處理	zh_TW
DC.subject	情緒分析	zh_TW
DC.subject	病患自撰日誌	zh_TW
DC.subject	Social media	en_US
DC.subject	Natural Language Processing	en_US
DC.subject	Sentiment analysis	en_US
DC.subject	Patient-authored text	en_US
DC.title	Applying Symptoms to Sentiment Analysis of Patient-Authored Journals	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US
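
The hierarchical structure mentioned in the abstract can be illustrated with a minimal sketch. It assumes, and this record does not confirm, that the first level separates positive from negative entries and the second level separates the two degrees within each polarity (Excellent vs. Good, Bad vs. Horrible). The TinyNB naive Bayes classifier, the HierarchicalSentiment class, and the four training sentences are hypothetical stand-ins, not the thesis's models or data.

```python
import math
from collections import Counter, defaultdict


class TinyNB:
    """Toy multinomial naive Bayes over whitespace tokens (hypothetical stand-in)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter(labels)        # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # Laplace smoothing
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Assumed mapping from the four fine-grained labels to two polarities.
POLARITY = {"Excellent": "positive", "Good": "positive",
            "Bad": "negative", "Horrible": "negative"}


class HierarchicalSentiment:
    """Level 1 predicts polarity; level 2 predicts the degree within that polarity."""

    def fit(self, texts, labels):
        self.top = TinyNB().fit(texts, [POLARITY[l] for l in labels])
        self.fine = {}
        for polarity in ("positive", "negative"):
            subset = [(t, l) for t, l in zip(texts, labels) if POLARITY[l] == polarity]
            self.fine[polarity] = TinyNB().fit([t for t, _ in subset],
                                               [l for _, l in subset])
        return self

    def predict(self, text):
        polarity = self.top.predict(text)
        return self.fine[polarity].predict(text)


# Invented example entries, for illustration only.
train = [
    ("feeling great today no pain at all", "Excellent"),
    ("slept well and a bit better", "Good"),
    ("headache and nausea again", "Bad"),
    ("severe pain cannot breathe terrified", "Horrible"),
]
model = HierarchicalSentiment().fit([t for t, _ in train], [l for _, l in train])
print(model.predict("severe pain terrified"))
```

Splitting the decision this way lets the second-level classifier for negative entries train only on Bad and Horrible examples, which is where the abstract says the distinction matters most.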

Full metadata record for thesis 106423057