dc.description.abstract | Social media has developed rapidly in recent years. People have become used to sharing personal journals in online communities, and when they have a problem, social media is often the first place they look for answers. This study therefore applies sentiment analysis to such text to uncover hidden value and to support related applications. Previous studies have applied sentiment analysis mainly to movie reviews, product reviews, and similar texts; less research has addressed sentiment analysis in the medical field. This study therefore uses patient-authored text as the dataset for sentiment analysis, in order to find out whether a user is currently suffering from a disease and to find ways to help. The dataset comes from DailyStrength, a UK medical forum. We found that patient-authored text contains many medical terms, such as drug names, symptoms, and diseases. Combined with different adverbs of degree or adjectives, these terms shift the expressed emotion to excellent, good, bad, or horrible. Symptoms in particular are often used to express physical condition. The purpose of this study is therefore to use symptoms in the sentiment analysis of patient-authored text. The goal is not only to separate positive from negative emotions but also to distinguish bad from horrible, in order to identify high-risk groups and give timely help. The research method is divided into four parts. First, we evaluate bag-of-words and word-embedding representations as baselines on patient-authored text. The best accuracy is only 57%, showing that the most common text representations have limited effect on patient-authored text. The second part compares the three symptom representations with the baseline and finds that they improve prediction accuracy by 3% to 4%, confirming that using symptoms can improve prediction accuracy.
The third part uses a semi-supervised method and a hierarchical structure to help distinguish between bad and horrible emotions. The semi-supervised method is used to enlarge the training set, and the hierarchical structure then reaches 65% accuracy, but the improvement over conventional classification is not significant. Finally, we use manual evaluation to explore the reasons, dividing the texts into long and short ones. In short texts, we found a large gap between objective analysis and the patient's subjective feelings. In long texts, human assessment and machine assessment are more inconsistent with each other. | en_US |