dc.description.abstract | Cerebrovascular disease, also known as stroke, is the second leading cause of death worldwide and the third leading cause of disability. Atrial fibrillation is a potential cause of ischemic stroke and is strongly associated with it. However, atrial fibrillation is difficult to detect, so patients may not receive appropriate treatment. When atrial fibrillation is detected in a patient with acute ischemic stroke, the secondary prevention strategy is modified accordingly. The main purpose of this study is to use electronic medical records and machine learning algorithms to build an early prediction model of atrial fibrillation for patients who have had an ischemic stroke. The second purpose is to compare the performance of prediction models built on structured data with that of models built on unstructured data. We hope that the proposed model can assist doctors' medical decision making and help medical resources be used properly.
In the atrial fibrillation prediction experiments, we found in experiment 1 that the logistic regression classifier performed best across data with different feature sets, especially structured features combined with text features. In experiment 2, we built and cross-validated models on data from two hospitals. The results indicated that prediction models of atrial fibrillation built on unstructured data from different hospitals did not perform as well as expected. Overall, this study shows that, compared with using structured features alone, combining structured and text features can enhance model performance. | en_US |