Extracting disease-disease associations with context-aware max-margin neural network

Name

盧韋良 (Wei-Liang Lu)

Department

Graduate Institute of Software Engineering

Thesis title

Extracting disease-disease associations with context-aware max-margin neural network
(利用上下文感知最大化邊界神經網路提取疾病與疾病的關聯)

Files

Full text in the system: never open to the public

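Abstract (Chinese)

Because high-quality, manually annotated corpora of disease-disease associations are lacking, in this thesis we constructed a disease-disease association corpus and used it to build and evaluate our system. Finally, we built an end-to-end max-margin context-aware neural network. Our experimental results show that, compared with a plain convolutional neural network (CNN), the support vector machine (SVM) achieves an F1-measure of 77.82%, which is 2.47% higher than the CNN model. We then added the CNN's output as a feature to the SVM classifier to check whether classification improves. The best result was an F1-measure of 77.32%, which is 0.5% lower than the SVM using only its original features. The main reason is that the neural network cannot be updated in step while the SVM is being trained, so classification does not improve. We therefore built an end-to-end max-margin context-aware neural network to classify disease associations, which achieves the highest F1-measure of 84.34%, with a precision of 80.65% and a recall of 88.39%.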
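The reported F1-measure can be checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below does that arithmetic. The function name `f1_measure` is illustrative only, and reading the 0.5% and 2.47% gaps as absolute differences in F1 is an assumption (the 0.5% gap matches 77.82 minus 77.32, which supports that reading).

```python
def f1_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall reported in the abstract for the max-margin network.
f1 = f1_measure(0.8065, 0.8839)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 84.34%"

# Gaps between the reported F1 scores, treated as absolute differences (assumption).
svm_f1, combined_f1 = 77.82, 77.32
print(f"SVM minus combined model: {svm_f1 - combined_f1:.2f}")  # prints "... 0.50"

# Derived, not stated in the abstract: CNN F1 if the 2.47% gap is absolute.
cnn_f1_derived = svm_f1 - 2.47
print(f"Derived CNN F1: {cnn_f1_derived:.2f}%")  # prints "Derived CNN F1: 75.35%"
```

The computed value agrees with the 84.34% stated in the abstract.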

Abstract (English)

In this study, we constructed a disease-association corpus and used it to build and evaluate a disease-association extraction system. Finally, we propose a max-margin context-aware neural network. The results show that the support vector machine (SVM) achieves an F1-measure of 77.82%, which is 2.47% higher than that of the convolutional neural network (CNN). We then fed the output of the CNN's softmax layer to the SVM as an additional feature and checked whether performance improved. However, the best result was an F1-measure of 77.32%, which is 0.5% lower than the original SVM using only its own features. A possible reason is that the neural network cannot be updated jointly while the SVM is being trained. Therefore, we constructed a max-margin context-aware neural network to classify disease associations, which achieves the highest F1-measure of 84.34%.
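The difference between a two-stage pipeline and joint max-margin training can be illustrated on a toy problem. The sketch below is not the thesis's architecture. It assumes a hypothetical one-hidden-layer network, in which a tanh layer stands in for the CNN feature extractor, followed by a linear scoring layer. It also assumes binary labels in {-1, +1} and a squared hinge loss as the max-margin objective. All names (`forward`, `squared_hinge`, `train_step`) and the toy data are made up for illustration. The point is only that the hinge-loss gradient reaches the feature layer, so features and classifier are updated together.

```python
import math
import random


def forward(x, W1, w2):
    """Return hidden features h = tanh(W1 x) and the score w2 . h."""
    h = [math.tanh(sum(w * xj for w, xj in zip(row, x))) for row in W1]
    score = sum(w * hi for w, hi in zip(w2, h))
    return h, score


def squared_hinge(score, y):
    """Squared hinge loss for a label y in {-1, +1}."""
    return max(0.0, 1.0 - y * score) ** 2


def train_step(x, y, W1, w2, lr=0.1):
    """One SGD step on the squared hinge loss; updates W1 and w2 in place."""
    h, score = forward(x, W1, w2)
    margin = 1.0 - y * score
    if margin <= 0.0:
        return 0.0  # outside the margin: zero loss, zero gradient
    d_score = -2.0 * margin * y  # d(loss) / d(score)
    for i in range(len(w2)):
        d_h = d_score * w2[i]  # gradient reaching hidden unit i (uses the old w2[i])
        w2[i] -= lr * d_score * h[i]  # update the max-margin output layer
        d_pre = d_h * (1.0 - h[i] ** 2)  # back through tanh
        for j in range(len(x)):
            W1[i][j] -= lr * d_pre * x[j]  # update the feature layer in the same step
    return margin ** 2  # loss measured before the update


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical, linearly separable toy data: (features, label).
    data = [([1.0, 0.0], 1), ([0.8, 0.1], 1), ([0.0, 1.0], -1), ([0.1, 0.9], -1)]
    W1 = [[random.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(3)]
    w2 = [random.uniform(-0.5, 0.5) for _ in range(3)]
    for epoch in range(200):
        total = sum(train_step(x, y, W1, w2) for x, y in data)
    print("summed loss in the final epoch:", total)
```

In a two-stage setup, by contrast, `W1` would be frozen after the first stage and only the classifier on top would be fitted, which corresponds to the limitation the abstract gives for the combined CNN and SVM model.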

Keywords

★ natural language processing
★ text mining
★ machine learning
★ deep learning

Table of contents

Abstract (Chinese)
Abstract
Acknowledgements
Table of contents
List of figures
List of tables
Chapter 1. Introduction
1.1 Background
1.2 Motivation
1.3 Goal
Chapter 2. Related works
Chapter 3. Methods
3.1 System architecture
3.2 Data collection
3.3 Data filtering
3.4 Feature and embedding
3.4.1 Pair embedding
3.4.2 Sentence embedding
3.5 Relation extraction
Chapter 4. Experiment setup
Chapter 5. Results
5.1 Comparison of different baseline models
5.2 Comparison of combined models
5.3 Comparison of max-margin models
Chapter 6. Discussions
Chapter 7. Conclusion
References
Appendix

Advisor

洪炯宗 (Jorng-Tzong Horng)

Approval date

2018-7-23
