Thesis/Dissertation 108522095 Detailed Record




Author 方詩匀 (Shih-Yun Fang)   Department Department of Computer Science and Information Engineering
Thesis Title 基於檢驗數值的糖尿病腎病變預測模型
(Prediction Models for Diabetic Nephropathy Based on Laboratory Tests)
Files Full text not available for online viewing (never open access)
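Abstract (Chinese) Diabetes is one of the most common chronic diseases in Taiwan and often occurs together with other diseases.
Among these, diabetic nephropathy is one of the most common complications and is also a disease with high morbidity and mortality.
Because kidney-related diseases are hard to notice in their early stages, by the time patients become aware of declining kidney function they usually already depend on hemodialysis to survive.
If patients could be told of their likelihood of developing the disease before onset, they might pay closer attention to their own health.
Supplying effective temporal information to the prediction is an important factor when studying longitudinal data,
so this study examines, on existing laboratory data, how different ways of segmenting time-series data affect the results.

This study trains machine learning models of different architectures on biochemical test data:
the tree-based learning model XGBoost, a multilayer perceptron built with TensorFlow, and a Jacobian matrix learning model that first groups the data points with a clustering algorithm
and then approximates the data points with a Taylor expansion.
The study also compares several feature selection methods and analyzes how the features affect the results.
In the end, the multilayer perceptron with self-selected features performed best under cross-validation, reaching an accuracy of 85.7% and a sensitivity of 85.4%.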
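
The Jacobian matrix learning model mentioned above can be read as a piecewise first-order Taylor approximation around cluster centers. The following is only a generic sketch of that idea: the symbols (cluster centers c_k, local Jacobians J_k) and the nearest-center assignment rule are assumptions made here for illustration, since this record does not give the thesis's exact formulation.

```latex
\hat{f}(\mathbf{x}) \approx f(\mathbf{c}_k) + \mathbf{J}_k \, (\mathbf{x} - \mathbf{c}_k),
\qquad
k = \arg\min_{j} \lVert \mathbf{x} - \mathbf{c}_j \rVert,
\qquad
\mathbf{J}_k = \left. \frac{\partial f}{\partial \mathbf{x}} \right|_{\mathbf{x} = \mathbf{c}_k}
```

Under this reading, each cluster contributes a local linear model, which is one reason such a model may be easier to inspect than a deep network.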
Abstract (English) Diabetes is one of the most common chronic diseases in Taiwan and is often accompanied by complications.
Diabetic nephropathy is among the most frequent of these complications,
and it carries high morbidity and mortality.
Because the symptoms of kidney-related diseases are not readily observable at an early stage,
most patients are unaware of the disease until it has progressed.
By the time kidney damage is noticed,
it is usually too late, and patients need hemodialysis to survive.
If patients can be informed of their risk of the disease beforehand,
they may pay closer attention to their health.
Providing effective temporal information to the prediction is therefore an important factor in the study of longitudinal data.
This study explores how different time-series data processing methods, applied to existing laboratory data, affect the results.
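
As a rough illustration of what segmenting longitudinal laboratory data can look like, the sketch below averages each test inside fixed look-back windows before a prediction date. The function name `window_features`, the window length, and the sample records are all hypothetical; the thesis's actual segmentation schemes are not described in this record.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def window_features(records, index_date, window_days, n_windows):
    """Average each lab test inside consecutive look-back windows.

    records: iterable of (test_date, test_name, value) for ONE patient.
    Returns {(test_name, window_idx): mean value}; window 0 is the most
    recent window before index_date. Windows with no results are absent.
    """
    buckets = defaultdict(list)
    for test_date, name, value in records:
        age = (index_date - test_date).days
        if age < 0:
            continue  # ignore results dated after the prediction date
        idx = age // window_days
        if idx < n_windows:
            buckets[(name, idx)].append(value)
    return {key: mean(vals) for key, vals in buckets.items()}


# Hypothetical records for a single patient (values are made up).
records = [
    (date(2020, 1, 10), "HbA1c", 7.0),
    (date(2020, 2, 20), "HbA1c", 8.0),
    (date(2020, 5, 1), "HbA1c", 9.0),
    (date(2020, 5, 15), "creatinine", 1.2),
]
features = window_features(records, date(2020, 6, 1), window_days=90, n_windows=2)
```

Changing `window_days` or `n_windows` changes how much temporal detail reaches the model, which is the kind of choice the study compares.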

In this study, machine learning models with different architectures are trained on biochemical data:
the tree-based learning model XGBoost,
a multilayer perceptron built with TensorFlow,
and the Jacobian matrix learning model (JMLM).
In general, JMLM is more interpretable than the other models because it first uses a clustering algorithm to group the data points and then uses a Taylor series expansion to approximate them.
In addition, this study compares multiple feature selection methods and analyzes the impact of the features on the results.
Ultimately, the multilayer perceptron with self-selected features performs best under cross-validation,
reaching an accuracy of 0.857 and a sensitivity of 0.854.
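
For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and sensitivity follow from confusion counts, together with a simple k-fold index split. It is a minimal pure-Python illustration: the helper names and the toy labels are assumptions, and the thesis's own evaluation code and fold assignment are not given in this record.

```python
def accuracy_and_sensitivity(y_true, y_pred):
    """Return (accuracy, sensitivity) for binary labels where 1 = diseased."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Sensitivity (recall) is undefined when there are no positive cases.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, sensitivity


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds, without shuffling."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Toy labels and predictions (made up for illustration).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
acc, sens = accuracy_and_sensitivity(y_true, y_pred)
folds = list(k_fold_indices(10, 5))
```

In practice the folds would usually be shuffled and stratified by class so that every test fold contains positive cases; the contiguous split here only keeps the example short.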
Keywords (Chinese) ★ Diabetic nephropathy
★ Chronic kidney disease
★ Deep learning
★ Disease prediction model
Keywords (English)
Table of Contents 1. Introduction
1.1 Research Motivation
1.2 Research Objectives
1.3 Thesis Organization
2. Background and Literature Review
2.1 Background
2.1.1 Diabetic Nephropathy
2.1.2 Feature Selection
2.1.3 Dataset
2.1.4 Machine Learning Models
2.2 Literature Review
2.2.1 Studies on Modeling Time-Series Electronic Health Records
2.2.2 Studies Applying Machine Learning to Disease-Related Problems
3. Methods
3.1 Data Preprocessing
3.1.1 Data Filtering and Cleaning
3.1.2 Normalization
3.1.3 Feature Selection
3.2 Temporal Data Integration
3.3 Nephropathy Prediction Module
3.3.1 Loss Function
3.3.2 Algorithm Flow
4. Experimental Design and Results
4.1 Evaluation Methods
4.2 Experimental Results of Feature Selection Methods
4.2.1 PCA
4.2.2 Pearson Correlation Coefficient
4.2.3 Fisher's Ratio
4.2.4 ANOVA
4.2.5 Self-Selected Features
4.2.6 5-Fold Cross-Validation Results with Common Features
4.3 Experiments and Results for Different Time Sequences
5. Conclusion
5.1 Conclusions
5.2 Future Work
References
Advisors 蘇木春, 許藝瓊   Approval Date 2022-8-20