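Abstract: | In adults, SARS-CoV-2 infection presents mainly as respiratory disease, but in a small number of children it causes encephalopathy/encephalitis, and in some cases the course progresses rapidly to death. Acute encephalitis caused by SARS-CoV-2 in children abroad has a comparatively mild clinical course, unlike the COVID-19 encephalitis seen in Taiwan and Hong Kong, and because Hong Kong has reported few cases, reference data are limited. Building a diagnostic model for pediatric SARS-CoV-2 associated encephalopathy/encephalitis is therefore an important task. This retrospective case study develops and validates an artificial intelligence model that uses routine blood samples to predict SARS-CoV-2 associated encephalopathy/encephalitis in hospitalized children, and identifies the blood test items that matter most for recognizing this complication accurately. The study is a retrospective case analysis of 470 children who were diagnosed with SARS-CoV-2 infection and hospitalized at Linkou Chang Gung Memorial Hospital between January and November 2022, 66 of whom were diagnosed with SARS-CoV-2 associated encephalopathy/encephalitis. Routine clinical blood test data from the first three days after admission were analyzed. Following clinician advice and manual screening, test items with data available in at least 50% of all cases were retained, reducing the original 129 features to 26. Preprocessing applied data standardization, K-Means Clustering combined with Mahalanobis distance to handle data imbalance, and SMOTE to handle data imbalance. The dataset was split into 70% for training and 30% for testing, and six machine learning models (Decision Tree, Random Forest, SVM, XGBoost, LightGBM, and Logistic Regression) were used to build prediction models. Three of the models built in this study (Random Forest, XGBoost, and LightGBM) all had test accuracy above 0.95. Among them, LightGBM-based feature selection combined with a LightGBM model performed best, reaching an accuracy of 0.95 with 25 routine blood test features. Model analysis identified segmented neutrophil percentage, hyperglycemia (Sugar), and acidosis (pH) as key predictors. Related studies report hyperglycemia and acidosis as significant risk factors for encephalopathy, which is consistent with the important features found here and supports their clinical predictive value. Regarding absolute neutrophil count (Abs Neutro#), earlier studies indicate that neutrophil-mediated systemic inflammation is closely associated with encephalopathy; the model includes segmented neutrophil percentage (Segmented Neutrophil%), which reflects this physiological indicator and can provide early warning of severe disease. This study developed a high-accuracy AI diagnostic model that uses routine blood test data to predict encephalopathy/encephalitis complicating pediatric SARS-CoV-2 infection, which can improve clinical diagnostic efficiency and reduce blood draw volume and testing costs. It also addresses a gap in the existing literature on laboratory parameters and gives clinicians a more precise testing tool and decision reference. ; While SARS-CoV-2 primarily manifests as a respiratory disease in adults, a small number of pediatric cases develop severe neurological complications, including encephalopathy and encephalitis, and some progress rapidly to fatal outcomes. Notably, SARS-CoV-2 associated encephalopathy in children abroad tends to have a milder course than cases observed in Taiwan and Hong Kong. However, the small number of reported cases in Hong Kong limits the available reference data. This highlights the need to establish a reliable diagnostic model for pediatric SARS-CoV-2 associated encephalopathy. Here, we aimed to develop and validate an AI-powered diagnostic model that uses routine blood test data to predict pediatric SARS-CoV-2 associated encephalopathy. 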
A total of 470 hospitalized children diagnosed with SARS-CoV-2 at Linkou Chang Gung Memorial Hospital between January and November 2022 were included, of whom 66 were diagnosed with SARS-CoV-2 associated encephalopathy. Clinical blood test data from the first three days of hospitalization were analyzed. Features present in at least 50% of cases were retained, reducing the original 129 features to 26. Data preprocessing included standardization, K-Means Clustering with Mahalanobis distance for handling data imbalance, and SMOTE for oversampling. The dataset was split into 70% training and 30% testing, and six machine learning models (Decision Tree, Random Forest, SVM, XGBoost, LightGBM, and Logistic Regression) were used for model construction. Three models (Random Forest, XGBoost, and LightGBM) achieved high predictive accuracy, all exceeding 0.95 on the test set. Among them, LightGBM combined with LightGBM-based feature selection performed best, achieving 0.95 accuracy using only 25 routine blood test features. Key predictive biomarkers identified include segmented neutrophil percentage, hyperglycemia (high blood sugar), and acidosis (low pH), factors that existing research associates strongly with neurological complications. Additionally, absolute neutrophil count (Abs Neutro#) has previously been linked to systemic inflammatory responses that contribute to encephalopathy. Our model's inclusion of segmented neutrophil percentage reflects this indicator and supports early warning of severe cases. We developed a high-accuracy AI-powered diagnostic model that uses routine blood test data to predict pediatric SARS-CoV-2 associated encephalopathy. By improving diagnostic efficiency, reducing blood sample requirements and testing costs, and bridging gaps in laboratory-based risk assessment, our model provides a useful tool for precision medicine and clinical decision-making. |
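
The preprocessing steps named in the abstract (keeping features available in at least 50% of cases, a 70%/30% split, and SMOTE oversampling) can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the study's code. The function names are hypothetical. Stratifying the split by label, oversampling only the training portion, the neighbour count `k`, and the use of complete numeric feature vectors are all assumptions that the abstract does not specify.

```python
import math
import random
from collections import defaultdict


def keep_complete_features(rows, min_fraction=0.5):
    """Return names of features with a non-missing value in >= min_fraction of rows.

    Each row is a dict mapping feature name -> value, with None meaning missing.
    """
    if not rows:
        return []
    counts = defaultdict(int)
    for row in rows:
        for name, value in row.items():
            if value is not None:
                counts[name] += 1
    threshold = min_fraction * len(rows)
    return sorted(name for name, n in counts.items() if n >= threshold)


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split row indices into (train, test), keeping class proportions.

    Stratification is an assumption; the abstract only states a 70/30 split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    For each new sample: pick a minority point, pick one of its k nearest
    minority neighbours (Euclidean distance), and interpolate at a random
    position on the segment between them. Needs at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

With the cohort sizes reported in the abstract (66 encephalopathy cases among 470 children), `stratified_split([1] * 66 + [0] * 404)` yields 329 training and 141 test indices, with 20 of the 66 positive cases in the test set.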