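| Abstract: | Precision medicine aims to personalize disease prediction and treatment strategies by accounting for differences between individuals, and the gut microbiota has become a major research focus because of its close association with many diseases. However, conventional microbiome analyses mostly rely on population-level microbial co-occurrence networks and may overlook individual-specific microbial interactions and their associations with disease. To overcome this limitation, this study systematically constructs and applies a linear-interpolation method for estimating individual-sample networks, combines it with multi-level taxonomic information from phylum to species, and uses graph neural networks and machine learning models to improve the performance and interpretability of microbiome data for disease prediction in precision medicine. The approach was validated on public colorectal cancer microbiome data and a vascular aging dataset. In the colorectal cancer dataset, integrating single-sample microbial interaction networks with random forest improved disease classification; prediction was most accurate at the genus and species levels, with an area under the receiver operating characteristic curve of up to 0.78, exceeding baseline models that used abundance data alone. Although graph neural networks are in theory well suited to network-structured data, their practical benefit was limited, and their performance was highly sensitive to hyperparameter choice and network sparsity. The best classification results generally occurred at intermediate filtering thresholds and with shallower network architectures. When the same methods were applied to the vascular aging dataset, however, classification did not improve appreciably. Differential abundance analysis showed that the clinical groups in this dataset lacked significant microbial differences, indicating that single-sample networks in this setting may capture noise rather than biologically meaningful signal, which limits predictive performance. In addition, model interpretability analysis using Shapley additive explanations (SHAP) values uncovered biologically meaningful microbial interactions in the colorectal cancer dataset, particularly those involving Parvimonas micra and short-chain fatty acid-producing bacteria. At coarser taxonomic levels (such as phylum or class), however, biological interpretability dropped markedly, underscoring the importance of taxonomic resolution for biological inference. This study demonstrates that personalized microbial interaction networks built from gut microbiome data can significantly improve disease prediction in disease settings with clear microbial differences (such as colorectal cancer), but that their effectiveness depends heavily on the inherent separability of the data. Future research should further refine these methods, strengthen feature integration strategies, and validate them on broader and more diverse clinical data to fully realize the potential of single-sample microbial networks in precision medicine applications.;Precision medicine aims to personalize disease prediction and treatment by accounting for individual variability, and the gut microbiome has emerged as a crucial factor because of its associations with various diseases. Traditional microbiome analyses, however, typically rely on population-level microbial co-occurrence networks, potentially obscuring individual-specific microbial interactions and their disease associations. To overcome this limitation, we systematically implemented and evaluated a single-sample network inference approach, linear interpolation to obtain network estimates for single samples (LIONESS), and integrated it with multi-level taxonomic classification (from phylum to species) using graph neural networks (GNN) and machine learning models, with the goal of improving disease prediction performance and interpretability in microbiome-based precision medicine. For evaluation, we used a colorectal cancer (CRC) dataset from curated metagenomic data and vascular aging data from the longitudinal aging study (LAST) cohort.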
In the CRC dataset, integrating single-sample LIONESS microbial interaction networks with traditional machine learning classifiers, particularly random forest, improved disease classification performance, achieving an area under the receiver operating characteristic curve (AUC) of up to 0.78 at the genus and species levels and exceeding baseline abundance-only models. Although theoretically advantageous for network data, graph neural networks showed limited improvement because of their sensitivity to hyperparameters and network sparsity. The best classification results were consistently obtained at intermediate edge-filtering thresholds (retaining the top 80% of edge weights) with shallow architectures. However, our methods provided no significant classification improvement when applied to the LAST dataset. Differential abundance analyses showed no significant microbial differences between clinical groups in this dataset, suggesting that LIONESS-derived single-sample networks in this scenario primarily captured noise rather than biologically meaningful signals, which limited their predictive power. Interpretability analyses using Shapley additive explanations (SHAP) values revealed biologically relevant microbial interactions in CRC, highlighting taxa such as Parvimonas micra and interactions involving short-chain fatty acid-producing bacteria. Nevertheless, interpretability declined at higher taxonomic levels (such as phylum or class), emphasizing the importance of taxonomic resolution for meaningful biological inference. Our results show that personalized microbial interaction networks derived from gut microbiome data can enhance disease prediction when the disease is associated with meaningful microbial differences, as evidenced by the CRC data. However, their effectiveness depends heavily on the inherent separability of the data.
Future research should focus on methodological refinements, enhanced integration strategies, and validation across diverse clinical datasets to fully realize the potential of single-sample microbial networks in precision medicine. |
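
The single-sample inference step named in the abstract can be illustrated with a minimal sketch. It uses the published LIONESS equation, in which the network for sample q is N × (network from all N samples − network without q) + network without q. The rest is assumed for illustration only: the abstract does not state which aggregate network method was used, so Pearson correlation stands in here, and the function names `corr_network`, `lioness`, and `keep_top_edges` are hypothetical. Ranking edges by absolute weight in `keep_top_edges` is likewise an assumption about what "top 80% of edge weights" means.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def corr_network(samples):
    """Aggregate network (assumed: Pearson). Maps edge (i, j) to the
    correlation of taxa i and j across the given samples."""
    n_taxa = len(samples[0])
    cols = [[s[t] for s in samples] for t in range(n_taxa)]
    return {(i, j): pearson(cols[i], cols[j])
            for i in range(n_taxa) for j in range(i + 1, n_taxa)}


def lioness(samples):
    """Return one edge-weight dict per sample using the LIONESS equation:
    e_q = N * (e_all - e_without_q) + e_without_q."""
    n = len(samples)
    net_all = corr_network(samples)
    nets = []
    for q in range(n):
        net_without_q = corr_network(samples[:q] + samples[q + 1:])
        nets.append({e: n * (net_all[e] - net_without_q[e]) + net_without_q[e]
                     for e in net_all})
    return nets


def keep_top_edges(net, fraction=0.8):
    """Keep the given fraction of edges, ranked by absolute weight
    (the ranking criterion is an assumption, not taken from the abstract)."""
    k = max(1, int(len(net) * fraction))
    ranked = sorted(net.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return dict(ranked[:k])


# Toy relative abundances: 4 samples (rows) x 3 taxa (columns), made up for illustration.
samples = [[0.1, 0.5, 0.4],
           [0.3, 0.3, 0.4],
           [0.2, 0.6, 0.2],
           [0.4, 0.1, 0.5]]
single_sample_nets = lioness(samples)
filtered = [keep_top_edges(net, 0.8) for net in single_sample_nets]
```

Each resulting dictionary is a per-sample edge list whose weights could be flattened into a feature vector for a classifier such as random forest, or used as a weighted graph input for a GNN. The leave-one-out loop recomputes the aggregate network once per sample, so the cost grows linearly with cohort size.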