
    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/92642

    Title: Privacy-Preserving Machine Learning for Predicting Second Primary Cancer in the Context of Data Heterogeneity
    Authors: 洪睿甫;Hong, Jui-Fu
    Contributors: Department of Information Management
    Keywords: privacy preserving;machine learning;federated learning;transfer learning;second primary cancer;lung cancer
    Date: 2023-07-26
    Issue Date: 2023-10-04 16:07:24 (UTC+8)
    Publisher: National Central University
    Abstract: Data privacy has been a critical issue in recent years. As concern over it has grown, many regulations have been established to restrict how data is transmitted and stored. Gathering data from different institutions for model training has therefore become challenging. To address this, privacy-preserving methods such as federated learning and transfer learning have been proposed. In this research, we explore the performance of these methods on second primary cancer prediction in lung cancer survivors, using cancer registry data from 8 hospitals. We compared the predictive performance of localized learning, centralized learning, federated learning, and transfer learning. The results show that federated learning outperformed localized learning at most institutions and achieved results similar to centralized learning. In addition, we proposed two methods to mitigate the negative impact of data heterogeneity in federated learning. The first method excluded institutions with overly divergent data distributions, while the second used personalized models with customized learning rates and a personalized layer. Both methods improved on the federated learning baseline at most institutions. However, for institutions that were excluded or exhibited severe data shift, models trained with transfer learning performed better, so transfer learning can serve as an alternative. To sum up, our study suggests that privacy-preserving machine learning methods can, under strict data regulations, achieve results similar to models trained on centralized data, and that effective training strategies can address data heterogeneity across institutions.
    Appears in Collections: [Graduate Institute of Information Management] Master's and Doctoral Theses
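
The federated learning setup described in the abstract can be illustrated with a minimal federated averaging (FedAvg) sketch. This is not the thesis's implementation: the abstract does not state which aggregation algorithm or model was used, so the logistic regression model, the synthetic data, and every function name below are assumptions chosen for illustration. Each simulated site trains on its own data, and only model weights, never records, are combined.

```python
import numpy as np


def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent epochs of logistic regression on one site's data."""
    w = weights.copy()
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid of the linear scores
        grad = X.T @ (preds - y) / len(y)      # gradient of the mean log loss
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Average client models, weighting each by its number of local samples."""
    sizes = np.asarray(client_sizes, dtype=float)
    stacked = np.stack(client_weights)
    return (stacked * sizes[:, None]).sum(axis=0) / sizes.sum()


def run_fedavg(clients, n_features, rounds=20, lr=0.1, epochs=5):
    """Alternate local training and server-side averaging for a fixed number of rounds."""
    global_w = np.zeros(n_features)
    for _ in range(rounds):
        # Each site starts from the current global model and trains locally.
        updates = [local_update(global_w, X, y, lr, epochs) for X, y in clients]
        # The server sees only the updated weights and the sample counts.
        global_w = federated_average(updates, [len(y) for _, y in clients])
    return global_w


def make_synthetic_clients(n_clients=8, n_features=5, seed=0):
    """Build hypothetical per-site datasets (synthetic, not cancer registry data)."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=n_features)
    clients = []
    for _ in range(n_clients):
        n = int(rng.integers(50, 200))         # sites differ in sample size
        X = rng.normal(size=(n, n_features))
        y = (X @ true_w + rng.normal(scale=0.5, size=n) > 0).astype(float)
        clients.append((X, y))
    return clients, true_w


if __name__ == "__main__":
    clients, _ = make_synthetic_clients()
    w = run_fedavg(clients, n_features=5)
    X_all = np.vstack([X for X, _ in clients])
    y_all = np.concatenate([y for _, y in clients])
    accuracy = ((X_all @ w > 0).astype(float) == y_all).mean()
    print(f"pooled training accuracy: {accuracy:.3f}")
```

In this sketch the sites share one data distribution. The heterogeneity the abstract discusses would correspond to sites drawing from different distributions, which is where excluding divergent sites or personalizing per-site learning rates and layers would come in.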

    All items in NCUIR are protected by copyright, with all rights reserved.
