NCU Institutional Repository (中大機構典藏): Item 987654321/98403


    Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98403


    Title: 透過大型語言模型之檢索增強生成方法探討Android 漏洞行檢測與漏洞可解釋性; Investigating Android Vulnerability Line Detection and Explainability via Retrieval-Augmented Generation with Large Language Models
    Author: 楊晴閔 (Yang, Qing-Min)
    Contributor: 資訊管理學系 (Department of Information Management)
    Keywords: Retrieval-Augmented Generation (RAG); Large Language Models; Data Flow Graph; Android Vulnerability Detection; Few-shot Prompting; Explainability
    Date: 2025-08-01
    Upload time: 2025-10-17 12:44:50 (UTC+8)
    Publisher: 國立中央大學 (National Central University)
    Abstract: With the rapid growth of Android applications, enhancing source-level vulnerability
    detection has become increasingly critical in the field of cybersecurity. Although Large
    Language Models (LLMs) have shown promise in code analysis and vulnerability localization,
    they are still prone to generating hallucinations—irrelevant or incorrect outputs—especially in
    semantically ambiguous scenarios, thus reducing both accuracy and interpretability. To address
    this issue, this study proposes a line-level vulnerability detection and explanation framework
    based on Retrieval-Augmented Generation (RAG).
    Our approach constructs Data Flow Graphs (DFG) to represent code semantics and builds
    an external knowledge base consisting of semantically related code snippets, which serve as
    context for improving LLM predictions. In addition, we design few-shot prompts tailored to
    the target CWE-ID type, guiding the LLM to focus on relevant patterns for more accurate
    localization and explanation. To evaluate explanation quality, we use BERTScore to measure
    the semantic similarity between the generated explanation and the ground truth.
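
    A minimal sketch of the retrieval and few-shot prompting steps is given below. It is an illustration under stated assumptions, not the thesis's actual pipeline: a generic sentence-transformers embedder stands in for the DFG-based semantic representation, and the names KBEntry, retrieve, and build_prompt are hypothetical.

"""
Illustrative sketch: retrieve semantically similar code snippets from an
external knowledge base and assemble a few-shot prompt for line-level
vulnerability detection. All names here (KBEntry, retrieve, build_prompt)
and the choice of embedding model are assumptions made for illustration.
"""
from dataclasses import dataclass

import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding backend

@dataclass
class KBEntry:
    code: str             # labelled snippet stored in the knowledge base
    cwe_id: str           # e.g. "CWE-89"
    vulnerable_line: int  # 1-based line number of the flaw
    explanation: str      # reference explanation of the flaw

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def retrieve(target_code: str, kb: list[KBEntry], cwe_id: str, k: int = 3) -> list[KBEntry]:
    """Return the k entries of the target CWE type most similar to the target code."""
    candidates = [e for e in kb if e.cwe_id == cwe_id] or kb  # fall back to the full KB
    vecs = embedder.encode([target_code] + [e.code for e in candidates], convert_to_numpy=True)
    query, docs = vecs[0], vecs[1:]
    sims = docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query) + 1e-9)
    return [candidates[i] for i in np.argsort(-sims)[:k]]

def build_prompt(target_code: str, examples: list[KBEntry], cwe_id: str) -> str:
    """Assemble a few-shot prompt: retrieved examples first, then the query code."""
    shots = "\n\n".join(
        f"### Example ({e.cwe_id})\n{e.code}\n"
        f"Vulnerable line: {e.vulnerable_line}\nExplanation: {e.explanation}"
        for e in examples
    )
    return (
        f"You are an Android security auditor looking for {cwe_id} issues.\n\n"
        f"{shots}\n\n### Target\n{target_code}\n"
        "Identify the vulnerable line number and explain why it is vulnerable."
    )

    In this sketch, retrieval is restricted to knowledge-base entries that share the target CWE-ID, mirroring the idea of steering the LLM toward semantically related knowledge before it localizes and explains the vulnerable line.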
    Experiments conducted on a MobSF-generated Android vulnerability dataset show that
    our RAG-based method significantly outperforms the original model: the F1-score improves
    by 48%, and precision increases by 34%. Furthermore, explanations generated under the
    few-shot setting achieve a BERTScore above 0.85. These results demonstrate that RAG not only
    enhances line-level vulnerability localization but also effectively mitigates hallucinations,
    contributing to better interpretability and robustness in code understanding tasks.
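
    Explanation quality of this kind is typically scored with the public bert_score package, which implements BERTScore. The snippet below is a minimal sketch of that evaluation step; the candidate and reference strings are placeholders, not data from the thesis.

# Minimal sketch of the BERTScore evaluation step; strings are placeholders.
from bert_score import score

generated = ["The query concatenates untrusted input, enabling SQL injection."]
references = ["User input is concatenated into the SQL statement, allowing injection."]

# score() returns precision, recall, and F1 tensors, one entry per pair.
P, R, F1 = score(generated, references, lang="en", verbose=False)
print(f"BERTScore F1: {F1.mean().item():.3f}")  # values above 0.85 indicate high semantic overlap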
    Keywords: Retrieval-Augmented Generation, Large Language Models, Data Flow Graph,
    Android Vulnerability Detection, Few-shot Prompting, Explainability
    Appears in Collections: [資訊管理研究所] Master's and Doctoral Theses

    Files in This Item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     34


    All items in NCUIR are protected by the original copyright.

