SMSGuard: A Privacy-Preserving, Cost-Effective Hybrid Multi-Stage Framework for Scam SMS Detection


Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98186

Title:	SMSGuard: A Privacy-Preserving, Cost-Effective Hybrid Multi-Stage Framework for Scam SMS Detection
Author:	Lin, Hua-En (林華恩)
Contributor:	In-service Master Program, Department of Computer Science and Information Engineering
Keywords:	Smishing; Large Language Models; Privacy-Preserving; Cost-Effective; Hybrid Architecture; Malicious URL Detection; Prompt Injection
Date:	2025-07-22
Upload time:	2025-10-17 12:28:06 (UTC+8)
Publisher:	National Central University
Abstract:	The spread of mobile communication has made SMS phishing (smishing) a serious cybersecurity threat in Taiwan. Existing detection methods struggle with the trade-offs among detection performance, operational cost, user privacy, and system robustness, and lack a systematic solution. This thesis presents SMSGuard, a hybrid, multi-stage framework for scam SMS detection designed to address this multi-objective trade-off systematically. SMSGuard's design philosophy rests on the "asymmetric verifiability" of fraudulent claims: compared with legitimate claims, fraudulent ones are more easily falsified by objective facts. Accordingly, the framework adopts a privacy-first model of "local anonymization, cloud analysis," using a localized NER model to remove Personally Identifiable Information (PII) on-device. Its adaptive workflow first applies low-cost, high-certainty fact-checking mechanisms, such as local knowledge base matching and tracing the full URL redirection chain.
Only when fact-checking is inconclusive does the system invoke a tiered Large Language Model (LLM) strategy: it delegates tasks such as structured information extraction to a lightweight model (GPT-4o-mini) and reserves complex semantic analysis that requires deep reasoning for a high-end model (GPT-4o), with a prompt injection defense gate integrated to keep the system secure. Experimental evaluation shows that, compared with an optimized baseline relying purely on a high-end LLM, SMSGuard corrects the baseline's low precision (raising it from 60.95% to 96.80%), which stems from its lack of fact-checking capability, and reduces API operating costs by more than 70% while maintaining very high detection recall. The principal contribution of this research is not a single algorithmic innovation but a system engineering framework, proposed and empirically validated, that offers a tested methodology and a practical blueprint for resolving multiple conflicting objectives in a real-world adversarial domain.
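The staged workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the knowledge base `KNOWN_DOMAINS`, the regex-based `redact_pii` (a stand-in for the on-device NER model), the "suspicious" label returned by the injection gate, and the decision to run both LLM tiers in sequence are all assumptions made for illustration. The two LLM tiers and the injection gate are passed in as plain callables, so no real API is called, and URL redirection tracing is omitted.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlparse

# Hypothetical local knowledge base: claimed sender -> official domains.
KNOWN_DOMAINS = {"examplebank": {"examplebank.com.tw"}}

# Taiwan-style mobile numbers; a crude stand-in for an NER-based PII detector.
PHONE_RE = re.compile(r"(?<!\d)09\d{8}(?!\d)")
URL_RE = re.compile(r"https?://\S+")


@dataclass
class Verdict:
    label: str  # e.g. "scam", "legitimate", "suspicious"
    stage: str  # which stage produced the decision


def redact_pii(text: str) -> str:
    """Mask phone numbers locally before anything leaves the device."""
    return PHONE_RE.sub("[PHONE]", text)


def fact_check(text: str) -> Optional[str]:
    """Return "scam" if a claimed sender links outside its official
    domains; return None when the check is inconclusive."""
    lowered = text.lower()
    for brand, domains in KNOWN_DOMAINS.items():
        if brand not in lowered:
            continue
        for url in URL_RE.findall(text):
            host = urlparse(url).hostname or ""
            official = any(host == d or host.endswith("." + d) for d in domains)
            if not official:
                return "scam"  # the claim is falsified by a known fact
    return None


def classify(
    text: str,
    injection_gate: Callable[[str], bool],
    light_llm: Callable[[str], dict],
    strong_llm: Callable[[str, dict], str],
) -> Verdict:
    safe = redact_pii(text)            # stage 1: local anonymization
    checked = fact_check(safe)         # stage 2: cheap, deterministic check
    if checked is not None:
        return Verdict(checked, "fact_check")
    if injection_gate(safe):           # stage 3: block injection attempts
        return Verdict("suspicious", "injection_gate")  # assumed handling
    fields = light_llm(safe)           # stage 4a: structured extraction
    label = strong_llm(safe, fields)   # stage 4b: semantic reasoning
    return Verdict(label, "llm")
```

In this sketch, a message that claims to come from the hypothetical "ExampleBank" but links to an unrelated host is decided at the fact-checking stage, so neither LLM callable is invoked; that short-circuit is the mechanism the abstract credits for the cost reduction.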
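Appears in Collections:	[In-service Master Program, Department of Computer Science and Information Engineering] Theses and Dissertations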

Files in This Item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	51

All items in NCUIR are protected by original copyright.
