| Abstract: | The proliferation of mobile communication has made SMS phishing (smishing) a serious cybersecurity threat in Taiwan. Existing detection methods generally struggle with the trade-offs among detection performance, operational cost, user privacy, and system robustness, and they lack a systematic solution. This thesis presents SMSGuard, a hybrid, multi-stage framework for scam SMS detection designed to address this multi-objective trade-off systematically.
SMSGuard's design philosophy is grounded in an observation about the "asymmetric verifiability" of fraudulent claims: fraudulent claims are easier to falsify against objective facts than legitimate claims are. Accordingly, the framework adopts a privacy-first model of "local anonymization, cloud analysis," using a localized NER model to redact Personally Identifiable Information (PII) on-device. Its adaptive decision workflow prioritizes low-cost, high-certainty fact-checking mechanisms, such as local knowledge base lookup and full tracing of URL redirection chains. Only when fact-checking cannot reach a clear conclusion does the system engage a tiered Large Language Model (LLM) strategy: it delegates tasks such as structured information extraction to a lightweight model (GPT-4o-mini) and reserves complex semantic analysis that requires deep reasoning for a high-end model (GPT-4o), with a prompt injection defense gate integrated to keep the system secure.
Experimental evaluation shows that, compared with an optimized baseline relying purely on a high-end LLM, SMSGuard raises precision from 60.95% to 96.80%, correcting a weakness that stems from the baseline's inability to perform fact-checking, and cuts API operating costs by more than 70% while maintaining very high detection recall. The principal contribution of this research is not a single algorithmic innovation but a system engineering framework, proposed and empirically validated, that offers a methodology and a practical blueprint for resolving conflicting objectives in a real-world adversarial domain. |
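
The escalation logic described in the abstract (anonymize locally, try cheap fact checks first, call an LLM only when those are inconclusive) can be illustrated with a minimal sketch. This is not the thesis implementation: the names `run_cascade`, `knowledge_base_check`, `stub_llm_stage`, and `OFFICIAL_DOMAINS` are hypothetical, the regex-based `anonymize` is a toy stand-in for the on-device NER model, and the LLM stage is a stub that makes no API call.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List, Tuple


class Verdict(Enum):
    SCAM = "scam"
    LEGITIMATE = "legitimate"
    INCONCLUSIVE = "inconclusive"


@dataclass
class Result:
    verdict: Verdict
    stage: str  # name of the stage that produced the verdict


Stage = Callable[[str], Verdict]


def anonymize(message: str) -> str:
    # Toy stand-in (assumption) for the on-device NER model:
    # it only masks runs of eight or more digits.
    return re.sub(r"\d{8,}", "[NUMBER]", message)


# Hypothetical local knowledge base: brand name -> official domain.
OFFICIAL_DOMAINS = {"ExampleBank": "examplebank.example"}


def knowledge_base_check(message: str) -> Verdict:
    # Asymmetric check: a brand mention paired with a non-official link
    # falsifies the claim; a matching link proves nothing, so the
    # message is passed on as inconclusive.
    domains = re.findall(r"https?://([^/\s]+)", message)
    for brand, official in OFFICIAL_DOMAINS.items():
        if brand in message and any(d.lower() != official for d in domains):
            return Verdict.SCAM
    return Verdict.INCONCLUSIVE


LLM_CALLS: List[str] = []  # records what would have been sent to the cloud


def stub_llm_stage(message: str) -> Verdict:
    # Placeholder (assumption) for the tiered LLM analysis.
    LLM_CALLS.append(message)
    return Verdict.LEGITIMATE


def run_cascade(message: str, stages: List[Tuple[str, Stage]]) -> Result:
    # Stages run cheapest-first; the first conclusive verdict wins.
    redacted = anonymize(message)
    for name, stage in stages:
        verdict = stage(redacted)
        if verdict is not Verdict.INCONCLUSIVE:
            return Result(verdict, name)
    return Result(Verdict.INCONCLUSIVE, "unresolved")


STAGES = [("knowledge_base", knowledge_base_check), ("llm", stub_llm_stage)]

if __name__ == "__main__":
    msg = "ExampleBank: verify 0912345678 at http://evil.example/login"
    print(run_cascade(msg, STAGES))
    print(run_cascade("Your parcel arrives tomorrow.", STAGES))
```

In this sketch the first message is settled by the knowledge base stage and never reaches the stubbed LLM, which is the mechanism the abstract credits for the cost reduction. The second message has nothing to fact-check, so it falls through to the LLM stage, and only the redacted text is passed along.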