摘要 (Abstract):

The semiconductor industry is a core pillar of Taiwan's economy, where even a small error in process parameters or a flawed inference can lead to costly downstream failures. Although large language models (LLMs) are showing growing potential for application in semiconductor engineering, their outputs can still be unreliable or factually incorrect, making direct deployment in high-stakes settings difficult. Improving model reliability on domain-specific tasks is therefore a key research problem.

This thesis proposes a dual-model architecture designed to improve factual reliability. The system comprises two core components: (1) a Generator, based on Qwen2.5-14B, adapted through continued pretraining on a 200M-token semiconductor corpus and strengthened in reasoning via Chat Vector, so as to internalize domain knowledge; and (2) a Verifier, fine-tuned on roughly 8,000 domain QA pairs to intercept outputs that deviate from reference answers, adopting a recall-oriented safety-filtering strategy to reduce the risk of releasing incorrect information.

On a 1,000-question test set, the system outperformed an industry-standard RAG baseline. Through knowledge internalization, the Generator alone surpassed RAG in accuracy (82.0% vs. 75.8%); the full system prioritized safety, reducing the error rate to 9.5%, significantly below RAG's 24.2%. Although the Verifier's conservative filtering lowered coverage, it effectively intercepted unsafe outputs while maintaining low latency, meeting the strict factual-accuracy requirements of semiconductor engineering.

In summary, the contributions of this work are: (1) the first generator–verifier dual-model reliability framework designed specifically for the semiconductor domain; (2) domain-specific continued-pretraining and verification datasets; and (3) an evaluation methodology emphasizing end-to-end safety. Experimental results show that recall-first safety filtering effectively improves model reliability, offering a practical path toward trustworthy LLM deployment in high-stakes engineering domains such as semiconductors.

Abstract:

The semiconductor industry is a cornerstone of Taiwan's economy, where even small mistakes in process parameters or fabrication reasoning can cause costly downstream failures. Although large language models (LLMs) are increasingly capable, their outputs may still contain incorrect or unverifiable statements, limiting safe deployment in semiconductor engineering. Enhancing reliability in such high-stakes, domain-specific settings is therefore essential.

This thesis proposes a dual-model framework to improve factual reliability in semiconductor-related LLM outputs. The system comprises: (1) a Generator, based on Qwen2.5-14B, adapted through continued pretraining on a 200M-token semiconductor corpus and reasoning alignment via Chat Vector; and (2) a lightweight Verifier fine-tuned on ∼8,000 domain QA pairs to filter outputs that deviate from ground-truth references. The Verifier follows a recall-oriented design that prioritizes intercepting potentially incorrect answers.

On a 1,000-QA benchmark, the system outperformed an industry-standard RAG baseline. Specifically, the Generator-only model surpassed RAG in accuracy (82.0% vs. 75.8%) via knowledge internalization, while the full system prioritized safety, reducing the error rate to 9.5%—significantly lower than RAG's 24.2%. Although conservative filtering reduced coverage, this trade-off effectively minimized unsafe outputs while maintaining practical latency.

Overall, this work contributes: (1) a reliability-focused generator–verifier architecture for semiconductor engineering, (2) domain-specific datasets for continued pretraining and verification, and (3) an evaluation framework centered on safety metrics. The findings show that recall-oriented verification offers a viable path toward trustworthy LLM deployment in semiconductor workflows where factual correctness is critical.
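The generator–verifier gating described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names, the fixed placeholder score, and the 0.9 threshold are all hypothetical stand-ins for the Qwen2.5-14B Generator and the fine-tuned Verifier.

```python
from typing import Optional

def generate_answer(question: str) -> str:
    """Hypothetical stand-in for the Generator (Qwen2.5-14B after
    continued pretraining). Returns a candidate answer."""
    return "Placeholder answer for: " + question

def verifier_score(question: str, answer: str) -> float:
    """Hypothetical stand-in for the Verifier. A real Verifier would
    estimate the probability that the answer is correct; here a fixed
    illustrative value is returned."""
    return 0.42

def answer_with_verification(question: str,
                             threshold: float = 0.9) -> Optional[str]:
    """Recall-oriented gating: release an answer only when the Verifier's
    confidence clears a high threshold; otherwise abstain (return None)
    rather than risk emitting an unsafe output."""
    answer = generate_answer(question)
    if verifier_score(question, answer) >= threshold:
        return answer
    return None  # abstention trades coverage for safety
```

The high default threshold encodes the recall-first design: false rejections (lost coverage) are accepted in exchange for intercepting potentially incorrect answers before they reach the engineer.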