生醫文獻的檢索增強生成系統發展;Development of a retriever-augmented generation system for biomedical literature

Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/99238

Title:	生醫文獻的檢索增強生成系統發展;Development of a retriever-augmented generation system for biomedical literature
Authors:	曾郁蓉;Tseng, Yu Jung
Contributors:	Institute of Biomedical Engineering
Keywords:	Retriever-augmented generation system; Large language model; Prompt engineering; Embedding model; Reranker model; Keyword table index
Date:	2025-11-12
Issue Date:	2026-03-06 18:24:52 (UTC+8)
Publisher:	National Central University
Abstract:	This study aims to optimize the retrieval configuration and response quality of Retrieval-Augmented Generation (RAG) systems applied to biomedical literature. As large language models (LLMs) become increasingly prevalent in specialized domains, avoiding hallucinations, improving retrieval precision, and enhancing response quality have become key challenges. Using three representative biomedical papers as the data source, experiments were conducted with different chunk sizes, embedding models, top-k retrieval settings, reranker models, and a keyword table index. Performance was compared on hit rate, mean reciprocal rank (MRR), response correctness, faithfulness, and relevancy. The results indicate that the best retrieval configuration is a chunk size of 1024 tokens, the OpenAI text-embedding-3-large embedding model, a combination of vector index and keyword index, and the Jina Reranker (jina-reranker-v1-tiny-en, Top-k = 10, rerank Top-n = 5). For response generation, GPT-based models with temperature = 0 are recommended to improve faithfulness and relevancy. The study shows that optimizing both retrieval and generation can effectively improve the correctness of answers to biomedical literature questions. However, limitations such as latency and the need to update data sources regularly remain. Future work includes building more advanced question-answering systems and integrating the Model Context Protocol (MCP) to develop an intelligent biomedical literature search engine.
Appears in Collections:	[Institute of Biomedical Engineering] Electronic Thesis & Dissertation

Files in This Item:

File	Description	Size	Format
index.html		0Kb	HTML	115	View/Open
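
The abstract compares retrieval configurations by hit rate and mean reciprocal rank (MRR). As a minimal sketch of how these two metrics are commonly defined, and not the thesis's own evaluation code, the Python below computes both from ranked lists of retrieved chunk IDs. The function names, the chunk IDs, and the assumption of exactly one expected chunk per query are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def hit_rate(retrieved: Sequence[Sequence[str]], expected: Sequence[str]) -> float:
    """Fraction of queries whose expected chunk ID appears in the retrieved list."""
    if not expected:
        raise ValueError("at least one query is required")
    hits = sum(1 for ids, target in zip(retrieved, expected) if target in ids)
    return hits / len(expected)


def mean_reciprocal_rank(retrieved: Sequence[Sequence[str]], expected: Sequence[str]) -> float:
    """Mean of 1/rank of the expected chunk; a query with no hit contributes 0."""
    if not expected:
        raise ValueError("at least one query is required")
    total = 0.0
    for ids, target in zip(retrieved, expected):
        if target in ids:
            # Ranks are 1-based: the top result has rank 1.
            total += 1.0 / (list(ids).index(target) + 1)
    return total / len(expected)


# Hypothetical evaluation set: three queries, each with one expected chunk ID.
retrieved_ids = [
    ["c3", "c1", "c7"],  # expected chunk "c1" is at rank 2
    ["c2", "c5", "c9"],  # expected chunk "c2" is at rank 1
    ["c4", "c6", "c8"],  # expected chunk "c0" was not retrieved
]
expected_ids = ["c1", "c2", "c0"]

HIT_RATE = hit_rate(retrieved_ids, expected_ids)
MRR = mean_reciprocal_rank(retrieved_ids, expected_ids)
print(f"hit rate = {HIT_RATE:.3f}, MRR = {MRR:.3f}")
```

Under these definitions, hit rate only records whether the expected chunk was retrieved at all, while MRR also rewards placing it near the top of the list, which is why a reranking step can raise MRR without changing hit rate.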
