Implementation and Evaluation of a Multimodal Retrieval-Augmented Generation Question-Answering System for Geriatric Canine Care: Caring for Common Diseases of Elderly Dogs with Chinese and Western Veterinary Knowledge

Please use this permanent URL to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98266

Title:	Implementation and Evaluation of a Multimodal Retrieval-Augmented Generation Question-Answering System for Geriatric Canine Care (original title: 多模態檢索增強生成問答系統之實作與評估：以中西獸醫知識照護老年犬常見疾病為例)
Author:	Wang, Zi-Rong (王梓蓉)
Contributor:	Department of Information Management
Keywords:	Generative AI; Natural Language Processing; Retrieval-Augmented Generation; Domain-Specific Question Answering System; Multimodal; Large Language Models
Date:	2025-07-12
Uploaded:	2025-10-17 12:33:42 (UTC+8)
Publisher:	National Central University
Abstract:	As companion animals live longer, age-related diseases and the care they require have become a major challenge for owners. Existing information, however, consists mostly of scattered community discussions and specialist literature with little integration, so owners struggle to find practical, trustworthy care advice. The problem is most acute in less mainstream fields such as Traditional Chinese Veterinary Medicine (TCVM), where abstract terminology and a lack of accompanying images hinder understanding and application. To address this, the study proposes a question-answering system that combines a multimodal large language model with Retrieval-Augmented Generation (RAG), drawing on Chinese and Western veterinary sources together with owner experience to provide timely, reliable care advice. The system uses the SigLIP model for cross-modal embedding of text and images and the Chroma vector database for hybrid retrieval over Chinese- and English-language literature and community posts. Retrieved content is merged into a prompt and passed to the Llama 3.2 language model to generate a response. To improve performance and retrieval accuracy, the study also applies multi-stage optimizations: chunk-size tuning, a semantic alias mapping table, and hybrid retrieval. The experiments cover multiple-choice, true/false, and open-ended questions on Chinese herbal medicines, acupuncture, behavioral care, and common owner questions, and evaluate answer accuracy, content consistency, retrieval quality, and image relevance. Results show that the RAG system significantly outperforms the baseline: overall accuracy rises from 57% to 82%, the open-ended answer score rises from 2.52 to 3.50 on a 5-point scale, hallucinations drop substantially, and image retrieval shows good semantic linkage and practical value, particularly for identifying Chinese herbal medicines and illustrating disease-related behaviors. The study demonstrates the practical potential of a multimodal RAG architecture for geriatric canine care and TCVM, and proposes a concrete, extensible process for building domain-specific question-answering systems, laying groundwork for future applications of generative AI in clinical health knowledge integration and owner education.
Appears in collections:	[Graduate Institute of Information Management] Theses and Dissertations
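
The retrieval flow summarized in the abstract (alias mapping, hybrid retrieval, then prompt assembly) can be illustrated with a minimal sketch. This is not the thesis implementation: it uses only the Python standard library, a toy bag-of-words vector stands in for SigLIP embeddings, an in-memory list stands in for Chroma, and no language model is called. The alias table, the example passages, the weight `alpha`, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical alias table: maps variant terms to one canonical term.
ALIASES = {"huang qi": "astragalus", "joint pain": "arthritis"}

def normalize(text):
    """Lowercase the text and replace known aliases with canonical terms."""
    text = text.lower()
    for alias, canonical in ALIASES.items():
        text = text.replace(alias, canonical)
    return text

def tokens(text):
    """Whitespace tokens of the alias-normalized text."""
    return normalize(text).split()

def keyword_score(query, doc):
    """Fraction of query tokens that also appear in the document."""
    q = tokens(query)
    d = set(tokens(doc))
    return sum(1 for t in q if t in d) / len(q) if q else 0.0

def embed(text, vocab):
    """Toy bag-of-words vector; a stand-in for a learned embedding."""
    counts = Counter(tokens(text))
    return [counts[w] for w in vocab]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def hybrid_retrieve(query, docs, k=2, alpha=0.5):
    """Rank docs by a weighted sum of vector and keyword scores."""
    vocab = sorted({t for d in docs for t in tokens(d)} | set(tokens(query)))
    qv = embed(query, vocab)
    scored = []
    for d in docs:
        score = alpha * cosine(qv, embed(d, vocab))
        score += (1 - alpha) * keyword_score(query, d)
        scored.append((score, d))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]

def build_prompt(query, passages):
    """Merge retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical example passages, not taken from the thesis corpus.
DOCS = [
    "astragalus is a herb used in traditional chinese veterinary medicine",
    "arthritis is common in senior dogs and limits mobility",
    "daily walks help senior dogs stay active",
]

if __name__ == "__main__":
    question = "what helps joint pain in senior dogs"
    print(build_prompt(question, hybrid_retrieve(question, DOCS, k=2)))
```

In this sketch the alias step lets a query that says "joint pain" match a passage that says "arthritis", and the weighted sum is one simple way to combine a vector score with a keyword score; the abstract does not state which fusion method the thesis uses.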

Files in this item:

File	Description	Size	Format	Views
index.html		0Kb	HTML	51

All items in NCUIR are protected by their original copyright.
