Traditional visualization-oriented natural language interfaces (V-NLIs) rely primarily on drop-down menus to generate charts, which is less intuitive than natural language input. As a result, recent research has increasingly incorporated large language models (LLMs) to generate visualizations from natural language commands. While LLMs have demonstrated strong performance on visualization tasks, their growing GPU and hardware resource demands make local deployment difficult. This has led to the adoption of lightweight small language models (SLMs) as an alternative.
Although SLMs offer the advantage of lightweight deployment, they still face several challenges in generating visualizations, such as producing chart parameters that do not meet user expectations or yielding inconsistent outputs for the same input. To address these issues, we propose an agent-centered framework. The agent receives the tool name and parameters generated separately by the SLM, assembles them into a complete tool call (also known as a function call), and executes the corresponding visualization tool operation. These operations include frontend chart updates, backend data queries, and the handling of missing input information or non-visualization requests, thereby enhancing the flexibility and stability of the overall interaction process.
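The assembly step the agent performs can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the tool names (`update_chart`, `query_data`), their required parameters, and the status labels are all hypothetical placeholders standing in for whatever the real system defines.

```python
import json

# Hypothetical tool registry; names and required parameters are
# illustrative, not the actual API of the system described above.
TOOLS = {"update_chart", "query_data"}
REQUIRED_PARAMS = {
    "update_chart": {"chart_type", "x", "y"},
    "query_data": {"table", "columns"},
}

def assemble_tool_call(tool_name: str, raw_params: str) -> dict:
    """Combine the SLM's separately generated tool name and JSON
    parameter string into one complete tool call, flagging gaps."""
    if tool_name not in TOOLS:
        # Unknown or non-visualization request: hand back to the user.
        return {"status": "reject", "reason": f"unsupported tool: {tool_name}"}
    try:
        params = json.loads(raw_params)
    except json.JSONDecodeError:
        return {"status": "error", "reason": "unparsable parameters"}
    missing = REQUIRED_PARAMS[tool_name] - params.keys()
    if missing:
        # Incomplete input: ask the user to supply the missing fields.
        return {"status": "ask_user", "missing": sorted(missing)}
    return {"status": "ok", "call": {"name": tool_name, "arguments": params}}
```

For example, `assemble_tool_call("update_chart", '{"x": "month"}')` would return an `ask_user` status listing `chart_type` and `y` as missing, which is how the flow tolerates incomplete user input instead of emitting a malformed call.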
In our experiments, we used the publicly available ChartGPT dataset for evaluation and adopted two metrics commonly used in machine-translation evaluation, ROUGE-L and BLEU, to assess the accuracy of parameter generation. The results show that the fine-tuned Phi-3.5 model significantly improves parameter generation accuracy: it outperforms ChartGPT, an existing SLM-based visualization system, by approximately 1\% in ROUGE-L, with comparable performance in BLEU. Furthermore, compared with three general-purpose language models not fine-tuned for visualization tasks (xLAM, LLaMA3.2, and Qwen2.5), our model achieves an average improvement of approximately 78\% in ROUGE-L and 48\% in BLEU. In addition, tests on a private 6G dataset demonstrate that the system can still generate appropriate visualization charts even when user input is incomplete, showcasing strong robustness.
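For reference, ROUGE-L scores a candidate against a reference by the length of their longest common subsequence (LCS); the sketch below computes the standard LCS-based F1 variant on whitespace tokens. This is a textbook formulation for illustration only; the thesis's actual evaluation may use a different tokenizer or library, and BLEU (modified n-gram precision with a brevity penalty) is omitted here for brevity.

```python
def lcs_len(a: list, b: list) -> int:
    # Classic dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ta in enumerate(a, 1):
        for j, tb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ta == tb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F1 over whitespace tokens: harmonic mean of
    LCS-based precision and recall."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(c), lcs / len(r)
    return 2 * precision * recall / (precision + recall)
```

An exact parameter match scores 1.0, while a candidate missing part of the reference (e.g. `"bar chart"` vs. `"bar chart of sales"`) is penalized through recall, which is why small gains in ROUGE-L reflect closer agreement with the ground-truth chart parameters.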