NCU Institutional Repository (theses and dissertations, past exam papers, journal articles, research projects): Item 987654321/98216


    Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98216


    Title: Browser Agent 效能瓶頸分析與改進挑戰;Browser Agent Performance Bottleneck Analysis and Improvement Challenges
    Authors: 李德泰;LI, DE-TAI
    Contributors: Executive Master of Computer Science and Information Engineering
    Keywords: AI Agent;Browser Agent;WebVoyager;Browser-use;RAG
    Date: 2025-08-21
    Issue Date: 2025-10-17 12:30:19 (UTC+8)
    Publisher: National Central University
    Abstract: In recent years, Large Language Models (LLMs) have improved significantly in language understanding, reasoning, and task execution. As web interfaces increasingly serve as unified access points to heterogeneous information systems, LLM-based browser agents have become an important research direction for building general-purpose intelligent agents.
    This study focuses on WebVoyager, an open-source project released in 2024, and investigates its performance in executing tasks on real-world websites. Preliminary experiments show that the original model struggles with sites that have highly dynamic content, complex structures, or visually oriented layouts, and performs poorly in both comprehension and execution efficiency on them.
    To improve the agent's adaptability, this research proposes improvements to the core capabilities and overall architecture of WebVoyager, and conducts a comparative analysis with another mainstream open-source system, Browser-use. The comparison covers perception, reasoning and planning, and execution methods. The WebVoyager dataset serves as the evaluation benchmark for empirical testing on the relevant tasks.
    Furthermore, this study integrates a Retrieval-Augmented Generation (RAG) mechanism. Using lightweight knowledge texts constructed in this study, the agent can acquire basic knowledge of a website's structure and functionality before executing a task, which improves its comprehension and operational accuracy. Experimental results show that the RAG-enhanced WebVoyager achieves an 8.7% improvement in task success rate and outperforms Browser-use in most test scenarios. These findings demonstrate the practical benefit of external knowledge integration for LLM decision quality and for the generalization ability of browser agents.
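The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the knowledge texts, the site names, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration. A real RAG pipeline would more likely use embedding-based retrieval.

```python
import re

# Hypothetical lightweight knowledge texts, one per site.
# The documents actually built in the thesis are not shown in this record.
KNOWLEDGE_TEXTS = {
    "example-shop": "The search box is at the top of the page. "
                    "Filters for price and brand are in the left sidebar.",
    "example-maps": "Enter a destination in the search field, then press "
                    "the Directions button to plan a route.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, document):
    """Count query tokens that also occur in the document (lexical overlap)."""
    doc_tokens = set(tokenize(document))
    return sum(1 for token in tokenize(query) if token in doc_tokens)


def retrieve(task, texts, top_k=1):
    """Return up to top_k (name, text) pairs with a non-zero overlap score."""
    ranked = sorted(texts.items(),
                    key=lambda item: score(task, item[1]),
                    reverse=True)
    return [(name, text) for name, text in ranked[:top_k]
            if score(task, text) > 0]


def build_prompt(task, texts):
    """Prepend any retrieved site knowledge to the task description."""
    hits = retrieve(task, texts)
    if not hits:
        return f"Task: {task}"
    background = "\n".join(f"[{name}] {text}" for name, text in hits)
    return f"Site knowledge:\n{background}\n\nTask: {task}"


if __name__ == "__main__":
    print(build_prompt("Plan a route to the destination airport",
                       KNOWLEDGE_TEXTS))
```

The point of the sketch is only the ordering: knowledge is looked up and placed in the prompt before the agent takes its first browser action, which is the role the abstract assigns to the RAG mechanism.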
    Appears in Collections:[Executive Master of Computer Science and Information Engineering] Electronic Thesis & Dissertation



    All items in NCUIR are protected by copyright, with all rights reserved.
