NCU Institutional Repository: Item 987654321/98028


    Please use this identifier to cite or link to this item: https://ir.lib.ncu.edu.tw/handle/987654321/98028


    Title: Polytope-Based Correlated Q-learning with Iterative Strategy Updates
    Authors: Kao, Yung-Han (高永瀚)
    Contributors: Graduate Institute of Statistics
    Keywords: Correlated equilibrium; Multi-agent Reinforcement Learning; Nash equilibrium; Particle Swarm Optimization
    Date: 2025-07-10
    Issue Date: 2025-10-17 12:16:02 (UTC+8)
    Publisher: National Central University
    Abstract: Multi-Agent Reinforcement Learning (MARL) has been widely applied to decision-making problems in dynamic and uncertain environments. However, due to interactions among multiple agents, the learning process is more complex than in single-agent settings, particularly in solving equilibrium strategies. This study proposes an innovative learning method that incorporates geometric structure information and updates strategies using both Particle Swarm Optimization (PSO) and the Simplex method. The proposed approach aims to improve the computational efficiency and stability of equilibrium strategy learning in MARL settings. We first analyze the structure of the correlated-equilibrium polytope in stochastic games and offer geometric insight through vertex identification. We then introduce a new Q-update mechanism to improve the learning stability of MARL. In experiments, we evaluate the method through three two-player game scenarios with varying state and action spaces, showing that the proposed approach remains robust under increasing complexity. Additionally, a realistic simulation is included to examine the method's practical applicability and inform future improvements.
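The correlated-equilibrium polytope mentioned in the abstract arises because the incentive constraints of a correlated equilibrium are linear in the joint-action distribution. As a minimal, self-contained illustration (not the thesis's algorithm, which couples PSO with the Simplex method inside Q-learning), the sketch below writes down those constraints for a hypothetical 2x2 "Chicken" game and finds the welfare-maximizing point of the polytope with `scipy.optimize.linprog`:

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical "Chicken" payoffs for illustration only.
# U[i, a1, a2] = payoff to player i when players pick actions a1, a2.
U = np.array([
    [[0.0, 7.0], [2.0, 6.0]],   # player 0 (row)
    [[0.0, 2.0], [7.0, 6.0]],   # player 1 (column)
])
n1, n2 = U.shape[1], U.shape[2]

# Decision variable: p[a1, a2], flattened. Each incentive constraint says
# that, given a recommendation, deviating must not improve expected payoff;
# all constraints are linear, so the feasible set is a polytope.
A_ub, b_ub = [], []
for a in range(n1):               # player 0: recommended a, deviation ad
    for ad in range(n1):
        if ad == a:
            continue
        row = np.zeros((n1, n2))
        row[a, :] = U[0, ad, :] - U[0, a, :]   # gain from deviating <= 0
        A_ub.append(row.ravel()); b_ub.append(0.0)
for b in range(n2):               # player 1: recommended b, deviation bd
    for bd in range(n2):
        if bd == b:
            continue
        row = np.zeros((n1, n2))
        row[:, b] = U[1, :, bd] - U[1, :, b]
        A_ub.append(row.ravel()); b_ub.append(0.0)

# Probabilities sum to 1; maximize total welfare (minimize its negative).
A_eq, b_eq = [np.ones(n1 * n2)], [1.0]
c = -(U[0] + U[1]).ravel()

res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=np.array(A_eq), b_eq=b_eq,
              bounds=[(0.0, 1.0)] * (n1 * n2))
p = res.x.reshape(n1, n2)
print("welfare-maximizing correlated equilibrium:\n", p.round(3))
```

The same feasibility structure underlies vertex identification: the extreme points of this polytope are exactly the basic feasible solutions a Simplex-type method enumerates.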
    Appears in Collections:[Graduate Institute of Statistics] Electronic Thesis & Dissertation

    Files in This Item:

    File: index.html (0 Kb, HTML) View/Open


    All items in NCUIR are protected by copyright, with all rights reserved.

