NCU Institutional Repository: Item 987654321/93172


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/93172


    Title: RS-NAS: A Policy and Value-Based Reinforcement Learning with Reward Shaping on Neural Architecture Search
    Author: ZHANG, MU-PING (張慕平)
    Contributor: Department of Computer Science and Information Engineering
    Keywords: Reinforcement Learning; Sparse Reward; Neural Architecture Search
    Date: 2023-07-24
    Upload time: 2024-09-19 16:45:43 (UTC+8)
    Publisher: National Central University
    Abstract: Over the past decade, with the growth of hardware performance, deep learning has become a popular research area, and Convolutional Neural Networks (CNNs) have achieved notable success in computer vision. Researchers have observed that more complex network models often achieve higher accuracy; however, the heavy resource consumption of complex models greatly limits the use of CNNs on resource-constrained end devices. Consequently, much recent research has focused on Neural Architecture Search (NAS), which automatically designs network models according to different objectives. By optimization approach, NAS methods can be divided into three categories: Reinforcement Learning (RL), Evolutionary Algorithms (EA), and Differentiable Optimization. This thesis proposes a novel reward shaping mechanism for RL-based NAS, called RS-NAS, which addresses the sparse-reward challenge of the RL search process: the agent receives no reward during the search and obtains a reward only from the model architecture produced at the final step, which prevents it from evaluating the quality of each intermediate step and thus reduces overall search efficiency. RS-NAS is implemented with two RL algorithms: the policy-based Proximal Policy Optimization (PPO) and the value-based Deep Q Network (DQN). To reduce search cost and control confounding factors so that different methods are compared under the same standard, we adopt NATS as the search space. Compared with the original RL methods in NATS, experimental results verify that RS-NAS achieves better search performance and stability.
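    The sparse-reward problem the abstract describes can be illustrated with a short, self-contained sketch. Note that this is a generic illustration of step-wise reward shaping in an architecture-building loop, not the thesis's actual RS-NAS mechanism: the operation set, proxy_score, and final_accuracy below are hypothetical stand-ins, and the real work uses PPO/DQN agents over the NATS search space.

import random

NUM_LAYERS = 6                      # an architecture = one operation per layer
OPS = ["conv3x3", "conv1x1", "skip", "pool"]

def proxy_score(arch):
    # Hypothetical cheap quality estimate of a (partial) architecture.
    return sum(1.0 for op in arch if op != "pool") / NUM_LAYERS

def final_accuracy(arch):
    # Hypothetical expensive terminal evaluation (train and validate).
    return proxy_score(arch) + random.uniform(-0.05, 0.05)

def episode(shaped):
    """Build an architecture one layer at a time; return per-step rewards."""
    arch, rewards, prev = [], [], 0.0
    for _ in range(NUM_LAYERS):
        arch.append(random.choice(OPS))   # agent's action (random placeholder)
        if shaped:
            # Dense signal: reward each step by its proxy-score improvement.
            cur = proxy_score(arch)
            rewards.append(cur - prev)
            prev = cur
        else:
            rewards.append(0.0)           # sparse: no feedback until the end
    rewards[-1] += final_accuracy(arch)   # terminal reward in both settings
    return rewards

print("sparse:", [round(r, 2) for r in episode(shaped=False)])
print("shaped:", [round(r, 2) for r in episode(shaped=True)])

    Running the sketch prints an all-zero reward sequence (except the final entry) for the sparse case versus per-step increments for the shaped case; that per-step signal, which lets the agent judge each intermediate decision, is the contrast a reward shaping mechanism such as RS-NAS exploits.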
    Appears in Collections: [Graduate Institute of Computer Science and Information Engineering] Theses & Dissertations

    Files in this item:

    File        Description    Size    Format    Views
    index.html                 0Kb     HTML      13    View/Open


    All items in NCUIR are protected by the original copyright.
