NCU Institutional Repository — theses, past exams, journal articles, and research projects: Item 987654321/92623


    Please use this identifier to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/92623


    Title: On the Design of RL Algorithms and Termination Strategies for Focused Crawling — A Case Study of Event Source Page Discovery
    Authors: Yeh, Ting (葉庭)
    Contributors: Department of Computer Science and Information Engineering
    Keywords: Reinforcement Learning; Web Mining; Multitask Learning
    Date: 2023-07-26
    Issue Date: 2023-10-04 16:06:54 (UTC+8)
    Publisher: National Central University
    Abstract: The purpose of this study is to develop an intelligent web crawler system that collects information from event source web pages. Our objective is to save users' time browsing for activities and to provide structured activity information, meeting the modern demand for discovering distinctive local activities. In our previous work, we used a reinforcement-learning policy gradient method for event source page mining. However, we identified two issues with the two-stage training process: first, the first stage only allowed training with a fixed step size; second, the fine-tuning in the second stage did not yield significant improvement. To address these problems, we aim to control episode termination with a variable step size already during the initial training phase. This provides more flexible training that adapts to different scenarios and environmental changes, improving the model's performance and results. To achieve this goal, we designed an asset-control stopping strategy and adopted different reinforcement learning algorithms. We also defined the original two-stage training framework more rigorously, expanding the training strategies into four different methods.
By comparing against our previous work, we aimed to determine whether the newly designed stopping strategy could reduce click cost, and to identify, after applying the various reinforcement learning algorithms, the method best suited to our task. Finally, we also aimed to select the most suitable training strategy. The results show that our newly designed stopping strategy achieved a lower click cost and higher performance with the DQN algorithm: the click cost decreased from 1.4% to 1.227%, and performance improved from 72% to 78.2%. After comparing the different training strategies, we conclude that using labeled data together with a reward function that incorporates the correct answers is more suitable for our task.
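The abstract's central idea — replacing a fixed per-episode step budget with an asset-controlled stopping strategy — can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's actual implementation: the environment, policy, and all parameter names (`initial_asset`, `step_cost`, `reward_bonus`, `p_found`) are hypothetical stand-ins. The point it demonstrates is that every click spends budget while finding a target page replenishes it, so productive trajectories run longer and episode length varies instead of being fixed.

```python
import random


class ToyCrawlEnv:
    """Hypothetical stand-in for the crawling environment: each step
    'clicks' a link, and with probability p_found the resulting page is
    an event source page."""

    def __init__(self, p_found=0.2, seed=0):
        self.p_found = p_found
        self.rng = random.Random(seed)

    def reset(self):
        return 0  # trivial initial state

    def step(self, action):
        found = self.rng.random() < self.p_found
        reward = 1.0 if found else 0.0
        return action, reward, found


def run_episode(env, policy, initial_asset=5.0, step_cost=1.0, reward_bonus=3.0):
    """Run one crawl episode whose length is controlled by an 'asset'
    budget rather than a fixed step count: each click costs step_cost,
    and finding a target page adds reward_bonus, extending the episode."""
    asset = initial_asset
    state = env.reset()
    trajectory = []
    while asset > 0:
        action = policy(state)              # choose the next link to click
        state, reward, found = env.step(action)
        trajectory.append((state, action, reward))
        asset -= step_cost                  # every click spends budget
        if found:
            asset += reward_bonus           # a hit replenishes the budget
    return trajectory


if __name__ == "__main__":
    env = ToyCrawlEnv()
    episode = run_episode(env, policy=lambda s: s + 1)
    print(len(episode))  # at least initial_asset / step_cost steps
```

With these illustrative numbers the expected asset change per step is negative (−1 + 0.2 × 3 = −0.4), so every episode still terminates; tuning that balance is precisely what lets the strategy trade click cost against coverage.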
    Appears in Collections:[Graduate Institute of Computer Science and Information Engineering] Electronic Thesis & Dissertation


    All items in NCUIR are protected by copyright, with all rights reserved.
