探究強化學習與停止策略於活動來源頁面探勘之設計

DC field	Value	Language
DC.contributor	資訊工程學系	zh_TW
DC.creator	葉庭	zh_TW
DC.creator	Ting Yeh	en_US
dc.date.accessioned	2023-07-26T07:39:07Z
dc.date.available	2023-07-26T07:39:07Z
dc.date.issued	2023
dc.identifier.uri	http://ir.lib.ncu.edu.tw:88/thesis/view_etd.asp?URN=110522070
dc.contributor.department	資訊工程學系	zh_TW
DC.description	國立中央大學	zh_TW
DC.description	National Central University	en_US
dc.description.abstract	本研究旨在開發一個智能的爬蟲系統，以收集活動來源網頁的資訊。我們的目標是希望能節省使用者在瀏覽器尋找活動的時間，並提供結構化的活動資訊，以滿足現代人尋找當地特色活動的需求。在我們先前的工作中，我們使用了基於強化學習的策略梯度方法進行活動源網頁的挖掘。然而，我們發現兩階段訓練存在兩個問題：第一階段僅能使用固定步伐進行訓練，第二階段的微調訓練效能沒有顯著提升。為了改進這些問題，我們希望在初始訓練階段就能通過可變動步伐的方式控制回合的停止。這樣能夠提供更靈活的訓練，以適應不同的場景和環境變化，並改善模型的性能和結果。為了實現這一目標，我們設計了資產控制的停止策略，並且採用不同的強化式學習演算法。同時，我們將原本的兩階段訓練框架定義得更加嚴謹，將訓練策略擴展為四種不同的方法。通過與先前的工作進行比較，我們想要確定新設計的停止策略是否能夠降低點擊成本，並且確定在應用不同的強化學習算法後，選擇最適合我們任務的方法。最終，我們也想選擇最適合我們任務的訓練策略。結果顯示，我們新設計的停止策略在DQN算法中實現了更低的點擊成本和更高的性能。點擊成本從1.4%降低到1.2%，性能從72%提升到78.2%。比較不同的訓練策略後，我們得出結論，通過使用標記數據和給予正確答案的獎勵函數對於我們的任務更加適合。	zh_TW
dc.description.abstract	The purpose of this study is to develop an intelligent web crawler system that collects information from activity source web pages. Our objective is to save users' time when searching for activities in a browser and to provide structured activity information that meets the modern demand for finding distinctive local activities. In our previous work, we used a reinforcement learning-based policy gradient method for activity source web page mining. However, we identified two issues with the two-stage training process: first, the first stage only allowed training with a fixed number of steps, and second, the fine-tuning in the second stage did not yield a significant improvement. To address these problems, we aim to control episode termination with a variable number of steps from the initial training phase onward. This approach provides more flexibility in training to adapt to different scenarios and environmental changes, thereby improving the performance and outcomes of the model. To achieve this goal, we designed an asset control stopping strategy and employed different reinforcement learning algorithms. Moreover, we defined the original two-stage training framework more rigorously, expanding the training strategies to four different methods. By comparing the results with our previous work, we aimed to determine whether the newly designed stopping strategy could reduce click costs and to identify the most suitable of the reinforcement learning algorithms applied. Ultimately, we also aimed to select the training strategy that best suited our task. The results showed that our newly designed stopping strategy achieved lower click costs and higher performance with the DQN algorithm. The click cost decreased from 1.4% to 1.227%, and the performance improved from 72% to 78.2%. After comparing different training strategies, we concluded that using labeled data and a reward function that incorporates correct answers is more suitable for our task.	en_US
DC.subject	強化式學習	zh_TW
DC.subject	網頁探勘	zh_TW
DC.subject	多任務學習	zh_TW
DC.subject	Reinforcement learning	en_US
DC.subject	web mining	en_US
DC.subject	Multitask Learning	en_US
DC.title	探究強化學習與停止策略於活動來源頁面探勘之設計	zh_TW
dc.language.iso	zh-TW	zh-TW
DC.title	On the Design of RL Algorithms and Termination Strategies for Focused Crawling - A case study for Event Source Page Discovery	en_US
DC.type	博碩士論文	zh_TW
DC.type	thesis	en_US
DC.publisher	National Central University	en_US

Thesis 110522070: full metadata record