dc.description.abstract | The purpose of this study is to develop an intelligent web crawler system that collects information from activity source web pages. Our objective is to save users' time when browsing for activities and to provide structured activity information that meets the modern demand for finding distinctive local activities. In our previous work, we used a reinforcement learning-based policy gradient method for activity source web page mining. However, we identified two issues with its two-stage training process: first, the first stage only allowed training with a fixed step size; second, fine-tuning in the second stage did not yield significant improvement. To address these problems, we control episode termination using a variable step size during the initial training phase. This approach is intended to give training more flexibility to adapt to different scenarios and environmental changes, thereby improving the performance of the model. To achieve this goal, we introduced an asset control stopping strategy and employed different reinforcement learning algorithms. Moreover, we redefined the original two-stage training framework, expanding the training strategies to four different methods. By comparing the results with our previous work, we aimed to determine whether the newly designed stopping strategy could reduce click costs, to identify the most suitable reinforcement learning algorithm, and ultimately to select the training strategy that best suits our task. The results showed that our newly designed stopping strategy achieved lower click costs and higher performance with the DQN algorithm: the click cost decreased from 1.4% to 1.227%, and the performance improved from 72% to 78.2%. After comparing different training strategies, we concluded that using labeled data and a reward function that incorporates correct answers is more suitable for our task. | en_US |