A report from Zona Research indicates that what online shoppers tolerate least is waiting for web pages to download, and that users who wait more than eight seconds lose patience and abandon the purchase. Using proxy servers to reduce latency, and thereby retain customer loyalty, is therefore an important research topic in electronic commerce. In this study, we analyze the contents of a proxy server's access log, in particular users' browsing sequences (clickstreams), in order to improve the web caching algorithm. Our contributions are as follows. First, we remove noise from the log: a sliding-window entropy filter computes the entropy of the accesses in each window to decide whether a user's behavior is habitual (domain mode) or exploratory (exploratory mode), and only the more reliable domain-mode data are counted. Second, improving on the domain-top approach, we select the most popular domains from the domain-mode data alone and, because sites differ in popularity, keep a hot-page list of a different length for each selected domain, using a more precise statistical model to determine the number of documents per domain. Third, because the content varies over time, we recompute and update the ranked list twice a day, once in the morning and once in the afternoon, to keep the information fresh and further improve the hit rate. Finally, we propose an on-demand prefetching approach that avoids unnecessary prefetching: the proxy server fetches a page only when a user requests it, but assigns hot pages a larger Time-To-Live (TTL) value so that they are not evicted from the cache too quickly. Experiments show that the proposed algorithm achieves a higher cache hit rate than the original algorithm without placing additional load on the network, effectively reducing user waiting time and improving overall service quality.
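The sliding-window entropy step can be illustrated with a minimal sketch. It is not the implementation used in this work: the window size, the entropy threshold, the use of Shannon entropy over requested domains, and the names `window_entropy` and `classify_windows` are all assumptions made for illustration. The idea shown is only that a window dominated by one domain has low entropy (treated here as domain mode), while a window spread over many domains has high entropy (treated as exploratory mode).

```python
import math
from collections import Counter


def window_entropy(domains):
    """Shannon entropy, in bits, of the domain distribution in one window."""
    counts = Counter(domains)
    total = len(domains)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def classify_windows(clickstream, window_size=4, threshold=1.0):
    """Label every sliding window of one user's clickstream.

    window_size and threshold are hypothetical values chosen for this
    example; the source does not state the values it uses.
    """
    labels = []
    for start in range(len(clickstream) - window_size + 1):
        window = clickstream[start:start + window_size]
        entropy = window_entropy(window)
        # Low entropy: requests concentrate on few domains (habitual use).
        labels.append("domain" if entropy <= threshold else "exploratory")
    return labels


# Hypothetical clickstream: four visits to one site, then four different sites.
clicks = ["a.com", "a.com", "a.com", "a.com",
          "b.com", "c.com", "d.com", "e.com"]
labels = classify_windows(clicks)
print(labels)
```

Under these assumptions, only the requests falling in windows labeled `domain` would be passed on to the popularity statistics; the exploratory windows are treated as noise and skipped.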