Extended Clickstream: An Analysis of the Missing User Behaviors in the Clickstream

Name

陳廷睿 (Ting-Rui Chen)

Department

Department of Computer Science and Information Engineering

Thesis Title

Extended Clickstream: An Analysis of the Missing User Behaviors in the Clickstream
(擴展點擊流：分析點擊流中缺少的使用者行為)

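The author has authorized immediate open access to this electronic thesis.
The open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.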

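Abstract (Chinese, translated)

It is commonly assumed that a user's clickstream represents the user's online browsing behavior. However, we find that a clickstream only roughly captures part of that behavior. Browsing actions that move between interface elements, such as tab switching and window switching, involve no interaction with the server and therefore do not appear in the clickstream or the log, even though the user is still browsing. This thesis collects these behaviors and names them the "extended clickstream". We build a complete system service and recruit participants to collect clickstreams and extended clickstreams simultaneously, then compare and analyze the two and build deep learning models on them. Using a deep learning model with GRU components, we perform multi-objective prediction on these two kinds of time-series data: "what type of website will the user visit next" and "how long will it be until the next click". The experimental results show that combining the clickstream and the extended clickstream improves prediction performance. In addition, we find that, because of how some websites operate, the clickstream also counts actions that the user did not intend to perform. Moreover, combining the clickstream and the extended clickstream makes it possible to identify a single user who browses from different devices.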
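A minimal sketch of how the two streams could be combined into one time-ordered sequence, assuming each stream is already sorted by timestamp. The `Event` fields, the event kinds, and the example URLs are hypothetical and are not taken from the thesis; the thesis's actual collection system and preprocessing are not reproduced here.

```python
from dataclasses import dataclass
from heapq import merge
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    """One browsing event (hypothetical schema, for illustration only)."""
    timestamp: float  # seconds since the start of the session (assumed unit)
    source: str       # "CS" = clickstream, "ECS" = extended clickstream
    kind: str         # e.g. "page_load", "tab_switch", "window_switch"
    url: str


def merge_streams(cs: Iterable[Event], ecs: Iterable[Event]) -> List[Event]:
    """Interleave two time-sorted streams into one time-sorted sequence."""
    return list(merge(cs, ecs, key=lambda e: e.timestamp))


# Server-visible events: these would appear in an ordinary clickstream or log.
clickstream = [
    Event(0.0, "CS", "page_load", "https://news.example/a"),
    Event(40.0, "CS", "page_load", "https://shop.example/b"),
]
# Client-side events with no server interaction: missing from the clickstream.
extended = [
    Event(12.5, "ECS", "tab_switch", "https://mail.example/inbox"),
    Event(30.0, "ECS", "tab_switch", "https://news.example/a"),
]

merged = merge_streams(clickstream, extended)
for event in merged:
    print(f"{event.timestamp:6.1f}  {event.source:3}  {event.kind:10}  {event.url}")
```

In this toy session the clickstream alone suggests the user stayed on the news page for 40 seconds, while the merged sequence shows a switch to another tab and back in between.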

Abstract (English)

Clickstreams are commonly used to represent the behavior of online users. However, we found that a clickstream captures only part of a user's browsing behavior. For instance, it does not include tab switching or browser window switching. We collect these behaviors and name them the "extended clickstream". This thesis builds a service that captures both the clickstream and the extended clickstream, and analyzes the differences between the two. We use a multi-task learning model with GRU components to perform multi-objective prediction on the time series of clickstreams and extended clickstreams: "what kind of website the user will visit next" and "how long the interval until the next click will be". Our experimental results show that combining the clickstream and the extended clickstream improves prediction performance. In addition, this thesis finds that the clickstream records unintended clicks because of the way certain websites operate. Moreover, by combining the clickstream and the extended clickstream, we can identify a single user who browses from several devices.
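The two prediction targets named in the abstract can be illustrated with a small sketch that turns a time-ordered session into training samples. The bucket edges (5 s, 30 s, 300 s), the category labels, and the helper names are assumptions for illustration only; this record does not state them, and the GRU model itself is omitted.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical bucket edges in seconds; the thesis's real discretization may differ.
BUCKET_EDGES = [5.0, 30.0, 300.0]


def interval_bucket(seconds: float) -> int:
    """Map a gap length to a bucket index.

    0: under 5 s, 1: 5 s to under 30 s, 2: 30 s to under 300 s, 3: 300 s or more.
    """
    return bisect_right(BUCKET_EDGES, seconds)


def make_targets(events: List[Tuple[float, str]]) -> List[Tuple[str, str, int]]:
    """Pair each event's category with the next category and the bucketed gap.

    Each sample is (current_category, next_category, interval_bucket), which
    gives one label per task for a multi-task model.
    """
    samples = []
    for (t_now, cat_now), (t_next, cat_next) in zip(events, events[1:]):
        samples.append((cat_now, cat_next, interval_bucket(t_next - t_now)))
    return samples


# A toy session of (timestamp in seconds, website category) pairs.
session = [(0.0, "news"), (12.5, "mail"), (30.0, "news"), (400.0, "shopping")]
targets = make_targets(session)
for current, nxt, bucket in targets:
    print(f"{current:8} -> {nxt:8}  interval bucket {bucket}")
```

Treating the interval as a bucket index turns the second task into classification, so both tasks can share one sequence encoder and differ only in their output layers.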

Keywords (Chinese)

★ Clickstream (點擊流)
★ Log analysis (日誌分析)
★ User behavior analysis (使用者行為分析)
★ Time-series regression prediction (時序資料回歸預測)

Keywords (English)

★ Clickstream
★ Log analysis
★ Web mining
★ Web usage mining
★ User behavior model
★ Time-series recurrent prediction

Table of Contents

Abstract (Chinese)
Abstract
Contents
1 Introduction
2 Related Work
2.1 Clickstream & Long-Term Cross-Domain Clickstream
2.2 Post-collected Dataset
2.2.1 Published as an Open Dataset
2.3 Discretize the intervals between events in Time-Series data
2.4 Multi-Task Learning (MTL)
3 Extended Clickstream (ECS)
3.1 What is Extended Clickstream (ECS)
3.2 Merits of ECS
3.2.1 Easy to understand
3.2.2 Make CS more useful
3.2.3 Enhance the predictive power of modeling user behavior
4 Methods
4.1 Phase I. - Data Collection
4.1.1 System Requirement
4.1.2 Market Analysis
4.1.3 Solution
4.2 Phase II. - Data Preprocessing
4.2.1 Filter unintentional events
4.2.2 Session split
4.2.3 Time Mapping
4.2.4 Time Precision Alignment
4.2.5 Summary of Data Preprocessing
4.3 Phase III. - Model the User Behavior
5 Results
5.1 Collected Data
5.2 Data Analysis
5.2.1 Statistical Analysis
5.2.2 Case Study - Multi-device detection and Unintentional events in CS
5.3 Model Evaluation
6 Conclusion & Discussion
Bibliography
A Data Collection System

Advisor

陳弘軒 (Hung-Hsuan Chen)

Approval Date

2019-07-17
