Name: 范登凱 (Teng-Kai Fan)
Department: Computer Science and Information Engineering
Thesis title: 網路個人化廣告配置之研究 (Allocation of Personalized Ads on the Web)
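- This electronic thesis is authorized for immediate open access.
- Electronic full texts that have reached their open-access date are licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
- Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast the work without authorization, to avoid violating the law.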
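Abstract (Chinese) Contextual advertising is a form of advertising that uses the Web to attract customers, and it has gradually become one of the important marketing channels. This study proposes a mechanism based on ad content (text ads) that assigns relevant ads to ordinary personal web pages, such as blog pages. Blogs are one of the representative applications of Web 2.0, and their defining feature is that they give users convenient tools for creating content. Through blogs, content creators can easily express personal opinions, and ordinary readers can use those opinions as a reference when choosing products. Therefore, if a blog page presents a negative opinion of a product, readers are probably less likely to click a related ad. Moreover, blog owners would presumably be more willing to receive ads that are strongly and positively related to their own site content. We therefore propose a framework that improves existing text-based advertising through sentiment detection from opinion analysis, called SOCA (Sentiment-Oriented Contextual Advertising).
From another perspective, although most text-based advertising aims at "placing ads that readers may care about", the main reader of most non-popular blog pages is in fact the blogger. In terms of long-tail theory, we should therefore place suitable ads for the majority of content creators. Based on this idea, this study proposes a blogger-centric ad allocation framework, BCCA (Blogger-Centric Contextual Advertising). This study applies opinion analysis, intention recognition, information retrieval, and machine learning techniques to build the SOCA and BCCA advertising frameworks.
In addition, social networks (for example, Facebook and Morgenstern) have gradually become platforms for users' social activities. The literature indicates that more than one fifth of online ads are placed on social networks, so allocating online ads based on user interests and social relationships is especially important. Finally, with social networks as the setting, this study proposes a machine-learning-based ad allocation framework. First, we present three fundamental filtering mechanisms: (1) content-based filtering; (2) collaborative filtering; and (3) social filtering. Second, to strengthen collaborative filtering, we adopt boosted methods to address the first-rater and sparsity problems, and we call the results boosted models. Finally, to integrate the fundamental and boosted filtering mechanisms effectively, we propose a machine-learning-based filtering mechanism for allocating ads in social networks.
This dissertation conducts experiments on real data sets comprising text ads, blog pages, and social network data. The experimental results show that the proposed models do effectively allocate ads relevant to users' interests.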
Abstract (English) Web advertising, a form of advertising that uses the World Wide Web to attract customers, has become one of the world's most important marketing channels. This study addresses the mechanism of content-based advertising (contextual advertising), which refers to the assignment of relevant ads to a generic individual web page, e.g., a blog post. As blogs have become a platform for expressing personal opinion, they naturally contain various kinds of expressions, including both facts and opinions. Such opinions can serve as references for the decisions of Web surfers who browse the blog. Thus, if a blog contains a negative opinion of some product, an ad for that product is less likely to be clicked. Besides, website owners would be more willing to host ads that are positively related to their content. Hence, we propose using sentiment detection to improve Web-based contextual advertising. The proposed SOCA (Sentiment-Oriented Contextual Advertising) framework combines contextual ad matching with sentiment analysis to select ads that are related to the positive (and neutral) aspects of a blog and to rank them according to their relevance.
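The idea of selecting ads from the non-negative parts of a page and ranking them by relevance can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `NEGATIVE_WORDS` lexicon stands in for a real sentiment detector, and term-frequency cosine similarity stands in for the matching strategies evaluated in the dissertation, which this record does not describe in detail.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon; a stand-in for a trained sentiment detector.
NEGATIVE_WORDS = {"terrible", "broken", "awful", "hate", "worst"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def keep_non_negative(page_text):
    """Drop every sentence that contains a word from the toy negative lexicon."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", page_text) if s.strip()]
    return [s for s in sentences if not NEGATIVE_WORDS & set(tokenize(s))]


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_ads(page_text, ads):
    """Rank ads against the non-negative sentences only; drop zero-score ads."""
    page_vec = Counter(tokenize(" ".join(keep_non_negative(page_text))))
    scored = [(cosine(page_vec, Counter(tokenize(ad))), ad) for ad in ads]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [ad for score, ad in ranked if score > 0]
```

With a page that praises a camera lens and complains about a phone battery, `rank_ads` would score ads only against the sentence about the lens, so a battery ad would not be selected even though the page mentions batteries.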
On the other hand, although most contextual advertising schemes consider Web surfers as the ad consumers, bloggers themselves are the constant visitors of their own blogs. Thus, it is worthwhile for an advertising scheme to put the interests or intentions of content creators first. We therefore propose the BCCA (Blogger-Centric Contextual Advertising) framework, which combines contextual ad matching with text mining techniques to select ads that are related to the immediate interests revealed in a blog and to rank them according to their relevance.
In addition to general web pages, social networks are becoming a more interactive platform for social activities, and more than 20% of online advertisements are served on social networks. The allocation of advertisements based on both individual information and social relationships is therefore becoming ever more important. In our third advertising framework, we first propose the idea of social filtering and compare it with content-based filtering and collaborative filtering for advertisement allocation in a social network. Second, we apply content-boosted and social-boosted methods to enhance the existing collaborative filtering model. Finally, an effective learning-based framework is proposed for combining filtering models to improve social advertising.
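As a rough illustration of combining filtering models with a learned classifier, the sketch below treats the scores of three filters as features for a small logistic regression trained by batch gradient descent. The `social_score` definition (the fraction of a user's friends who clicked the ad), the feature layout, and the training settings are all hypothetical; the dissertation's actual features and classifiers are not specified in this record.

```python
import math


def social_score(user, ad, friends, clicks):
    """Fraction of the user's friends who clicked the ad (0.0 if no friends)."""
    circle = friends.get(user, set())
    if not circle:
        return 0.0
    return sum(1 for f in circle if ad in clicks.get(f, set())) / len(circle)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Batch gradient descent; rows are lists of filter scores, labels are 0 or 1."""
    weights = [0.0] * (len(rows[0]) + 1)  # the last entry is the bias term
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            features = x + [1.0]  # append a constant 1.0 for the bias
            err = sigmoid(sum(w * v for w, v in zip(weights, features))) - y
            for i, v in enumerate(features):
                grad[i] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict(weights, x):
    """Estimated click probability for one row of filter scores."""
    return sigmoid(sum(w * v for w, v in zip(weights, x + [1.0])))
```

Each training row here would hold one (content-based, collaborative, social) score triple for a user-ad pair, with the label recording whether the ad was clicked; `predict` then gives a single combined score that can be used to rank candidate ads.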
We experimentally validate our approach on a data set that includes real ads, actual blog pages, and social networks, using information retrieval metrics. The results indicate that the proposed method can effectively identify ads that are relevant to users' interests.
Keywords (Chinese) ★ social network
★ information retrieval
★ machine learning
★ text mining
★ text-based advertising
Keywords (English) ★ machine learning
★ text mining
★ contextual advertising
★ social network
★ information retrieval
Table of Contents
Chinese Abstract
English Abstract
Acknowledgement
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Motivation
1.2 Overview of the Dissertation
1.2.1 Sentiment-Oriented Contextual Advertising
1.2.2 Blogger-Centric Contextual Advertising
1.2.3 Learning to Predict Ads Click Based on Boosted Collaborative Filtering
1.3 Organization of the Dissertation
2. Sentiment-Oriented Contextual Advertising
2.1 Introduction
2.2 Background
2.2.1 Content-based Advertising
2.2.2 Blogosphere, Sentiment Detection
2.3 SOCA Framework
2.3.1 Sentiment Detection
2.3.2 Term Expansion
2.3.3 Page-Ad Matching
2.4 Experimental Results
2.4.1 Datasets and Text-Preprocessing
2.4.2 Evaluation of Sentiment Detection
2.4.3 Evaluation of Page-Ad Matching
2.5 Summary
3. Blogger-Centric Contextual Advertising
3.1 Introduction
3.2 BCCA Framework
3.2.1 Intention Recognition
3.2.2 Sentiment Detection
3.2.3 Term Expansion
3.2.4 Page-Ad Matching
3.3 Experimental Results
3.3.1 Datasets and Text-Preprocessing
3.3.2 Evaluation of Intention Recognition
3.3.3 Evaluation of Sentiment Detection
3.3.4 Evaluation of Page-Ad Matching
3.4 Summary
4. Learning to Predict Ads Click Based on Boosted Collaborative Filtering
4.1 Introduction
4.2 Overview of Morgenstern
4.3 Framework
4.3.1 Fundamental Filtering Mechanisms
4.3.2 Boosted Filtering Mechanisms
4.3.3 Learning-based Framework
4.4 Evaluation
4.4.1 Data Set and Measurements
4.4.2 Evaluation of Fundamental and Boosted Filtering Mechanisms
4.4.3 Evaluation of Learning-Based Filtering Models
4.4.4 Further Analysis for Feature Combinations
4.4.5 Parameter Selection
4.4.6 Comparison of Multiple Classifiers
4.5 Discussions and Observations
4.6 Summary
5. Related Work
5.1 Contextual Advertising
5.2 Sentiment Detection
5.3 Personalized Search and User Interests
5.4 Social Community and Advertising
6. Conclusion and Future Works
6.1 Conclusions
6.2 Future Directions
Bibliography
List of Figures
2.1 Correlation ads conflicting with blog content
2.2 Key players in content-based advertising
2.3 The SOCA framework
2.4 Process for generating associated terms
2.5 Precision-Recall 11-point curve
2.6 The performance of the three matching strategies
2.7 Precision-Recall curve for the negative sentiment dataset
2.8 The performance of the three matching strategies on the negative sentiment dataset
2.9 Precision-Recall curve for the positive sentiment dataset
2.10 The performance of the three matching strategies on the positive sentiment dataset
3.1 Example of a blog page with correlation ads
3.2 The BCCA framework
3.3 Pseudo code of ads assignment strategies
3.4 Example of a post with buyer's requirements
3.5 Effect of feature size on accuracy
3.6 Precision-Recall curve
3.7 The performance of the three IR models
4.1 System Overview
4.2 Comparison of MAE for logistic regression with various feature combinations with the Bonferroni-Dunn test. Groups of feature sets that are not significantly different (at p = 0.05) are connected.
4.3 Comparison of all classifiers against each other with the Nemenyi test. Groups of classifiers that are not significantly different (at p = 0.05) are connected.
List of Tables
2.1 Subjective sentence identification results
2.2 Positive and negative sentence classification results
2.3 System results for triggering pages
2.4 Accuracy of page-ad matching
3.1 Contingency table for Chi-square
3.2 Intention and Non-Intention sentence discovery results
3.3 Intention recognition by various feature sizes
3.4 Subjective and objective sentence identification results
3.5 Positive and negative sentence classification results
3.6 System results for triggering pages
3.7 Accuracy of page-ad matching
3.8 Accuracy of page-ad matching
4.1 Filtering mechanisms results (bold = best performing, * = p < 0.01).
4.2 Learning-based filtering models results (bold = best performance). Learning models with significant difference from best-performing filtering mechanisms in Table 1 are marked (* = p < .05, ** = p < .01).
4.3 The positive class performance of each filtering methods (bold = best performing).
4.4 The aggregating MAE of filtering methods (bold = best performing).
Advisor: 張嘉惠 (Chia-Hui Chang)
Approval date: 2010-6-15