Predict Influence of Posts: Using Data from Facebook

Name

鄭如筠 (Ju-Yun Cheng)

Department

Department of Information Management

Thesis Title

Predict Influence of Posts: Using Data from Facebook
(original title: 訊息影響力預測：使用Facebook資料為例)

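Related Theses

★ A Study of Business Intelligence in the Retail Industry
★ Building an Anomaly Detection System for Landline Telephone Calls
★ Applying Data Mining to the Analysis of School Grades and General Scholastic Ability Test Results: The Case of a Vocational High School Food and Beverage Management Program
★ Using Data Mining to Improve Wealth Management Effectiveness: A Case Study of a Bank
★ Evaluation and Analysis of Wafer Manufacturing Yield Models: The Case of a Domestic DRAM Manufacturer
★ A Study of Business Intelligence Analysis Applied to Student Grades
★ Using Data Mining to Build a Prediction Model of Academic Achievement for Upper-Grade Elementary School Students
★ Applying Data Mining to Build a Risk Assessment Model for Motorcycle Loans: The Case of Company A
★ A Study of Performance Indicator Evaluation Applied to Improving R&D Design Quality Assurance
★ Applying Machine Learning to Improve Hiring Quality Based on Text Résumés and Personality Traits
★ A General Framework Based on a Relational Genetic Algorithm for Solving Set Partitioning Problems with Constraint Handling
★ Generalized Knowledge Discovery in Relational Databases
★ Decision Tree Construction Considering Delays in Acquiring Attribute Values
★ A Method for Finding Preference Graphs from Sequence Data, Applied to the Group Ranking Problem
★ Using a Partitional Clustering Algorithm to Find Consensus Groups for Group Decision-Making Problems
★ A Novel Approach to Ordered Consensus Groups Applied to Group Decision-Making Problems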

File

Full text in the system: never open to the public

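Abstract (Chinese)

Social networking sites have become some of the most frequently used websites in recent years. On them, people can present themselves by sharing their personal profiles and by connecting and communicating with other users. Facebook is the most popular of these sites and has the most users, and many companies have set up their own fan pages or groups on Facebook so that they can reach online customers directly.
This thesis collects a Facebook data set and uses it to build a hybrid (ensemble) model that predicts how influential a post will be after a given period of time. The prediction is based mainly on the post's content, its temporal characteristics, and the characteristics of its author. The model combines, by voting, the outputs of five classifiers: a neural network, a decision tree, logistic regression, a naive Bayes classifier, and a support vector machine. Previous studies that predicted the importance of a post considered only the number of times users accessed it and ignored the individual influence, within the social network, of the users who accessed it. The main difference between this thesis and earlier work is our assumption that each user carries a different weight reflecting his or her individual influence in the social network. The influence of a post on Facebook is therefore the sum of the weights of the users who clicked "Like" on it.
The experiments use the data mining tool Clementine and 10-fold cross-validation. The results show that the hybrid model outperforms each of the five individual classifiers and that the predictors proposed in this thesis are also significant. The results further show that, when predicting a post's influence, taking users' individual influence into account gives more accurate predictions than considering only the number of times the post was accessed.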
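As a minimal sketch of the weighted-sum definition of influence: the abstract does not say how user weights are computed, so the `user_weight` table, the `default_weight` fallback, and the sample user names below are hypothetical. They serve only to show how a weighted sum can rank posts differently from a raw count of likes.

```python
from typing import Dict, Iterable


def post_influence(likers: Iterable[str],
                   user_weight: Dict[str, float],
                   default_weight: float = 1.0) -> float:
    """Sum the weights of the distinct users who liked a post.

    Users missing from the table get default_weight (an assumption).
    """
    return sum(user_weight.get(user, default_weight) for user in set(likers))


def like_count(likers: Iterable[str]) -> int:
    """Baseline: count distinct likers and treat every user the same."""
    return len(set(likers))


# Hypothetical weights; a real system would derive them from the network.
weights = {"alice": 3.0, "bob": 0.5, "carol": 1.5}
post_a = ["alice", "carol"]          # few likes, but from heavier users
post_b = ["bob", "carol", "dave"]    # more likes, from lighter users

influence_a = post_influence(post_a, weights)
influence_b = post_influence(post_b, weights)
```

With these made-up numbers, `post_b` has more likes than `post_a`, yet `post_a` has the larger weighted influence, which is the distinction the abstract draws between the two approaches.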

Abstract (English)

Social networking websites (SNWs) have become some of the main sites where people spend most of their time. On SNWs, people can present themselves in their individual profiles, link to other users, and communicate with them. Facebook is one of the most popular SNWs and has become the most-trafficked website in the world. To reach online customers, many corporations have their own pages on Facebook.
In this study, we focus on Facebook and build an ensemble model to predict the future influence of posts based on content features, temporal features, and authorial features. The ensemble model integrates, by voting, the results from a Neural Network, a Decision Tree (C5.0), Logistic Regression, Naive Bayes, and a Support Vector Machine (SVM). Previous research on predicting the influence of posts on SNWs considers only their access counts and neglects the differing influence of individual users. In contrast, this work assumes that each user is associated with a weight reflecting his or her influence in the social network, and the influence of a post on Facebook is defined as the weighted sum of the influence of the users who clicked “like.”
Our experiments are run with the data mining tool Clementine using 10-fold cross-validation. The results show that our ensemble model outperforms each individual classifier and that the features we propose significantly improve the prediction of posts' influence. The results also show that our model, which assigns different weights to users, achieves higher accuracy than the traditional model, which treats all users the same, in predicting the influence of posts.
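The voting step and the 10-fold split can be sketched in plain Python. This is an illustration only, not the Clementine workflow used in the thesis; the label names, the tie-breaking rule, and the round-robin fold assignment are all assumptions.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[str]) -> str:
    """Return the label predicted by the most classifiers.

    On a tie, Counter.most_common keeps first-seen order, so the
    label that appears earliest in `predictions` wins (an assumption).
    """
    return Counter(predictions).most_common(1)[0][0]


def k_fold_indices(n_samples: int, k: int = 10) -> List[List[int]]:
    """Assign sample indices to k folds round-robin (near-equal sizes)."""
    folds: List[List[int]] = [[] for _ in range(k)]
    for index in range(n_samples):
        folds[index % k].append(index)
    return folds


# One post, five hypothetical classifier outputs in the order
# neural network, decision tree, logistic regression, naive Bayes, SVM.
votes = ["high", "low", "high", "high", "low"]
ensemble_label = majority_vote(votes)

# Each fold serves once as the test set; the other nine form the training set.
folds = k_fold_indices(25, k=10)
test_fold = folds[0]
train_indices = [i for fold in folds[1:] for i in fold]
```

With five voters and two classes a tie cannot occur; with more classes, the tie-breaking rule becomes a design choice that would need to be stated.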

Keywords (Chinese)

★ Data mining (資料探勘)
★ Social network (社群網路)
★ Classification (分類)

Keywords (English)

★ Data mining
★ Social network
★ Classification

Table of Contents

Abstract
Abstract (Chinese)
Acknowledgements
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Background and Related Work
2.1 Social networking web sites (SNWs) and Facebook.com
2.2 Identifying high-influence users
2.3 Identifying high-influence content
Chapter 3 Problem Description
3.1 Content features
3.2 Temporal features
3.3 Authorial features
Chapter 4 Methodology
4.1 Data preprocessing
4.2 Ensemble model
4.3 Model evaluation
Chapter 5 Experiments
5.1 Features examination
5.2 Predicting with different feature sets
5.3 Comparison of ensemble model with other classifiers
5.4 Discussion
Chapter 6 Conclusion
Reference

Advisor

陳彥良 (Yen-Liang Chen)

Approval Date

2012-6-30
