Segmenting Customers with Quantitative Transactions Annotated with Unbalanced Hierarchies (以交易資料與產品分類樹進行市場區隔之研究)

Name

王敏姿 (MinTzu Wang)

Department

Department of Business Administration

Thesis title

以交易資料與產品分類樹進行市場區隔之研究
(Segmenting Customers with Quantitative Transactions Annotated with Unbalanced Hierarchies)

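This electronic thesis is authorized for immediate open access.
The open-access full text is licensed to users only for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. Do not reproduce, distribute, adapt, repost, or broadcast this work without authorization, to avoid violating the law.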

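Abstract (Chinese)

Market segmentation divides a market into groups according to the products or services that customers need, so that consumers within each group share similar needs or purchasing behavior. Designing a dedicated marketing mix for each segment lets an enterprise use its marketing resources more effectively and do its marketing work better. A suitable marketing strategy can then be applied to the customers of each segment to increase sales volume. The process consists of four activities: identifying the customers of a segment, understanding their profile and characteristics, evaluating the customers of the segment, and setting the strategy and resource allocation for the target market. Market segmentation usually relies on clustering techniques, so that consumers within each group share similar purchasing behavior, needs, or preferences. Such information supports selective-marketing decisions, such as catalog design, cross-selling, or shelf-space arrangement, that may increase sales volume. Increasing sales is not only a matter of cutting prices: offering a mixed bundle of popular items in certain quantities, according to customer needs and preferences, is also a form of promotion that customers appreciate highly. Bundling similar items in different quantities is an important marketing strategy. For example, letting consumers choose between a bundle of 2 bottles of brand A liquor with 2 packs of brand B cigarettes and a bundle of 2 bottles of brand C liquor with 2 packs of brand D cigarettes meets their diverse needs. Promotional bundles, however, have usually come from the designer's own preferences or ideas. The main subject of this study is how to obtain, automatically and effectively, from large volumes of daily transactions the bundles of similar items and quantities that customers favor.
Analyzing transaction datasets captured from chain retail stores is a major challenge because the data are often sparse: the dataset covers a large number of products, yet each transaction contains very few items, so each item appears in only a small fraction of the transactions. Market basket analysis on such data usually yields extremely low item support.
Market basket analysis, an association technique, mostly works at the item level to determine which products customers buy together. Because it ignores the hierarchical relationships among objects, it runs into similarity problems. Attempts to raise items to the category level increase support and produce association rules, and they likewise ignore purchase quantities, but they lose potentially more detailed, decision-relevant information about the associated items. Clustering techniques, by contrast, use the similarity among members to find the differences between clusters and the similarity within them.
Most studies, however, focus on clustering techniques and similarity measures while overlooking the hierarchical structure among objects and the issue of purchase quantities. In practice, for example, product or object similarity means something different from the end consumer's point of view, and among the many product categories in retailing it is almost impossible for the hierarchy to be balanced.
This study therefore proposes a framework for measuring the similarity of objects in an unbalanced hierarchy. It starts from the consumer's point of view, stays close to human intuition, and takes purchase quantities into account. Clustering on this basis helps make the meaning of the clusters concrete and gives decision makers useful information.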
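
The sparsity problem described above can be illustrated with a small sketch. The toy transactions, item names, and the one-level `category_of` mapping below are hypothetical and are not taken from the thesis data. The sketch only shows how item-level support can be low while category-level support is higher, at the cost of losing which brands were bought.

```python
from collections import Counter

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"liquor_A": 2, "cigarette_B": 2},
    {"liquor_C": 2, "cigarette_D": 2},
    {"liquor_A": 1},
    {"milk_E": 1, "bread_F": 1},
]

# Hypothetical item -> category mapping (one level of a product hierarchy).
category_of = {
    "liquor_A": "liquor", "liquor_C": "liquor",
    "cigarette_B": "tobacco", "cigarette_D": "tobacco",
    "milk_E": "dairy", "bread_F": "bakery",
}


def support(txns, itemset):
    """Fraction of transactions that contain every element of itemset."""
    hits = sum(1 for t in txns if itemset <= set(t))
    return hits / len(txns)


def roll_up(transaction):
    """Replace items by their categories, summing the quantities."""
    rolled = Counter()
    for item, qty in transaction.items():
        rolled[category_of[item]] += qty
    return dict(rolled)


# Item level: only one of four transactions has both specific brands.
item_level = support(transactions, {"liquor_A", "cigarette_B"})

# Category level: support rises, but the brand detail is gone.
category_level = support([roll_up(t) for t in transactions],
                         {"liquor", "tobacco"})

print(item_level, category_level)
```

Rolling up is the workaround the abstract criticizes: it raises support but discards the item-level detail that a bundle designer would need.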

Abstract (English)

Market segmentation is an important marketing process in which enterprises identify and group customers according to the products or services they need, so that suitable market stimuli can be applied to each segment to increase sales volume. The process is composed of four activities: segment identification, segment characterization, segment evaluation, and target segment evaluation. Segment identification is usually performed with clustering, which groups customers with similar transactions into the same segment. Such information can increase sales by helping retailers market selectively, and it supports many business decisions, such as catalog design, cross-marketing, and shelf-space arrangement. Packaging a mixture of popular items in certain quantities according to customers' needs and preferences is a highly appreciated promotion approach that increases sales without simply cutting prices. Which popular items to pack together, and in what quantities, should emerge from customers' buying behavior rather than from the designers' perspective.
Performing segment identification on transaction data is difficult because a typical retailer carries tens of thousands of goods, whereas a transaction typically contains fewer than a hundred items. Besides the goods purchased, the quantities purchased also play an important role in distinguishing customers. To reduce the low cardinality and high intra-cluster distance of transaction clustering, most market basket analyses either generalize products to a higher concept level of a product hierarchy or aggregate transactions to the customer level to alleviate the sparsity of transactions. However, given the enormous number of retail products, it is almost impossible for the hierarchy to be balanced. In practice, similarity in the hierarchy from the consumers' perspective (bottom-up) also differs considerably from the designers' perspective (top-down). Aggregating transactions to a higher concept level may also lose detailed information.
Few previous studies have combined quantity and similarity from a bottom-up perspective on an unbalanced tree to cluster transactions. This study presents an algorithm for mining quantitative similar clusters through an improved clustering algorithm that tracks the top k clusters, each with its own intra-similarity quality. The algorithm is based on QSKM (Quantity Sensitive kth matched similar pair) similarity measures, which we derived for transactions with purchased quantities using an unbalanced hierarchical structure from the consumer's perspective. In our experiments on a real-life transaction database, the QSKM measures outperformed traditional similarity measures in finding clusters of similar products and quantities purchased together, and they discovered up to 5 clusters with sufficient coverage from sparse data, which traditional similarity measures and frequent patterns could not discover. In addition, the cluster intra-similarities are better than those obtained with GCS (General Cosine Similarity).
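
To make the idea of a quantity-aware similarity on an unbalanced hierarchy more concrete, here is a minimal sketch. It is not the QSKM measure, whose definition is not given in this record. It swaps in a Wu-Palmer-style depth ratio for item similarity and a simple min/max ratio for quantities, and the product tree, item names, and best-match averaging are all assumptions made for illustration.

```python
# Hypothetical unbalanced product tree, stored as child -> parent.
# Liquor items sit four levels below the root, cigarettes only two.
parent = {
    "beverage": "root", "tobacco": "root",
    "alcohol": "beverage", "liquor": "alcohol",
    "liquor_A": "liquor", "liquor_C": "liquor",
    "cigarette_B": "tobacco", "cigarette_D": "tobacco",
}


def path_to_root(node):
    """Return the nodes from node up to the root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def depth(node):
    """Number of edges between node and the root (root has depth 0)."""
    return len(path_to_root(node)) - 1


def item_similarity(a, b):
    """Wu-Palmer-style similarity: 2*depth(lca) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # First node on b's upward path that is also an ancestor of a.
    lca = next(n for n in path_to_root(b) if n in ancestors_a)
    total = depth(a) + depth(b)
    return 2 * depth(lca) / total if total else 1.0


def quantity_ratio(qa, qb):
    """1.0 for equal quantities, smaller as they diverge (assumes qa, qb > 0)."""
    return min(qa, qb) / max(qa, qb)


def transaction_similarity(t1, t2):
    """Average best-match score in both directions (assumes non-empty inputs)."""
    def one_way(src, dst):
        return sum(
            max(item_similarity(i, j) * quantity_ratio(qi, qj)
                for j, qj in dst.items())
            for i, qi in src.items()
        ) / len(src)
    return (one_way(t1, t2) + one_way(t2, t1)) / 2


bundle_1 = {"liquor_A": 2, "cigarette_B": 2}
bundle_2 = {"liquor_C": 2, "cigarette_D": 2}
print(transaction_similarity(bundle_1, bundle_2))
```

In this toy tree the two liquor brands score 0.75 while the two cigarette brands score only 0.5, although both pairs are siblings. That depth dependence is one way an unbalanced hierarchy can distort a top-down similarity measure, which is the kind of effect the abstract says a consumer-oriented measure has to account for.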

Keywords (Chinese)

★ purchase quantity
★ data mining
★ cluster analysis
★ similarity measure
★ hierarchical structure

Keywords (English)

★ data mining
★ quantitative clustering
★ similarity (distance) measure
★ concept hierarchy

Table of Contents

ABSTRACT (CHINESE)
ABSTRACT
ACKNOWLEDGEMENTS
CHAPTER 1 INTRODUCTION
1.1 FRAMEWORK OF THE PROPOSED MODEL
1.2 ORGANIZATION OF THE DISSERTATION
CHAPTER 2 LITERATURE REVIEW
2.1 MARKET SEGMENTATION AND TRANSACTIONS SPARSITY
2.2 QUANTITATIVE FREQUENT PATTERN MINING AND CONCEPT HIERARCHY
2.3 SIMILARITY AND CLUSTERING
CHAPTER 3 AN EXPLORATORY SIMILARITY MEASURE OF CLUSTERING TRANSACTIONS WITH AN UNBALANCED HIERARCHICAL PRODUCT STRUCTURE
3.1 RESEARCH PROBLEM
3.2 PROBLEM DEFINITION
3.2.1 Unbalanced Hierarchy
3.2.2 Computing Distances on Unbalanced Hierarchy
3.3 ALGORITHM OF COMPUTING TRANSACTION DISTANCE WITH AN UNBALANCED HIERARCHY
3.4 EXPERIMENTAL RESULTS
3.4.1 Data Description and Preparation
3.4.2 Comparisons of Skew Concept Hierarchy with Different Distance Measures
3.5 SUMMARY AND MANAGERIAL IMPLICATIONS
CHAPTER 4 SEGMENTING CUSTOMERS WITH QUANTITATIVE TRANSACTIONS ANNOTATED WITH UNBALANCED HIERARCHIES
4.1 RESEARCH PROBLEM
4.2 PROBLEM DEFINITION
4.3 METHODOLOGY
4.3.1 Algorithm of QSKM (Quantity Sensitive kth matched similar pair) Similarity
4.3.2 Algorithm of Top k Clustering
4.4 EXPERIMENTAL RESULTS
4.4.1 Data Description and Preparation
4.4.2 Model Observation and Comparisons of Different Distance Measures with Purchased Quantity
4.5 SUMMARY AND MANAGERIAL IMPLICATIONS
4.5.1 Specific Cluster Observation
4.5.2 Summary
CHAPTER 5 CONCLUSIONS AND FUTURE WORKS
REFERENCES

Advisor

許秉瑜 (Ping-Yu Hsu)

Approval date

2011-7-4
