Abstract: | Sequential pattern mining is an important data mining task that discovers time-related behavior patterns in sequence databases. It has been applied in many domains, including web-log analysis, customer purchase behavior analysis, process analysis of scientific experiments, and medical record analysis. Although much work has been done on sequential pattern mining, most existing methods discover patterns based on frequency alone, so frequency is the only criterion available for filtering out uninteresting patterns. For this reason, the minimum support must be set to a low value; otherwise, many valuable patterns may not be found. Unfortunately, a low minimum support may cause a combinatorial explosion and produce too many rules. To resolve this dilemma, this research uses the concept of Recency, Frequency and Monetary (RFM), which marketing researchers commonly use for customer and market segmentation, to find valuable sequential patterns. The proposed algorithm, RFM-Apriori, modifies the traditional Apriori sequential pattern mining algorithm so that, in addition to frequency, it considers two further constraints, the most recent purchase time (Recency) and the purchase amount (Monetary), in order to discover patterns that satisfy all three criteria (RFM-patterns). Considering these two additional factors ensures that all discovered patterns are both recently active and profitable.
Empirical evaluation shows that the proposed method is more computationally efficient than the traditional method and offers users an effective means of discovering valuable patterns. |
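
To make the three constraints concrete, the following is a minimal sketch of how a single candidate pattern might be tested against frequency, recency, and monetary thresholds. It is not the RFM-Apriori algorithm itself: the abstract does not give formal definitions, so the data layout, the function names `supports` and `is_rfm_pattern`, and the threshold semantics (a customer supports a pattern if some occurrence ends at or after `min_recency_time` and its purchases total at least `min_monetary`) are all assumptions made for illustration. Candidate generation and pruning are omitted.

```python
from itertools import combinations

# Assumed layout: each customer maps to a time-ordered list of
# (time, item, amount) purchases. This is illustrative, not from the source.
database = {
    "c1": [(1, "a", 10.0), (5, "b", 30.0), (9, "a", 5.0)],
    "c2": [(2, "a", 20.0), (8, "b", 25.0)],
    "c3": [(1, "a", 50.0), (2, "b", 50.0)],
}


def supports(sequence, pattern, min_recency_time, min_monetary):
    """Return True if some occurrence of `pattern` in `sequence` is both
    recent enough and valuable enough (assumed definitions)."""
    if not pattern:
        return False
    # Brute-force search over index subsets; fine for small examples only.
    for idx in combinations(range(len(sequence)), len(pattern)):
        purchases = [sequence[i] for i in idx]
        if [item for _, item, _ in purchases] != list(pattern):
            continue
        last_time = purchases[-1][0]                      # Recency check
        total = sum(amount for _, _, amount in purchases)  # Monetary check
        if last_time >= min_recency_time and total >= min_monetary:
            return True
    return False


def is_rfm_pattern(db, pattern, min_support, min_recency_time, min_monetary):
    """Frequency check: count customers whose sequence supports the
    pattern under the recency and monetary constraints."""
    count = sum(
        supports(seq, pattern, min_recency_time, min_monetary)
        for seq in db.values()
    )
    return count >= min_support
```

Under these assumptions, setting both `min_recency_time` and `min_monetary` to 0 reduces the check to plain frequency-based support, which shows how the two extra constraints narrow the set of patterns that a frequency-only miner would report.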