Name 林兒萱 (Er-Hsuan Lin)
Thesis Title 屬性值隨時間改變的資料分類方法研究
(A Classification Using Time-Sequential Attributes.)
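Abstract (Chinese) With advances in information technology and the trend toward industry digitization and enterprise integration, extracting valuable information from massive volumes of data has become an important issue in both research and practice. Data mining is a continuously iterating process of data analysis and decision support. It explores and analyzes large amounts of data by automatic or semi-automatic means in order to discover meaningful rules and organize them into valuable knowledge.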
Abstract (English) Classification is an important method for predicting class labels from databases. Most existing methods, however, assume that attribute values are constant. In many real-life applications, attribute values change over time, such as daily stock prices or blood pressure measured at different times. We call these time-sequential attributes. In this paper, we first extend the traditional classification problem to handle time-sequential attributes. Next, we present an algorithm, called MutipleMIS-SP, that generates all the classification rules used to build the classifier. Our approach also adopts the concept of multiple minimum supports, because attributes and attribute-value pairs do not occur with similar frequencies in the database. Using a single minimum support may lead to the rare item problem and ultimately to low classification accuracy. Finally, two classification criteria are proposed for predicting the class label from the generated classification rules.
Detailed experiments are also presented. Seven synthetic datasets and one real-life dataset, BA-CUSTOMER, were used in the performance analyses, and scalability tests are also reported. The results show that MutipleMIS-SP is more accurate than the traditional C4.5 classification algorithm on both the synthetic datasets and the real dataset.
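The rare item problem mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's MutipleMIS-SP algorithm. It assumes the common formulation of multiple minimum supports, in which each item has its own minimum item support (MIS) and an itemset must reach the lowest MIS among its items. The toy records, the item names, the MIS values, and the helper functions `support` and `frequent_itemsets` are all hypothetical.

```python
from itertools import combinations


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    items = set(itemset)
    return sum(items <= record for record in records) / len(records)


def frequent_itemsets(records, threshold_for, max_size=2):
    """Return {itemset: support} for itemsets that meet their threshold.

    `threshold_for(itemset)` gives the minimum support required for that
    itemset, so the same function covers both the single-threshold and
    the multiple-threshold settings.
    """
    universe = sorted(set().union(*records))
    result = {}
    for size in range(1, max_size + 1):
        for itemset in combinations(universe, size):
            s = support(itemset, records)
            if s >= threshold_for(itemset):
                result[itemset] = s
    return result


# Hypothetical toy data: 8 common records and 2 rare ones.
records = (
    [{"bp=normal", "class=healthy"}] * 8
    + [{"bp=high", "class=at_risk"}] * 2
)

# Setting 1: one global minimum support for every itemset.
single = frequent_itemsets(records, lambda itemset: 0.3)

# Setting 2 (assumed formulation): per-item MIS values, and an itemset
# must reach the lowest MIS among its items.
mis = {
    "bp=normal": 0.3,
    "class=healthy": 0.3,
    "bp=high": 0.1,
    "class=at_risk": 0.1,
}
multiple = frequent_itemsets(
    records, lambda itemset: min(mis[item] for item in itemset)
)

rare_pair = ("bp=high", "class=at_risk")
print(rare_pair in single)    # the rare pair is missed
print(rare_pair in multiple)  # the rare pair is kept
```

With the single threshold of 0.3, the rare pair (support 0.2) is discarded, so no rule for the minority class could be generated from it. With per-item thresholds it survives, while lowering the global threshold to 0.1 instead would also admit many uninteresting combinations of frequent items in a larger dataset.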
Keywords (Chinese) ★ Data mining
★ Classification
★ Time series
★ Multiple thresholds
Keywords (English) ★ Data mining
★ Classification
★ Time-sequential data
★ Multiple minimum support
Advisor 陳彥良 (Yen-Liang Chen)
