Master's/Doctoral Thesis 106423013 — Detailed Record
Author 盧俊昀 (Chun-Yun Lu)   Department Information Management
Title Multi-time Periods K-means
Related Theses
★ A Study of Business Intelligence in the Retail Industry ★ Construction of an Anomalous-Call Detection System for Landline Telephony
★ Applying Data Mining to the Analysis of School Grades and College Entrance Exam Results: The Case of a Vocational High School Food and Beverage Management Program ★ Using Data Mining to Improve Wealth Management Performance: A Case Bank Study
★ Evaluation and Analysis of Wafer Fabrication Yield Models: The Case of a Domestic DRAM Fab ★ A Study of Business Intelligence Analysis Applied to Student Grades
★ Building a Predictive Model of Academic Achievement for Upper-Grade Elementary School Students Using Data Mining ★ Building a Motorcycle Loan Risk Assessment Model with Data Mining: The Case of Company A
★ Performance Indicator Evaluation for Improving R&D Design Quality Assurance ★ Improving Hiring Quality with Machine Learning Based on Text Resumes and Personality Traits
★ A General Framework Based on Relational Genetic Algorithms for Set Partitioning Problems with Constraint Handling ★ Generalized Knowledge Discovery in Relational Databases
★ Decision Tree Construction Considering Attribute-Value Acquisition Delay ★ Finding Preference Graphs from Sequence Data: An Application to Group Ranking Problems
★ Using Partitional Clustering to Find Consensus Clusters for Group Decision Problems ★ A Novel Ordered Consensus Cluster Method for Group Decision Problems
Files Full text access: never open to the public.
Abstract (Chinese) Community discovery clusters social data in order to identify the groups within it. Most previous methods cluster the data at a static point in time, so the clustering at each time point is independent and unaffected by the clustering results at earlier or later time points. Real communities, however, evolve over time, so we want to understand how a cluster at one time point is influenced by the clusters before and after it and evolves into the cluster at the next time point, thereby tracing the cluster's evolutionary trajectory. In other words, we want each cluster to evolve smoothly from one time point to the next; unlike previous work, we therefore require that the difference between corresponding clusters at adjacent time points be as small as possible. Based on this idea, we propose a Multi-time Periods K-means algorithm to remedy the shortcomings of the traditional K-means algorithm. Our study is restricted to a fixed set of T periods. We first run K-means independently within each period to divide its data into K clusters, and then iteratively adjust the clustering result of each period according to the results of the neighboring periods, so that in addition to minimizing the within-cluster error, the difference between corresponding clusters at adjacent time points is also minimized. In the end we obtain the evolutionary trajectories of K clusters over the T periods. Our experiments show that the differences between corresponding clusters across periods shrink, and that smoother cluster evolution trajectories are produced over time.
Abstract (English) When working with large data sets, it is common to divide the data into multiple clusters. Most previous methods cluster data at a static point in time, so the clustering in each time period is independent and unaffected by earlier or later periods; as a result, the clustering results across periods are inconsistent. Real data, however, evolve over time. We would like to know how a cluster at a certain time point is influenced by the time points before and after it and evolves into a cluster at a later time point, so as to trace its evolutionary trajectory. That is, we need to find the trajectory of cluster evolution, so that each cluster evolves smoothly from one time point to the next. This study therefore differs from previous work in requiring that the difference between corresponding clusters at adjacent time points be as small as possible. Based on these ideas, we propose a Multi-time Periods K-means algorithm to remedy the shortcomings of the traditional K-means algorithm. The study is restricted to a fixed time span T divided into equal periods; K-means is first run within each period. After this initial clustering, the result of each period is adjusted repeatedly according to the clustering results of the periods before and after it, so that the clusters within each period have both small intra-cluster error and small inter-period difference. Finally, we obtain the evolutionary trajectory of each cluster. Our experiments show that the relative differences between clusters in adjacent periods are reduced and that a smoother evolution trajectory is generated over time.
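The abstracts describe a three-step procedure: independent K-means per period, pairing clusters across adjacent periods, and an iterative adjustment that trades within-cluster error against inter-period difference. The following is a minimal Python sketch of that idea, not the thesis's exact method (Chapter 3 gives the actual pairing and adjustment algorithms); the smoothing weight `alpha`, the greedy centroid pairing, and the closed-form smoothed centroid update are all assumptions made here for illustration.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain K-means on one period's data (random-sample init for brevity)."""
    rng = np.random.default_rng(seed)
    c = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        lab = ((X[:, None, :] - c[None]) ** 2).sum(-1).argmin(1)
        c = np.array([X[lab == j].mean(0) if (lab == j).any() else c[j]
                      for j in range(k)])
    return c, lab

def match_to_prev(prev, cur):
    """Greedy one-to-one pairing of current centroids with the previous period's."""
    k = len(cur)
    d = ((cur[:, None, :] - prev[None]) ** 2).sum(-1)
    pairs = sorted(((i, j) for i in range(k) for j in range(k)), key=lambda p: d[p])
    perm = -np.ones(k, dtype=int)
    used = set()
    for i, j in pairs:
        if perm[i] < 0 and j not in used:
            perm[i] = j
            used.add(j)
    return perm

def multi_period_kmeans(periods, k, alpha=0.5, rounds=10):
    """T periods of data -> K aligned cluster trajectories with temporal smoothing."""
    # Step 1: independent K-means in each period.
    centers, labels = zip(*(kmeans(X, k, seed=t) for t, X in enumerate(periods)))
    centers, labels = list(centers), list(labels)
    T = len(periods)
    # Step 2: align cluster indices so index j denotes "the same" cluster over time.
    for t in range(1, T):
        perm = match_to_prev(centers[t - 1], centers[t])
        aligned = np.empty_like(centers[t])
        aligned[perm] = centers[t]
        centers[t] = aligned
        relab = np.empty_like(labels[t])
        for i, j in enumerate(perm):
            relab[labels[t] == i] = j
        labels[t] = relab
    # Step 3: alternate assignment / centroid updates; each centroid also
    # minimizes alpha * squared distance to its temporal neighbours.
    for _ in range(rounds):
        for t, X in enumerate(periods):
            labels[t] = ((X[:, None, :] - centers[t][None]) ** 2).sum(-1).argmin(1)
        new_centers = []
        for t, X in enumerate(periods):
            nb = [centers[s] for s in (t - 1, t + 1) if 0 <= s < T]
            c = np.empty_like(centers[t])
            for j in range(k):
                pts = X[labels[t] == j]
                # Minimizes sum||x - c||^2 + alpha * sum||c - c_neighbour||^2.
                c[j] = (pts.sum(0) + alpha * sum(n[j] for n in nb)) \
                       / (len(pts) + alpha * len(nb))
            new_centers.append(c)
        centers = new_centers
    return centers, labels
```

Each smoothed centroid update is the closed-form minimizer of the within-cluster squared error plus `alpha` times the squared distance to the matched centroids in the neighboring periods, which is one common way to realize the "smooth trajectory" requirement stated in the abstract.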
Keywords (Chinese) ★ Community discovery
★ Evolution
★ K-means algorithm
★ Multiple time periods
Keywords (English) ★ Community discovery
★ Evolution
★ K-means
★ Multiple time periods
Table of Contents
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
1. Introduction
1.1 Background
1.2 Motivation
1.3 Objectives
2. Literature Review
2.1 The K-means Algorithm
2.2 Distance Measures in K-means
2.3 Efficiency Improvements for K-means
2.4 Initial Centroid Selection in K-means
2.5 Setting the Number of Clusters in K-means
2.6 The Fuzzy C-means Algorithm
2.7 Others
2.8 Evolutionary Clustering Algorithms
3. Algorithm Architecture
3.1 Variable Definitions
3.2 Method
3.2.1 Algorithm Flow
3.3 Algorithm Description and Examples
3.3.1 Algorithm 1: K-means Clustering
3.3.2 Algorithm 2: Cluster Matching
3.3.3 Algorithm 3: Front-to-Back Cluster Adjustment
3.3.4 Algorithm 3: Back-to-Front Adjustment
3.3.5 Algorithm 4: Cost Computation
4 Experiments
4.1 Data Set Description
4.2 Data Preprocessing
4.3 Number of Clusters and Initial Centroid Selection
4.4 Experimental Procedure
4.5 Evaluation Methods
5 Experimental Results
5.1 Results on the Dow Jones Index Data
5.2 Results on the Air Quality Monitoring Data
5.3 Overview of Average Intra-cluster Cost, Inter-period Cost, and Total Cost
6 Conclusion
References
References 1. J. A. Hartigan and M. A. Wong, "A K-means clustering algorithm," Applied Statistics, Vol. 28, No. 1, pp. 100-108, 1979.
2. J. Mao and A. K. Jain, "A self-organizing network for hyperellipsoidal clustering (HEC)," IEEE Trans. Neural Networks, Vol. 7, No. 1, pp. 16-29, 1996.
3. Y. Linde, A. Buzo, and R. Gray, "An algorithm for vector quantizer design," IEEE Trans. Comm., Vol. 28, pp. 84-94, 1980.
4. H. Kashima, J. Hu, B. Ray, and M. Singh, "K-means clustering of proportional data using L1 distance," Proc. Internat. Conf. on Pattern Recognition, pp. 1-4, 2008.
5. A. Banerjee, S. Merugu, I. Dhillon, and J. Ghosh, "Clustering with Bregman divergences," J. Machine Learn. Res., pp. 234-245, 2004.
6. B. Scholkopf, A. Smola, and K.-R. Muller, "Nonlinear component analysis as a kernel eigenvalue problem," Neural Comput., Vol. 10, No. 5, pp. 1299-1319, 1998.
7. N. Shi, X. Liu, and Y. Guan, "Research on k-means clustering algorithm: An improved k-means clustering algorithm," Third International Symposium on Intelligent Information Technology and Security Informatics, 2010.
8. D. Pelleg and A. Moore, "Accelerating exact k-means algorithms with geometric reasoning," Proc. Fifth Internat. Conf. on Knowledge Discovery in Databases, AAAI Press, pp. 277-281, 1999.
9. N. Kaur, J. K. Sahiwal, and N. Kaur, "Efficient K-means clustering algorithm using ranking method in data mining," International Journal of Advanced Research in Computer Engineering & Technology (IJARCET).
10. M.-C. Hung, J. Wu, J.-H. Chang, and D.-L. Yang, "An efficient k-means clustering algorithm using simple partitioning," Journal of Information Science and Engineering, Vol. 21, No. 6, pp. 1157-1177, 2005.
11. K. A. Abdul Nazeer and M. P. Sebastian, "Improving the accuracy and efficiency of the k-means clustering algorithm," Proceedings of the World Congress on Engineering 2009, Vol. I.
12. K.-C. Wong, "A short survey on data clustering algorithms," Second International Conference on Soft Computing and Machine Intelligence (ISCMI), 2015.
13. R. B. Calinski and J. Harabasz, "A dendrite method for cluster analysis," Communications in Statistics, Vol. 3, pp. 1-27, 1974.
14. D. L. Davies and D. W. Bouldin, "A cluster separation measure," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 1, No. 4, pp. 224-227, 1979.
15. J. C. Dunn, "Well separated clusters and optimal fuzzy partitions," Journal of Cybernetics, Vol. 4, pp. 95-104, 1974.
16. S. Ray and R. H. Turi, "Determination of number of clusters in k-means clustering and application in colour image segmentation," Proc. 4th International Conference on Advances in Pattern Recognition and Digital Techniques (ICAPRDT '99), Calcutta, India, December 1999.
17. M. Halkidi, M. Vazirgiannis, and Y. Batistakis, "Quality scheme assessment in the clustering process," Proc. 4th European Conference on Principles of Data Mining and Knowledge Discovery, pp. 265-276, 2000.
18. D. Pelleg and A. Moore, "X-means: Extending k-means with efficient estimation of the number of clusters," Proc. Seventeenth Internat. Conf. on Machine Learning, pp. 727-734, 2000.
19. J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, 1981.
20. D. Pham, "An adaptive fuzzy C-means algorithm for image segmentation in the presence of intensity inhomogeneities," Pattern Recognition Letters, Vol. 20, pp. 57-68, 1999.
21. J. Noordam, W. van den Broek, and L. Buydens, "Geometrically guided fuzzy C-means clustering for multivariate image segmentation," Proc. International Conference on Pattern Recognition (ICPR '00), Vol. 1, pp. 462-465, 2000.
22. M. Ahmed, S. Yamany, N. Mohamed, A. Farag, and T. Moriarty, "A modified fuzzy C-means algorithm for bias field estimation and segmentation of MRI data," IEEE Transactions on Medical Imaging, Vol. 21, pp. 193-199, 2002.
23. S. Chen and D. Zhang, "Robust image segmentation using FCM with spatial constraints based on new kernel-induced distance measure," IEEE Transactions on Systems, Man and Cybernetics, Vol. 34, No. 4, pp. 1907-1916, 2004.
24. M. Steinbach, G. Karypis, and V. Kumar, "A comparison of document clustering techniques," KDD Workshop on Text Mining, 2000.
25. A. Choudhary, "Survey on k-means and its variants," International Journal of Innovative Research in Computer and Communication Engineering, Vol. 4, Issue 1, January 2016.
26. L. Kaufman and P. J. Rousseeuw, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley Series in Probability and Statistics, 2005.
27. G. Ball and D. Hall, "ISODATA, a novel method of data analysis and pattern classification," Stanford Research Institute, Stanford, CA, Tech. Rep., 1965.
28. J. Lin, M. Vlachos, E. Keogh, and D. Gunopulos, "Iterative incremental clustering of time series," Proc. International Conference on Extending Database Technology, pp. 106-122, 2004.
29. D. Chakrabarti, R. Kumar, and A. Tomkins, "Evolutionary clustering," Proc. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 554-560, 2006.
30. F. Folino and C. Pizzuti, "An evolutionary multiobjective approach for community discovery in dynamic networks," IEEE Transactions on Knowledge and Data Engineering, Vol. 26, No. 8, pp. 1838-1852, 2014.
Advisor 陳彥良 (Yen-Liang Chen)   Date of Approval 2019-6-19

For questions about this thesis, please contact the Promotion Services Division, National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.