Master's/Doctoral Thesis 111225008: Detailed Record




Author: 陸彥廷 (Yen-Ting Lu)   Department: Graduate Institute of Statistics
Thesis title: SNF效應的理論解釋和高影響力聚類特徵的識別
(Theoretical Explanation of the SNF Effects and Identification of High-Impact Clustering Features)
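Related theses
★ Application of Q-learning combined with supervised learning in the stock market ★ Trading strategies based on Q-learning and unsupervised learning
★ Visualizing state changes in the stock market ★ Exploring participant strategies in the renewable energy trading market using reinforcement learning
★ Portfolios based on I-score and Q-learning ★ A lagged multivariate Bayesian structural GARCH model under soft information and its applications
★ Portfolio optimization based on dynamic networks and vine copula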
Full text: available for browsing in the system after 2029-8-1
Abstract: This study examines high-dimensional data clustering from two perspectives. First, we provide a theoretical explanation of the effect of similarity network fusion (SNF) on clustering. SNF has been found to enhance clustering performance in many applications by fusing the similarity matrices computed from different feature sets. The proposed theoretical explanation offers a deeper understanding of how the fusion works and where its limitations lie. Next, we propose an iterative procedure, denoted by SF-MLR, that combines multinomial logistic regression with a high-dimensional information criterion to identify high-impact clustering features from a large feature set. We apply SF-MLR to several high-dimensional datasets. The numerical results show that SF-MLR can identify high-impact clustering features, which in turn helps improve clustering performance.
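As an illustration of what "fusing the similarity matrices computed from different feature sets" can look like, the following is a minimal sketch of a cross-diffusion style fusion for two feature sets. It is not the thesis's implementation. The function names (`affinity`, `knn_kernel`, `fuse`), the Gaussian kernel, the plain row normalization, and the toy data are all assumptions made for this example, and the normalization is simpler than the one used in published SNF.

```python
import numpy as np

def affinity(X, sigma=1.0):
    """Gaussian similarity matrix from pairwise squared Euclidean distances."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma**2))

def row_normalize(W):
    """Scale each row so that it sums to one."""
    return W / W.sum(axis=1, keepdims=True)

def knn_kernel(W, k):
    """Keep each row's k largest similarities (self included), then row-normalize."""
    S = np.zeros_like(W)
    idx = np.argsort(-W, axis=1)[:, :k]
    rows = np.arange(W.shape[0])[:, None]
    S[rows, idx] = W[rows, idx]
    return row_normalize(S)

def fuse(W1, W2, k=5, n_iter=10):
    """Simplified cross-diffusion: each view is smoothed through the other's status matrix."""
    P1, P2 = row_normalize(W1), row_normalize(W2)
    S1, S2 = knn_kernel(W1, k), knn_kernel(W2, k)
    for _ in range(n_iter):
        P1_new = S1 @ P2 @ S1.T  # update view 1 using view 2
        P2_new = S2 @ P1 @ S2.T  # update view 2 using view 1
        P1, P2 = row_normalize(P1_new), row_normalize(P2_new)
    fused = (P1 + P2) / 2.0
    return (fused + fused.T) / 2.0  # symmetrize the averaged matrix

# Toy data (hypothetical): two clusters of 10 points observed through two feature sets.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 10)
centers = 4.0 * labels[:, None]
X1 = centers + 0.5 * rng.standard_normal((20, 2))
X2 = centers + 0.5 * rng.standard_normal((20, 3))

fused = fuse(affinity(X1), affinity(X2))
```

The fused matrix can then be passed to any clustering method that accepts a similarity matrix, such as spectral clustering. In this toy setting, points from the same cluster should end up with larger fused similarity than points from different clusters.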
Keywords ★ clustering
★ fusion
★ high-dimensional information criterion
★ multinomial logistic regression
Table of Contents
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Literature Review
2.1 Similarity Network Fusion
2.2 High-dimensional Information Criterion
2.3 Multinomial Logistic Regression
2.4 Metrics
2.5 Segmentation-Fusion Clustering Method
Chapter 3 Methodology
3.1 Theoretical Explanation of the SNF Effects
3.2 Feature Selection
Chapter 4 Numerical Results
4.1 Simulation Study
4.2 Data
4.3 Empirical Study
Chapter 5 Discussion
Appendix
Reference
Advisor: 黃士峰 (Shih-Feng Huang)   Approval date: 2024-7-10

For questions about this thesis, please contact the Promotion Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407.