Cross Data Matrix-Based PCA: Theory and Applications


Please use this permanent URL to cite or link to this document: http://ir.lib.ncu.edu.tw/handle/987654321/84661

Title:	Cross Data Matrix-Based PCA: Theory and Applications
Author:	王紹宣
Contributor:	Graduate Institute of Statistics
Keywords:	Asymptotic normality; cross data matrix; high dimension; low sample size; principal component analysis; random matrix; spiked covariance model; perturbation
Date:	2020-12-08
Upload time:	2020-12-09 10:39:29 (UTC+8)
Publisher:	Ministry of Science and Technology
Abstract:	Principal component analysis (PCA) is a useful and important tool for dimension reduction. Yata and Aoshima (2010) proposed a cross data matrix (CDM)-based PCA for the high-dimension, low-sample-size setting. CDM-PCA has been shown to have a broader consistency region than conventional PCA for the leading eigenvalues (Yata and Aoshima, 2010; Aoshima et al., 2018) and the same consistency region for the eigenvectors (Wang et al., 2020). In numerical studies, CDM-PCA outperforms PCA on some high-dimensional, highly correlated data sets (Yata and Aoshima, 2010; Wang et al., 2020). These results suggest that CDM-PCA has great potential for improving PCA in high-dimensional data, but no clear theory yet supports the numerical findings. This project aims to provide a theoretical explanation for the better performance of CDM-PCA on high-dimensional, highly correlated data, and to develop a quantitative index to guide the choice between CDM-PCA and PCA. We will also investigate other theoretical properties of CDM-PCA. For example, it is well known that, in the random matrix setting, the empirical distribution of the eigenvalues of a sample covariance matrix with standard Gaussian entries converges weakly to the Marchenko-Pastur distribution; an interesting question is what the corresponding asymptotic behavior of CDM-PCA is. In addition, the project will incorporate the CDM design into other dimension reduction methods, such as MPCA and 2SDR, which are commonly used in image processing. The design may further be used in machine learning to improve algorithms in artificial intelligence.
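Relation:	Science and Technology Policy Research and Information Center, National Applied Research Laboratories
Appears in collections:	[Graduate Institute of Statistics] Research Projects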
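
The abstract contrasts CDM-PCA with conventional PCA but the record does not spell out the construction. The following is a minimal sketch, assuming the usual description of Yata and Aoshima (2010): the sample is split into two halves, each half is centered, and the singular values of the scaled cross-product matrix serve as eigenvalue estimates. The function names `pca_eigenvalues` and `cdm_singular_values`, the split rule, and the simulated spiked data are illustrative assumptions, not taken from the project itself.

```python
import numpy as np


def pca_eigenvalues(X):
    """Eigenvalues of the sample covariance matrix, largest first.

    X has shape (d, n): d variables, n samples. The n x n dual matrix is
    used because it shares its nonzero eigenvalues with the d x d sample
    covariance matrix and is much smaller when d >> n.
    """
    d, n = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    S_dual = Xc.T @ Xc / (n - 1)
    return np.sort(np.linalg.eigvalsh(S_dual))[::-1]


def cdm_singular_values(X):
    """Singular values of the cross data matrix, largest first.

    Assumed construction: split the n samples into halves of size
    n1 = ceil(n / 2) and n2 = n - n1, center each half separately, and
    scale the n1 x n2 cross product by sqrt((n1 - 1) * (n2 - 1)).
    """
    d, n = X.shape
    n1 = int(np.ceil(n / 2))
    n2 = n - n1
    X1, X2 = X[:, :n1], X[:, n1:]
    X1c = X1 - X1.mean(axis=1, keepdims=True)
    X2c = X2 - X2.mean(axis=1, keepdims=True)
    S_cross = X1c.T @ X2c / np.sqrt((n1 - 1) * (n2 - 1))
    return np.linalg.svd(S_cross, compute_uv=False)


if __name__ == "__main__":
    # Hypothetical spiked covariance model: one large eigenvalue, the rest 1.
    rng = np.random.default_rng(0)
    d, n, spike = 2000, 20, 200.0
    scales = np.ones(d)
    scales[0] = np.sqrt(spike)
    X = scales[:, None] * rng.standard_normal((d, n))

    print("true leading eigenvalue:", spike)
    print("PCA estimate:", pca_eigenvalues(X)[0])
    print("CDM estimate:", cdm_singular_values(X)[0])
```

The intuition usually given for the design is that, when d is much larger than n, the noise adds a term of order d / (n - 1) to every eigenvalue of the dual sample covariance matrix, whereas the cross product of two independent halves carries no such additive term. Whether this explains the numerical behavior on highly correlated data is the open question the project addresses.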

