Principal Components on t-SNE

Name

傅維康 (Connor Wei Fu)

Department

Graduate Institute of Statistics

Thesis Title

Principal Components on t-SNE

Related Theses

★ Gamma-EM Clustering for Longitudinal Data
★ Contrastive Principal Component Analysis for High Dimension, Low Sample Size Data
★ Bayesian method for sparse principal component analysis
★ Sparse Bayesian Estimation with High-dimensional Binary Response Data
★ Application of Q-Learning Combined with Supervised Learning to the Stock Market
★ γ-EM approach to latent orientations for cryo-electron microscopy image clustering analysis
★ Contrastive Principal Component Analysis for High-Dimension, Low-Sample-Size Data with Noise-Reduction
★ Trading Strategies Based on Q-learning and Unsupervised Learning
★ Visualizing State Changes in the Stock Market

Files

Full text available for browsing in the system after 2026-7-1

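Abstract (Chinese)

Among the many visualization methods, t-distributed stochastic neighbor embedding (t-SNE)
is one of the most effective and widely used techniques. Visually, t-SNE can reveal the
structure of a high-dimensional data set in a 2- or 3-dimensional space; however, it offers
little further interpretation of the data. In contrast, principal component analysis (PCA)
is sufficiently interpretable but produces poorer visualizations. In this thesis, we propose
a new method that combines the ideas of t-SNE and PCA, aiming to retain good visualization
results while improving the interpretability of the data. By searching for features related
to the clusters found by t-SNE, we obtain principal components that explain the t-SNE
mapping. This approach not only increases the interpretability and practical value of t-SNE
but also offers a new perspective for data visualization research. In the numerical study,
we reduce the dimension of the data using the principal components obtained from the
proposed method and from PCA, and then rerun the t-SNE algorithm for visualization. The
reconstructed visualizations show that the principal components found by PCA cannot
effectively recover the t-SNE mapping, whereas our method not only recovers it but can even
provide better visualization.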

Abstract (English)

t-distributed stochastic neighbor embedding (t-SNE) is one of the most effective and
widely used visualization methods. It can visualize the structure of high-dimensional
data by giving each data point a location in a 2D or 3D map. However, it offers little
further interpretability of the data. On the other hand, principal component analysis
(PCA) provides sufficient interpretability but yields inferior visualization. In this paper,
we propose a novel approach that combines the concepts of t-SNE and PCA to preserve
good visualization results while keeping the interpretability of the data. By searching for
features that are correlated with the clustering performed by t-SNE, we can obtain dedicated
principal components for t-SNE. This method not only improves the interpretability and
applicability of t-SNE but also provides new insights for data visualization research. In
our numerical study, we reduce the data using the principal components from our method and
from PCA, and then reapply the t-SNE algorithm for visualization. The reconstructed results
demonstrate that the principal components identified by PCA fail to effectively reproduce
the t-SNE mappings, while our method not only achieves successful reconstruction but also
offers superior visualization outcomes.
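
The PCA baseline from the numerical study (project the data onto principal components, then rerun t-SNE) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses scikit-learn's `PCA` and `TSNE`, a hypothetical three-cluster toy data set, and arbitrary settings (two components, perplexity 30, random initialization). It does not reproduce the thesis's proposed method, its data, or its experimental setup.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE


def make_clusters(n_per_cluster=50, n_features=20, seed=0):
    """Hypothetical toy data: three Gaussian clusters in 20 dimensions.

    Only the first two features separate the clusters; the remaining
    features are pure noise. This is not the thesis's simulated data.
    """
    rng = np.random.default_rng(seed)
    centers = np.zeros((3, n_features))
    centers[0, 0] = 6.0  # cluster 0 is shifted along feature 0
    centers[1, 1] = 6.0  # cluster 1 is shifted along feature 1
    X = np.vstack(
        [c + rng.normal(size=(n_per_cluster, n_features)) for c in centers]
    )
    labels = np.repeat(np.arange(3), n_per_cluster)
    return X, labels


def tsne_map(X, seed=0):
    """Embed the rows of X into 2D with t-SNE (settings are assumptions)."""
    tsne = TSNE(n_components=2, perplexity=30.0, init="random", random_state=seed)
    return tsne.fit_transform(X)


def pca_then_tsne(X, n_components=2, seed=0):
    """PCA baseline: project onto the leading components, then rerun t-SNE."""
    pca = PCA(n_components=n_components)
    Z = pca.fit_transform(X)  # reduced data, shape (n_samples, n_components)
    return pca, Z, tsne_map(Z, seed=seed)


X, labels = make_clusters()
Y_full = tsne_map(X)               # t-SNE map of the original data
pca, Z, Y_pca = pca_then_tsne(X)   # t-SNE map of the PCA-reduced data

# Each row of pca.components_ is a loading vector: it shows which original
# features a component weights, which is the interpretability PCA offers.
print(Y_full.shape, Y_pca.shape)
print(np.round(pca.components_, 2))
```

Plotting `Y_full` and `Y_pca` side by side, colored by `labels`, is one way to judge whether the reduced data reproduces the original map; the same comparison could be made with loadings from any other method in place of `pca.components_`.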

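Keywords (Chinese)

★ High-dimensional data
★ Interpretability
★ Principal component analysis
★ t-distributed stochastic neighbor embedding
★ Visualization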

Keywords (English)

Table of Contents

1 Introduction
2 Review
2.1 Principal Component Analysis
2.2 t-Distributed Stochastic Neighborhood Embedding
3 Method
3.1 Minimization on the Stiefel manifold
3.2 Principal Components of t-SNE
4 Experiments
4.1 Experimental Setup
4.2 Simulated Data
4.3 Word Vector Data
4.4 MNIST Data
5 Conclusion
A Proofs
A.1 Properties of the Cayley Transform
A.1.1
A.1.2
A.2 Derivation of the Gradient
References

Advisor

王紹宣

Approval Date

2023-7-26