A Large-scale Comparison of Customized Feature Encodings under Semi-supervised Learning

Name

Zheng-Hao Chen (陳正浩)

Department

Graduate Institute of Software Engineering

Thesis Title

A Large-scale Comparison of Customized Feature Encodings under Semi-supervised Learning
(半監督學習下自定義編碼特徵的大規模比較)

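This electronic thesis is authorized for immediate open access.
Full text that has reached its open-access date is licensed to users solely for personal, non-profit searching, reading, and printing for the purpose of academic research.
Please comply with the Copyright Act of the Republic of China. To avoid breaking the law, do not reproduce, distribute, adapt, repost, or broadcast the work without authorization.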

Abstract

Custom encoding methods can effectively improve the performance of deep learning models on supervised tasks. However, their effectiveness in self-supervised contrastive learning has not yet been validated at scale. This thesis designs and implements a flexible framework for custom feature encoding that lets researchers compare different encoding methods on self-supervised tasks at large scale. We also propose a new encoding method and explore its potential and application value across different datasets.
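
For readers unfamiliar with what a custom encoding of a numerical feature looks like, the following is a minimal sketch of piecewise linear encoding, one of the encodings named in the thesis's table of contents (Section 2.2.1). It is not taken from the thesis: the function names `quantile_bin_edges` and `piecewise_linear_encode`, the quantile-based choice of bin edges, and the NumPy implementation are all assumptions made for illustration.

```python
import numpy as np


def quantile_bin_edges(values, n_bins):
    """Hypothetical helper: bin edges at evenly spaced quantiles of one feature."""
    quantiles = np.linspace(0.0, 1.0, n_bins + 1)
    edges = np.quantile(np.asarray(values, dtype=float), quantiles)
    # np.unique sorts and drops repeated edges, so every bin has positive width.
    return np.unique(edges)


def piecewise_linear_encode(x, edges):
    """Encode a 1-D array of scalars into one value per bin.

    Assumes `edges` is strictly increasing with at least two entries.
    For each bin [left, right] the raw value is (x - left) / (right - left).
    Bins entirely below x saturate at 1 and bins entirely above x at 0;
    the first bin is not clipped from below and the last bin is not
    clipped from above, so out-of-range inputs stay distinguishable.
    """
    x = np.asarray(x, dtype=float)[:, None]       # shape (n_samples, 1)
    edges = np.asarray(edges, dtype=float)
    left, right = edges[:-1], edges[1:]           # shape (n_bins,) each
    enc = (x - left) / (right - left)             # shape (n_samples, n_bins)
    enc[:, 1:] = np.maximum(enc[:, 1:], 0.0)      # every bin but the first: floor at 0
    enc[:, :-1] = np.minimum(enc[:, :-1], 1.0)    # every bin but the last: cap at 1
    return enc


# Usage: three bins with edges 0, 1, 2, 4.
example = piecewise_linear_encode([0.5, 3.0], [0.0, 1.0, 2.0, 4.0])
```

With these edges, the input 0.5 lies halfway through the first bin and 3.0 lies halfway through the last one, so the two rows of `example` are `[0.5, 0, 0]` and `[1, 1, 0.5]`. Each scalar becomes a short vector that a downstream encoder can consume in place of the raw value.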

Keywords

★ Self-supervised learning
★ Contrastive learning
★ Tabular data
★ Custom encoding

Table of Contents

Abstract (Chinese)
Abstract
Acknowledgements
Contents
Symbols and Definitions
1. Introduction
2. Related Work
2.1 SCARF
2.1.1 Data Augmentation by Feature Corruption
2.1.2 The SCARF Method
2.2 Numerical Feature Encoding
2.2.1 Piecewise Linear Encoding
2.2.2 Piecewise Linear Encoding Method
2.2.3 Periodic Activation Functions
2.2.4 Periodic Activation Function Method
3. Framework Design
3.1 Design Philosophy
3.2 Framework Architecture
3.2.1 Core Components
3.2.2 Interaction Between Modules
3.3 Application Programming Interface (API)
4. Model and Methods
4.1 Overall Model Architecture
4.2 Standard Deviation Encoding
4.3 Custom Encoder Design
4.3.1 Basic Training
4.3.2 Quick Experiments
4.3.3 Custom Models
5. Experimental Results and Analysis
5.1 Experimental Environment, Parameter Details, and Settings
5.2 Datasets
5.3 Experimental Results
5.3.1 Binary Classification
5.3.2 Multiclass Classification
5.3.3 High-dimensional Feature Classification
5.3.4 Large Dataset Classification
5.4 Discussion
6. Conclusion
6.1 Conclusions
6.2 Future Work
References
Appendix A Experiment Code
Appendix B Datasets

Advisor

Hung-Hsuan Chen (陳弘軒)

Approval Date

2024-7-30
