Master's/Doctoral Thesis 107423066: Detailed Information




Name: Yu-Hsin Chen (陳鈺欣)    Department: Information Management
Thesis Title: An Efficient Cluster-Based Continual Learning with Gradient Episodic Cache Memory
Related Theses
★ 台灣50走勢分析:以多重長短期記憶模型架構為基礎之預測★ 以多重遞迴歸神經網路模型為基礎之黃金價格預測分析
★ 增量學習用於工業4.0瑕疵檢測★ 遞回歸神經網路於電腦零組件銷售價格預測之研究
★ 長短期記憶神經網路於釣魚網站預測之研究★ 基於深度學習辨識跳頻信號之研究
★ Opinion Leader Discovery in Dynamic Social Networks★ 深度學習模型於工業4.0之機台虛擬量測應用
★ A Novel NMF-Based Movie Recommendation with Time Decay★ 以類別為基礎sequence-to-sequence模型之POI旅遊行程推薦
★ A DQN-Based Reinforcement Learning Model for Neural Network Architecture Search★ Neural Network Architecture Optimization Based on Virtual Reward Reinforcement Learning
★ 生成式對抗網路架構搜尋★ 以漸進式基因演算法實現神經網路架構搜尋最佳化
★ Enhanced Model Agnostic Meta Learning with Meta Gradient Memory★ 遞迴類神經網路結合先期工業廢水指標之股價預測研究
Files: Full text available in the online system after 2026-07-20.
Abstract (Chinese) In recent years, deep learning has become increasingly popular and has been widely applied across many fields with excellent results. However, deep learning usually cannot achieve the expected results when only a few training samples are available, and, like humans, it should be able to draw on past experience to learn new tasks quickly. The importance of continual learning has therefore grown significantly; its main goal is to learn new tasks without forgetting what was learned in the past. First, we propose a method called Gradient Episodic Cache Memory, which combines clustering techniques to address the storage and computation problems of Gradient Episodic Memory. Second, we evaluate the model on the CIFAR-10, CIFAR-100, and MNIST Permutations datasets; the experimental results show that GECM outperforms other continual learning models and also strikes a good balance between accuracy and efficiency.
Abstract (English) In recent years, deep learning has become increasingly popular. It has been widely used in various fields and has achieved outstanding results. However, deep learning usually fails to achieve the expected results when only a few training samples are available, and it should be able to use past experience to learn new tasks quickly, as humans do. The importance of continual learning has therefore increased significantly; its main goal is to learn new tasks without forgetting what has been learned in the past. First, we propose a method called Gradient Episodic Cache Memory (GECM), which builds on the Gradient Episodic Memory framework and combines it with clustering techniques to resolve the memory and computation problems of Gradient Episodic Memory. Second, we evaluate our model on the CIFAR-10, CIFAR-100, and MNIST Permutations datasets. The experimental results show that GECM performs better than other state-of-the-art continual learning models and strikes a good balance between accuracy and efficiency.
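The abstract describes the core idea as compressing each past task's episodic memory with clustering and then enforcing a GEM-style gradient constraint against the compressed cache. The thesis's exact algorithm is not reproduced on this page; the following Python sketch is only a minimal illustration under assumed choices (k-means for the clustering step and a single-reference, A-GEM-style projection instead of GEM's full quadratic program), and all function names are hypothetical.

```python
# Minimal sketch, NOT the thesis's actual GECM implementation.
import numpy as np
import torch
from sklearn.cluster import KMeans  # assumption: k-means as the clustering step


def build_cache_memory(features: np.ndarray, labels: np.ndarray, n_clusters: int = 10):
    """Cluster a task's stored samples and keep one representative per cluster,
    so the episodic memory shrinks from all samples to n_clusters entries."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(features)
    reps, rep_labels = [], []
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]
        if len(idx) == 0:
            continue
        # keep the sample closest to the centroid as the cluster representative
        dist = np.linalg.norm(features[idx] - km.cluster_centers_[c], axis=1)
        best = idx[np.argmin(dist)]
        reps.append(features[best])
        rep_labels.append(labels[best])
    return np.stack(reps), np.array(rep_labels)


def gem_project(g: torch.Tensor, g_ref: torch.Tensor) -> torch.Tensor:
    """GEM-style constraint on flattened gradients: if the current-task gradient g
    conflicts with the gradient g_ref computed on the cached representatives,
    remove the conflicting component so the cached loss is not increased."""
    dot = torch.dot(g, g_ref)
    if dot < 0:
        g = g - (dot / torch.dot(g_ref, g_ref)) * g_ref
    return g
```

During training on a new task, one would flatten the mini-batch gradient and the gradient computed on the cached representatives into vectors, call gem_project, and write the projected gradient back into the model parameters before the optimizer step; the clustering step is what keeps the memory footprint and the per-step constraint cost small compared with storing every past sample.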
Keywords (Chinese) ★ 機器學習
★ 深度學習
★ 連續學習
★ 聚類分析
Keywords (English) ★ Machine Learning
★ Deep Learning
★ Continual Learning
★ Cluster Analysis
Table of Contents Chinese Abstract ii
Abstract iii
Acknowledgements iv
Table of contents v
List of Figures vi
List of Tables vii
1. Introduction 1
2. Related Work 5
2.1 Continual Learning 5
2.2 Meta-Learning 8
3. Methodology 9
3.1 Model Architecture 10
3.2 Data Cluster 11
3.3 GEM learning 12
4. Performance Evaluation 16
4.1 Analysis on Accuracy Performance 18
4.2 Comparison of the Number of Tasks 20
4.3 Comparison of Episodic Memory Size and Accuracy 23
4.4 Comparison of the Number of Clusters and Accuracy 26
4.5 Comparison of Sample Distance and Accuracy 29
4.6 Discussion on Parameter Setting 31
5. Conclusion 34
References 35
References [1] R. Aljundi, F. Babiloni, M. Elhoseiny, M. Rohrbach, and T. Tuytelaars, “Memory aware synapses: Learning what (not) to forget,” Proceedings of the European Conference on Computer Vision (ECCV), pp. 139–154, 2018.
[2] R. Aljundi, P. Chakravarty, and T. Tuytelaars, “Expert gate: Life-long learning with a network of experts,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3366– 3375, 2017.
[3] R. Aljundi, K. Kelchtermans, and T. Tuytelaars, “Task-free continual learning,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11254–11263, 2019.
[4] R. Aljundi, M. Lin, B. Goujaud, and Y. Bengio, “Gradient based sample selection for online continual learning,” Proceedings of the 33rd Conference on Neural Information Processing Systems (NIPS), pp. 11816–11825, 2019.
[5] A. Antoniou, H. Edwards, and A. Storkey, “How to train your MAML,” Proceedings of the International Conference on Learning Representations (ICLR), 2019.
[6] A. Chaudhry, P. Dokania, T. Ajanthan, and P. H. Torr, “Riemannian walk for incremental learning: Understanding forgetting and intransigence,” Proceedings of the European Conference on Computer Vision (ECCV), pp. 532–547, 2018.
[7] A. Chaudhry, M. Ranzato, M. Rohrbach, M. Elhoseiny, “Efficient Lifelong Learning with A-GEM”, Proceedings of the International Conference on Learning Representations (ICLR), 2019.
[8] A. Chaudhry, M. Rohrbach, M. Elhoseiny, T. Ajanthan, P. K. Dokania, P. Torr, and M. Ranzato, “Continual learning with tiny episodic memories,” Preprint arXiv:1902.10486, 2019.
[9] Z. Chen and B. Liu, “Lifelong machine learning,” Synthesis Lectures on Artificial Intelligence and Machine Learning, vol. 12, no. 3, pp. 1–207, 2018.
[10] Z. Chen and D. Wang, “Multi-initialization meta-learning with domain adaptation,” Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2021.
[11] G. Denevi, M. Pontil, and C. Ciliberto, “The advantage of conditional meta-learning for biased regularization and fine tuning,” Advances in Neural Information Processing Systems, 2020.
[12] C. Finn, P. Abbeel, and S. Levine, “Model-agnostic meta-learning for fast adaptation of deep networks,” Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 1126–1135, 2017.
[13] S. Flennerhag, A. Rusu, R. Pascanu, F. Visin, H. Yin, and R. Hadsell, “Meta-learning with warped gradient descent,” Proceedings of the International Conference on Learning Representations (ICLR), 2020.
[14] R. French, “Using semi-distributed representations to overcome catastrophic forgetting in connectionist networks,” Proceedings of the 13th Annual Cognitive Science Society Conference, pp. 173–178, 1991.
[15] S. Gidaris and N. Komodakis, “Dynamic few-shot visual learning without forgetting,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4367–4375, 2018.
[16] I. Goodfellow, M. Mirza, D. Xiao, A. Courville, Y. Bengio, “An empirical investigation of catastrophic forgetting in gradient-based neural networks,” Preprint arXiv:1312.6211, 2013.
[17] X. He, J. Sygnowski, A. Galashov, A. Rusu, Y. Teh, and R. Pascanu, “Task agnostic continual learning via meta learning,” Preprint arXiv:1906.05201, 2019.
[18] J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, et al. “Overcoming catastrophic forgetting in neural networks,” Proceedings of the National Academy of Sciences, vol. 114, no. 13, pp. 3521–3526, 2017.
[19] A. Krizhevsky and G. Hinton, “Learning multiple layers of features from tiny images,” https://www.cs.toronto.edu/~kriz/cifar.html, 2009.
[20] B. Lake, T. Ullman, J. Tenenbaum, and S. Gershman, “Building machines that learn and think like people,” Behavioral and Brain Sciences, pp. 1–101, 2016.
[21] Y. LeCun, C. Cortes, and C. J. Burges, “The MNIST database of handwritten digits,” http://yann.lecun.com/exdb/mnist/, 1998.
[22] Z. Li, and D. Hoiem, “Learning without forgetting,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2935–2947, 2018.
[23] Z. Li, F. Zhou, F. Chen, and H. Li, “Meta-sgd: Learning to learn quickly for few shot learning,” Preprint arXiv:1707.09835, 2017.
[24] D. Lopez-Paz and M. Ranzato, “Gradient episodic memory for continual learning,” Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS), pp. 6467–6476, 2017.
[25] M. McCloskey and N. Cohen, “Catastrophic interference in connectionist networks: The sequential learning problem,” The psychology of learning and motivation, vol. 24, pp. 109–165, 1989.
[26] S. Mirzadeh, M. Farajtabar, and H. Ghasemzadeh, “Dropout as an implicit gating mechanism for continual learning,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 232–233, 2020.
[27] S. Mirzadeh, M. Farajtabar, R. Pascanu, and H. Ghasemzadeh, “Understanding the role of training regimes in continual learning,” Advances in Neural Information Processing Systems, 2020.
[28] A. Nichol and J. Schulman, “Reptile: a scalable metalearning algorithm,” Preprint arXiv:1803.02999, 2018.
[29] B. Oreshkin, P. López, and A. Lacoste, “Tadam: Task dependent adaptive metric for improved few-shot learning,” In Advances in Neural Information Processing Systems, pp. 719–729, 2018.
[30] V. Ramasesh, E. Dyer, and M. Raghu, “Anatomy of catastrophic forgetting: Hidden representations and task semantics,” Proceedings of the International Conference on Learning Representations (ICLR), 2020.
[31] A. Rannen, R. Aljundi, M. Blaschko, and T. Tuytelaars, “Encoder based lifelong learning,” Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1320–1328, 2017.
[32] S. Ravi, and H. Larochelle, “Optimization as a model for few-shot learning,” In International Conference on Learning Representations (ICLR), 2017.
[33] S. Rebuffi, A. Kolesnikov, G. Sperl, and C. Lampert, “iCaRL: Incremental Classifier and Representation Learning,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2001–2010, 2017.
[34] M. Riemer, I. Cases, R. Ajemian, M. Liu, I. Rish, Y. Tu, G. Tesauro, “Learning to learn without forgetting by maximizing transfer and minimizing interference,” Proceedings of the International Conference on Learning Representations, 2019.
[35] M. Ring, “Child: A first step towards continual learning,” Machine Learning, vol. 28, no. 1, pp. 77–104, 1997.
[36] D. Rolnick, A. Ahuja, J. Schwarz, T. Lillicrap, G. Wayne, “Experience replay for continual learning,” Advances in Neural Information Processing Systems, pp. 350–360, 2019.
[37] A. Rusu, N. Rabinowitz, G. Desjardins, H. Soyer, J. Kirkpatrick, K. Kavukcuoglu, R. Pascanu, and R. Hadsell, “Progressive neural networks,” Preprint arXiv:1606.04671, 2016.
[38] J. Schmidhuber, “Evolutionary principles in self-referential learning. On learning how to learn: The meta-meta-... hook,” Diploma thesis, Institut f. Informatik, Tech. Univ. Munich, 1987.
[39] H. Shin, J. Lee, J. Kim, and J. Kim, “Continual learning with deep generative replay,” Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS), pp. 2990–2999, 2017.
[40] K. Shmelkov, C. Schmid, K. Alahari, “Incremental Learning of Object Detectors without Catastrophic Forgetting,” Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 3400–3409, 2017.
[41] J. Snell, K. Swersky, and R. Zemel, “Prototypical networks for few-shot learning,” In Advances in Neural Information Processing Systems, pp. 4077– 4087, 2017.
[42] F. Sung, Y. Yang, L. Zhang, T. Xiang, P. Torr, and T. Hospedales, “Learning to compare: Relation network for few-shot learning,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1199–1208, 2018.
[43] S. Thrun, “Lifelong learning algorithms,” Learning to learn, pp. 181–209, 1998.
[44] N. Tripuraneni, C. Jin, and M. Jordan, “Provable meta-learning of linear representations,” Preprint arXiv:2002.11684, 2020.
[45] P. Utgoff, “Shift of bias for inductive concept learning,” Machine learning: An artificial intelligence approach, pp. 107–148, 1986.
[46] O. Vinyals, C. Blundell, T. Lillicrap, K. Kavukcuoglu, and D. Wierstra, “Matching networks for one shot learning,” In Advances in Neural Information Processing Systems, pp. 3630– 3638, 2016.
[47] D. Yin, M. Farajtabar, and A. Li, “SOLA: Continual learning with second-order loss approximation,” Preprint arXiv:2006.10974, 2020.
[48] J. Yoon, E. Yang, J. Lee, S. Hwang, “Lifelong Learning with Dynamically Expandable Networks”, Proceedings of the International Conference on Learning Representations (ICLR), 2018.
[49] F. Zenke, B. Poole, and S. Ganguli, “Continual learning through synaptic intelligence,” Proceedings of the 34th International Conference on Machine Learning (ICML), vol. 70, pp. 3987–3995, 2017.
Advisor: Yi-Cheng Chen (陳以錚)    Date of Approval: 2021-07-20
