Master's/Doctoral Thesis 107423038 — Detailed Record

Name: 劉旻融 (Ming-Rong Liu)    Department: Information Management (資訊管理學系)
Title (English): Enhanced Model Agnostic Meta Learning with Meta Gradient Memory
★ Optimizing Neural Architecture Search with a Progressive Genetic Algorithm (以漸進式基因演算法實現神經網路架構搜尋最佳化)
Files: [EndNote RIS format]    [BibTeX format]    [Related Articles]    [Article Citations]    [Full Record]    [Library Catalog]    Full text viewable in the system after 2023-7-31.
Abstract (Chinese): To reach a satisfactory level of accuracy, today's deep learning models often require thousands or even tens of thousands of training samples, and teaching a model classes it has never seen before usually means retraining it from scratch. These practical demands have brought growing attention to fields such as meta-learning and continual learning. Meta-learning is known for its flexible model adaptation, but the high instability of its training process makes its performance unreliable; continual learning, by contrast, is highly stable, but that stability limits the number of tasks it can learn. This thesis therefore combines meta-learning and continual learning, two families of algorithms that excel at few-shot learning: continual learning is used to improve the stability of meta-learning, while meta-learning improves the learning flexibility of continual learning. Past deep learning research has also identified the so-called stability-plasticity dilemma, a trade-off under which the two properties usually cannot be obtained at once; in our experiments, however, the proposed model simultaneously improves both test accuracy and validation accuracy on datasets commonly used in few-shot learning.
Abstract (English): Recently, the importance of the few-shot learning field has grown markedly, and a variety of learning methods, such as meta-learning and continual learning, have been proposed to address it. Their main purpose is to train a model with only a small amount of data while maintaining high generalization ability. MAML, an elegant and effective meta-learning method, demonstrates strong performance in Omniglot and Mini-ImageNet N-way K-shot classification experiments. However, recent research points out the instability of MAML's performance, as well as architectural problems in related models. On the other hand, continual learning models usually face catastrophic forgetting when they must learn new tasks while retaining knowledge of previous ones. We therefore propose En-MAML, a method built on the MAML framework that combines the flexible adaptation of meta-learning with the stable performance of continual learning. We evaluate our model on the Omniglot and Mini-ImageNet datasets, following the N-way K-shot experimental protocol. Our results show that the model achieves higher accuracy and stability on both Omniglot and Mini-ImageNet.
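The bi-level MAML update that the abstract refers to can be sketched in a few lines. This is a minimal first-order illustration on a hypothetical one-parameter regression task family; the task distribution, model, step sizes, and iteration counts are illustrative assumptions, not the thesis's En-MAML setup, and full MAML additionally backpropagates through the inner step, which this sketch approximates away.

```python
import random

def loss_grad(w, x, y):
    # Gradient of the squared-error loss for a 1-parameter model f(x) = w * x.
    return 2.0 * (w * x - y) * x

def inner_adapt(w, support, alpha=0.01):
    # "Inner loop": task-specific adaptation on the support set.
    for x, y in support:
        w = w - alpha * loss_grad(w, x, y)
    return w

def meta_train(n_iters=2000, beta=0.05, seed=0):
    rng = random.Random(seed)
    w = 0.0  # meta-parameters: the shared initialization being learned
    for _ in range(n_iters):
        slope = rng.uniform(1.0, 3.0)                 # sample a task
        xs = [rng.uniform(-1.0, 1.0) for _ in range(5)]
        support = [(x, slope * x) for x in xs]        # K-shot support set
        query = [(x, slope * x) for x in xs]          # query set
        w_adapted = inner_adapt(w, support)
        # "Outer loop": update the initialization with the query-set gradient
        # taken at the adapted parameters (first-order MAML approximation).
        g = sum(loss_grad(w_adapted, x, y) for x, y in query) / len(query)
        w = w - beta * g
    return w
```

After training, the learned initialization lies near the center of the sampled slope range, so a single inner step is enough to adapt to any individual task; this is the "fast adaptation" property the abstract attributes to MAML.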
Keywords (Chinese): ★ Deep Learning
★ Machine Learning
★ Meta-Learning
★ Continual Learning
Table of Contents: Chinese Abstract ii
Abstract iii
Table of Contents iv
I. Introduction 1
II. Related Work 5
2.1 Meta-Learning 5
2.1.1 Metric Based 5
2.1.2 Gradient Based 6
2.2 Continual Learning 7
III. Methodology 8
3.1 En-MAML architecture 10
3.2 Meta-Learning with Meta Gradient Algorithm 12
3.3 The loss function of En-MAML 13
IV. Performance Evaluation 14
4.1 Datasets 14
4.2 Performance Comparison 15
4.3 The Comparison of Stability and Accuracy with MAML 19
4.4 The Effectiveness of Combining Meta-Learning with Continual Learning 23
4.5 The Meta Gradient Buffer Operation and Setting 28
4.6 Hyper-Parameter Setting 29
V. Conclusion 31
References 32
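The meta gradient memory of Sections 3.2 and 4.5 is related to gradient-memory ideas such as Gradient Episodic Memory [19] in the reference list. As a hedged illustration of that general mechanism (an assumption for illustration, not En-MAML's actual buffer operation): when only a single reference gradient from stored past-task examples is kept, GEM's constraint g·g_ref ≥ 0 reduces to a simple half-space projection.

```python
def project_gradient(g, g_ref):
    # GEM-style projection: if the proposed gradient g would increase loss on
    # past tasks (negative dot product with the memory gradient g_ref),
    # project g onto the closest vector satisfying g . g_ref >= 0.
    dot = sum(a * b for a, b in zip(g, g_ref))
    if dot >= 0:
        return g  # no interference with past tasks: keep g unchanged
    ref_sq = sum(b * b for b in g_ref)
    return [a - (dot / ref_sq) * b for a, b in zip(g, g_ref)]
```

For example, a gradient of [1, -1] that conflicts with a memory gradient of [0, 1] is projected to [1, 0], removing the component that would undo past learning while keeping the rest of the update.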
References: [1] Alex Nichol and John Schulman. (2018). Reptile: A scalable meta-learning algorithm. arXiv preprint arXiv:1803.02999.
[2] Andrei A Rusu, Dushyant Rao, Jakub Sygnowski, Oriol Vinyals, Razvan Pascanu, Simon Osindero, and Raia Hadsell. (2018). Meta-learning with latent embedding optimization. arXiv:1807.05960.
[3] Antreas Antoniou, Harrison Edwards, and Amos Storkey. (2019). How to train your MAML. In Proceedings of the International Conference on Learning Representations.
[4] Bengio, Samy, Bengio, Yoshua, Cloutier, Jocelyn, and Gecsei, Jan. (1992). On the optimization of a synaptic learning rule. In Optimality in Artificial and Biological Neural Networks. pp. 6–8.
[5] Boris Oreshkin, Pau Rodríguez López, and Alexandre Lacoste. (2018). TADAM: Task dependent adaptive metric for improved few-shot learning. In Advances in Neural Information Processing Systems.
[6] Brenden M Lake, Ruslan Salakhutdinov, and Joshua B Tenenbaum. (2015). Human-level concept learning through probabilistic program induction. Science, 350(6266):1332–1338.
[7] Chelsea Finn, Pieter Abbeel, and Sergey Levine. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the 34th International Conference on Machine Learning, JMLR.org, pp. 1126–1135.
[8] Deleu, T., Würfl, T., Samiei, M., Cohen, J. P., and Bengio, Y. (2019). Torchmeta: A Meta-Learning library for PyTorch.
[9] Edwards, Harrison and Storkey, Amos. (2017). Towards a neural statistician. International Conference on Learning Representations (ICLR).
[10] French, R. M. (1991). Using semi-distributed representations to overcome catastrophic forgetting in connectionist networks. In Proceedings of the 13th Annual Cognitive Science Society Conference, pp. 173–178. Erlbaum.
[11] Gail A Carpenter and Stephen Grossberg. (1987). A massively parallel architecture for a self-organizing neural pattern recognition machine. Computer vision, graphics, and image processing, 37(1): 54–115.
[12] Guneet Singh Dhillon, Pratik Chaudhari, Avinash Ravichandran, and Stefano Soatto. (2020). A baseline for few-shot image classification. In ICLR.
[13] H. Shin, J. K. Lee, J. Kim, and J. Kim. (2017). Continual learning with deep generative replay. In Advances in Neural Information Processing Systems (I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, eds.), pp. 2990–2999.
[14] He, X., Sygnowski, J., Galashov, A., Rusu, A. A., Teh, Y. W., and Pascanu, R. (2019). Task agnostic continual learning via meta learning. arXiv preprint arXiv:1906.05201.
[15] Jake Snell, Kevin Swersky, and Richard Zemel. (2017). Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems, pp. 4077–4087.
[16] James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, et al. (2017). Overcoming catastrophic forgetting in neural networks. Proceedings of the National Academy of Sciences, 201611835.
[17] Jason Yosinski, Jeff Clune, Yoshua Bengio, and Hod Lipson. (2014). How transferable are features in deep neural networks? Advances in Neural Information Processing Systems, pp. 3320–3328.
[18] Koch, Gregory. (2015). Siamese neural networks for one-shot image recognition. ICML Deep Learning Workshop.
[19] Lopez-Paz, D. and Ranzato, M. (2017). Gradient episodic memory for continual learning. In Advances in Neural Information Processing Systems.
[20] Kaiser, Ł., Nachum, O., Roy, A., and Bengio, S. (2017). Learning to remember rare events. International Conference on Learning Representations (ICLR).
[21] Munkhdalai, Tsendsuren and Yu, Hong. (2017). Meta networks. International Conference on Machine Learning (ICML).
[22] Oriol Vinyals, Charles Blundell, Tim Lillicrap, Daan Wierstra, et al. (2016). Matching networks for one-shot learning. Advances in Neural Information Processing Systems, pp. 3630–3638.
[23] R. Vuorio, D.-Y. Cho, D. Kim, and J. Kim. (2018). Meta continual learning. arXiv preprint arXiv:1806.06928.
[24] Sachin Ravi and Hugo Larochelle. (2017). Optimization as a model for few-shot learning. In International Conference on Learning Representations.
[25] Schmidhuber, Jürgen. (1987). Evolutionary principles in self-referential learning. (On learning how to learn: The meta-meta-... hook.) Diploma thesis, Institut f. Informatik, Tech. Univ. Munich.
[26] Spyros Gidaris and Nikos Komodakis. (2018). Dynamic few-shot visual learning without forgetting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4367–4375.
[27] Sumit Chopra, Raia Hadsell, Yann LeCun, et al. (2005). Learning a similarity metric discriminatively, with application to face verification. In CVPR, pp. 539–546.
[28] Sylvestre-Alvise Rebuffi, Alexander Kolesnikov, and Christoph H Lampert. (2017). iCaRL: Incremental classifier and representation learning. CVPR.
[29] Thrun, Sebastian. (1998). Lifelong learning algorithms. In Learning to Learn, pp. 181–209. Springer.
[30] Utgoff, P. E. (1986). Shift of bias for inductive concept learning. Machine Learning: An Artificial Intelligence Approach, 2:107–148.
[31] W. C. Abraham and A. Robins. (2005). Memory retention – the synaptic stability versus plasticity dilemma. Trends in Neurosciences, 28(2):73–78.
[32] M. Riemer, I. Cases, R. Ajemian, M. Liu, I. Rish, Y. Tu, and G. Tesauro. (2019). Learning to learn without forgetting by maximizing transfer and minimizing interference. International Conference on Learning Representations.
[33] Zhenguo Li, Fengwei Zhou, Fei Chen, and Hang Li. (2017). Meta-sgd: Learning to learn quickly for few shot learning. CoRR, abs/1707.09835.
Advisors: 陳以錚, 周惠文    Approval Date: 2020-7-16

For questions about this thesis, please contact the Promotion Services Division of the National Central University Library, TEL: (03)422-7151 ext. 57407, or contact us by e-mail. - Privacy Policy Statement