Master's/Doctoral Thesis 107522125 — Detail Record




Author: 涂珮榕 (Pei-Jung Tu)    Department: Department of Computer Science and Information Engineering
Thesis Title: The Design and Implementation of Machine Learning Hyperparameter Optimization with Experiment Tracking and Model Restoring
Related Theses
★ A C Language Extension for Conditional Event-Driven Programming
★ Fingerprint Liveness Detection Based on Wavelet Transform with Aggregated LPQ and LBP Features
★ An MLOps System Applying Automated Testing to Machine Learning Pipelines in Heterogeneous Environments
★ Designing a Board Game with Visual Thinking Tools and Programs as Single Steps to Assist Programming Learning
★ Static Analysis and Implementation for TOCTOU Vulnerabilities
★ A Domain-Specific Language for Drawing Wind Power Control Logic
★ Design and Implementation of Expressing Relations between Mathematical Formulas with Bidirectional Structures in Java
★ A Code Transformation Tool Supporting Modular Rule Authoring
★ A Static Type Checker for pandas DataFrame Based on Substitution Semantics
★ Design and Implementation of Automated Time-Complexity Analysis: Estimating Embedded-System Power Consumption at the Software Level
★ Implementation and Analysis of a Domain-Specific Language for Seismic Tomography
★ Reducing the Number of EEG Channels for Fatigue Detection via Feature Selection
★ A Mechanism Applying Paper-Based Computation and Digitization to Programming Learning to Visualize Procedural Thinking
★ Array Shape Error Detection Based on Abstract Syntax Trees
★ A Mechanism for Learning Recursive Programs via Functional-Programming Thinking Derived from Role Division in Cooperative Learning
★ An Ownership System with AST-Based Deep Copy and Flexible Aliasing to Solve Java Representation Exposure
Files: Full text viewable in the system after 2031-08-01
Abstract (Chinese): Developing a machine learning program means repeatedly experimenting with and adjusting a model, with the goal of training an ideal one. To make the model meet expectations as closely as possible, developers must run the hyperparameter-tuning process over and over. Existing approaches can adjust hyperparameters during training, but they still fall short. To compare results before and after tuning, the experiment process is commonly recorded in tables, yet tables make it hard to see the relations between experiments at a glance.
To address this, this study proposes a hyperparameter-tuning assistant tool, RETUNE. Using callback functions and a checkpoint mechanism, it lets users adjust optimizer hyperparameters during training, provides immediate visual feedback of model evaluation metrics, and automatically records the model configuration and related training data at each adjustment. It also provides a rollback feature that restores the model to a previous training state for model comparison. Finally, it presents the tuning history as a tree graph, helping users summarize past experiments, understand how hyperparameters affect training, and carry out optimizer hyperparameter tuning and model optimization more effectively.
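The save/restore workflow described above can be pictured as a tree of checkpoints: restoring to an earlier node and tuning again starts a new branch. The following is a minimal framework-agnostic sketch of that idea; `TuningNode` and `TuningHistory` are hypothetical names for illustration, not RETUNE's actual API.

```python
# Illustrative sketch: each tuning action becomes a node that stores the
# hyperparameters chosen and a checkpoint of the model at that moment.
# Restoring to an earlier node makes the next recorded step a new branch.

class TuningNode:
    def __init__(self, step, hyperparams, checkpoint, parent=None):
        self.step = step                  # training step at which tuning happened
        self.hyperparams = hyperparams    # e.g. {"lr": 0.01}
        self.checkpoint = checkpoint      # saved model/optimizer state (opaque here)
        self.parent = parent
        self.children = []

class TuningHistory:
    def __init__(self, root_hparams, root_ckpt):
        self.root = TuningNode(0, root_hparams, root_ckpt)
        self.current = self.root

    def record(self, step, hyperparams, checkpoint):
        # Append a new tuning event under the current node.
        node = TuningNode(step, hyperparams, checkpoint, parent=self.current)
        self.current.children.append(node)
        self.current = node
        return node

    def restore(self, node):
        # Jump back to an earlier node; the next record() starts a new branch.
        self.current = node
        return node.checkpoint

# Usage: tune twice, roll back to the start, tune differently.
hist = TuningHistory({"lr": 0.1}, "ckpt0")
hist.record(100, {"lr": 0.05}, "ckpt1")
hist.record(200, {"lr": 0.01}, "ckpt2")
hist.restore(hist.root)                   # roll back to the initial state
hist.record(100, {"lr": 0.2}, "ckpt3")    # the root now has two branches
```

Walking such a tree from the root is also what makes a tree-graph rendering of the tuning history straightforward: each branch is one line of experimentation.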
Abstract (English): Building a machine learning model is an experiment-driven process. Tuning hyperparameters iteratively to meet the acceptance criteria usually produces a tremendous number of trial models. Some related research supports tuning hyperparameters during training, but it still constrains how a model can be built. Moreover, most developers manage these artifacts in tabular form, which takes extra effort, and tables cannot reveal the correlation between experiments.
In this paper, we implement a tool, RETUNE, built on callback functions and checkpoint operations. It allows users to tune the optimizer's hyperparameters based on visualized feedback of the model metrics during training, and automatically extracts the model configuration from each experiment. With the ability to restore the model to a previous training state, users can compare multiple potential models. Finally, the tuning process is plotted as a tree graph that helps users review the historical experiments and understand the relation between hyperparameter settings and the training process, in order to carry out hyperparameter tuning and model optimization effectively.
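The core mechanism the abstract describes, changing an optimizer hyperparameter mid-training through a callback hook, can be sketched without any ML framework. The `SGD` and `HalveLROnPlateau` classes below are illustrative stand-ins, not the thesis's implementation; here the "user decision" is automated as a plateau rule purely to keep the example self-contained.

```python
# Minimal sketch of callback-driven hyperparameter tuning during training,
# on the toy objective f(w) = w**2 (gradient 2w).

class SGD:
    """Toy optimizer holding one tunable hyperparameter: the learning rate."""
    def __init__(self, lr):
        self.lr = lr
    def step(self, w, grad):
        return w - self.lr * grad

class HalveLROnPlateau:
    """Callback: halve the learning rate when the loss stops improving."""
    def __init__(self, optimizer, patience=1):
        self.opt = optimizer
        self.patience = patience
        self.best = float("inf")
        self.wait = 0
    def on_epoch_end(self, loss):
        if loss < self.best:
            self.best, self.wait = loss, 0
        else:
            self.wait += 1
            if self.wait >= self.patience:
                self.opt.lr /= 2      # hyperparameter changed mid-training
                self.wait = 0

# Training loop: lr = 1.0 makes w oscillate between 10 and -10 (a plateau),
# so the callback halves lr once, after which training converges to w = 0.
opt = SGD(lr=1.0)
cb = HalveLROnPlateau(opt, patience=1)
w = 10.0
for epoch in range(20):
    grad = 2 * w
    w = opt.step(w, grad)
    cb.on_epoch_end(loss=w * w)
    if w * w < 1e-12:
        break
```

In RETUNE itself, per the abstract, the adjustment is made interactively by the user from visualized metric feedback rather than by a fixed rule, and each change is checkpointed and recorded; the hook placement, however, is the same.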
Keywords
★ Machine learning
★ Hyperparameter tuning
★ Interactive machine learning
★ Experiment tracking
Table of Contents
1 Introduction
 1.1 The machine learning training process
 1.2 Hyperparameter tuning methods
  1.2.1 Manual hyperparameter tuning
  1.2.2 Automated hyperparameter search
 1.3 Tuning hyperparameters during training
  1.3.1 Hyperparameter schedules
  1.3.2 Interactive machine learning
 1.4 Model management
2 Motivation
 2.1 Motivating example
 2.2 Shortcomings of existing approaches
 2.3 Problem summary
3 Proposal
 3.1 Overall architecture
 3.2 Callback functions
 3.3 Automatic recording of model configuration
 3.4 Tuning optimizer hyperparameters during training
 3.5 Saving and restoring model state
 3.6 Presenting the tuning history as a tree graph
4 Implementation
 4.1 Implementation environment
 4.2 Implementation details
  4.2.1 plot_metric
  4.2.2 getter
  4.2.3 setter
  4.2.4 recorder
  4.2.5 plot_tree
 4.3 Use cases
5 Evaluation
 5.1 Revisiting the motivating problems
 5.2 Evaluating the effect on assisted development
  5.2.1 Training results when developing with RETUNE
  5.2.2 Comparing training results of learning-rate decay implemented with RETUNE
 5.3 Feature comparison with related implementations
  5.3.1 Tuning hyperparameters during training: BIDMach
  5.3.2 Visual assistance for hyperparameter tuning: HyperTuner
 5.4 Research limitations
6 Related work
 6.1 Interactive machine learning
 6.2 Visual assistance for hyperparameter search
 6.3 Model management
7 Conclusion
 7.1 Conclusion
 7.2 Future work
References
Advisor: 莊永裕    Date of Approval: 2021-07-21
