Thesis/Dissertation 110522009: Detailed Record




Name: 黃宗泓 (Zong-Hong Huang)   Department: Computer Science and Information Engineering
Thesis title: 應用自動化測試於異質環境機器學習管道之 MLOps 系統
(An MLOps system that applies automated testing on ML pipelines in heterogeneous environments.)
Files: full text is not available through the system (never open to the public)
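Abstract (Chinese) Applying machine learning solutions in real-world environments poses various challenges.
MLOps (Machine Learning Operations) practices can help operations staff deploy
machine learning solutions to production quickly. In practice, however, existing MLOps platforms
remain incomplete. Building a pipeline requires packaging every machine learning module as a
container, so modules that cannot be containerized are restricted in how they can be used. Existing
platforms also provide no testing functionality for checking the correctness of a pipeline, so
end-to-end testing must be done manually.
To address these problems, we propose a new MLOps platform. The platform can
build pipeline components in heterogeneous environments, allowing more machine learning methods
to be built into pipelines through the platform. In addition, it provides automated testing at
different levels to test the correctness of machine learning pipelines.
This thesis compares our platform with existing platforms to show its advantages in accelerating
the deployment of machine learning pipelines. We also use a real machine learning deployment case
to illustrate the benefits that the platform's features bring during deployment.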
Abstract (English) When applying ML (Machine Learning) solutions in production environments,
we face various challenges. The practices recommended by MLOps
(Machine Learning Operations) can help operators deploy
ML solutions to production rapidly. However, existing MLOps platforms are
still incomplete. Building a pipeline requires containerizing
each ML module, so modules that cannot be containerized
are limited in how they can be used. In addition, current platforms do not provide
testing functionality to verify the correctness of pipelines, so
end-to-end testing must be done manually. We propose a new MLOps platform to address
these issues. The platform allows pipeline components to be built in
heterogeneous environments, so that more ML methods can be built as pipelines. Furthermore, it offers
automated testing at different levels to verify the correctness of ML pipelines. In
this thesis, we illustrate the advantages of our platform in accelerating
the deployment of ML pipelines by comparing it with existing platforms.
We also demonstrate the benefits that the platform brings to the
deployment process through a practical case study.
Keywords (Chinese) ★ MLOps
★ Machine learning deployment
★ Machine learning pipeline
★ Pipeline testing
Keywords (English) ★ MLOps
★ ML Deployment
★ ML pipeline
★ Pipeline test
Table of contents Abstract (Chinese)
Abstract (English)
Contents
List of Figures
List of Tables
1. Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
1-4 Thesis Organization
2. Related Work
2-1 Background Knowledge
2-1-1 Docker Swarm
2-1-2 GitLab
2-1-3 MinIO
2-1-4 MLflow
2-1-5 Deep Lake
2-1-6 Harbor
2-2 MLOps Related Research
2-2-1 MLOps Platforms
2-2-2 Deploying Machine Learning Pipelines
3. System Design and Workflow Design
3-1 System Architecture
3-2 Building and Running Machine Learning Pipelines
3-3 Testing Machine Learning Pipelines
3-4 System Usage Workflow Design
3-4-1 Developer Workflow
3-4-2 Operations Staff Workflow
3-4-3 Machine Learning User Workflow
4. Discussion
4-1 Feature Comparison
4-2 Case Discussion
5. Summary
5-1 Conclusion
5-2 Future Work
References
Appendix A Test Cases
Advisors: 梁德容 (Deron Liang), 莊永裕 (YungYu Zhuang)   Approval date: 2023-7-31

For questions about this thesis, please contact the Outreach Services Section of the National Central University Library, TEL: (03)422-7151 ext. 57407, or by e-mail.