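Name: 李承祐 (Cheng-Yu Lee)
Department: Department of Business Administration
Thesis title: Predicting Major League Baseball Player Salaries Through Machine Learning
Abstract (Chinese) Major League Baseball (MLB) is one of the most widely followed sports in the world [...]
[...] methods, such as extreme gradient boosting (XGBoost), support vector machine (SVM), and K-nearest neighbors (KNN), to build classification [...]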
[...] basis.
Abstract (English) Major League Baseball is one of the most watched sports in the world. In recent years, in
addition to the performance of a player and his team, a player's salary has also become a
focus of fan discussion, prompting fans to examine whether a player's
performance really matches his worth.
Therefore, how to evaluate players' salaries has always been a hot topic. The most direct basis
is a player's performance in games. Beyond players' statistical performance on
the field, many scholars have proposed other variables that may affect player salaries.
There have been many studies of Major League Baseball salaries, and many
factors influence salary. Some scholars even divide players into pitchers and hitters for [...]
Therefore, this study divides players' annual salary increases into
intervals and uses machine learning methods, namely extreme gradient boosting (XGBoost), support vector
machine (SVM), and K-nearest neighbors (KNN), to build classification prediction models. In addition
to building models that forecast player salary increases, we use extreme gradient boosting to validate the new
variables proposed in this study. The results show that the new variables can serve as a basis for
predicting salary.
Keywords (Chinese) ★ Major League Baseball
★ Extreme gradient boosting
★ Support vector machine
★ K-nearest neighbors
★ Salary prediction
★ Classification
Keywords (English) ★ MLB
★ XGBoost
★ Predicting Salaries
★ Classification
Table of Contents
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1-1 Research Background
1-2 Research Motivation
1-3 Research Objectives
1-4 Thesis Structure
Chapter 2 Literature Review
2-1 Literature Review of MLB Salary Variables
Chapter 3 Research Methods
3-1 Research Design
3-2 Classification Models
3-2-1 Extreme Gradient Boosting (XGBoost)
3-2-2 Support Vector Machine (SVM)
3-2-3 K-Nearest Neighbors (KNN)
Chapter 4 Research Analysis
4-1 Overview of Major League Baseball
4-2 Data Sources and Dataset
4-3 Data Preprocessing
4-4 Result Validation
4-4-1 XGBoost Model Prediction Results
4-4-2 SVM Model Prediction Results
4-4-3 KNN Model Prediction Results
4-5 Comparison of Accuracy
Chapter 5 Conclusions and Recommendations
5-1 Research Conclusions
5-2 Research Limitations and Recommendations
Advisor: 許秉瑜 (Ping-Yu Hsu) Review date: 2022-6-29
