    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/92894


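    Title: MLB Win-Loss Prediction Based on Time Series Models (基於時間序列模型之MLB勝負預測)
    Author: 張孝宇; Chang, Hsiao-Yu
    Contributor: Department of Mathematics
    Keywords: Major League Baseball; time series forecasting; feature selection
    Date: 2023-07-24
    Upload time: 2023-10-04 16:13:03 (UTC+8)
    Publisher: National Central University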
    Abstract: Predicting whether a baseball team wins or loses is a complex and dynamic problem affected by many factors, such as player performance, team strength, and the playing field. Earlier analyses of this problem did not use time series models, so this study applies that class of model to the data.
    The data come from the Baseball Reference website and comprise pitching and batting statistics for each team from 2011 to 2022. After preprocessing, the study uses the 2013 to 2022 seasons, excluding 2020. The data are then segmented by game so that historical game data can be used to predict the outcomes of future games. Finally, the study examines the training and test results for each team and discusses the factors that influence the predictions.
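The game-by-game segmentation described above can be pictured as a sliding window over one team's season. The following is only a minimal sketch under stated assumptions: the function name `make_windows`, the two-feature rows, and the toy results are hypothetical and are not taken from the thesis, which does not publish its preprocessing code in this record.

```python
from typing import List, Sequence, Tuple


def make_windows(
    features: Sequence[Sequence[float]],
    outcomes: Sequence[int],
    window: int = 6,
) -> List[Tuple[List[List[float]], int]]:
    """Pair each run of `window` consecutive games with the outcome of the next game.

    `features[i]` holds the statistics for game i and `outcomes[i]` is 1 for a
    win and 0 for a loss. Both names are illustrative assumptions.
    """
    if len(features) != len(outcomes):
        raise ValueError("features and outcomes must have the same length")
    samples = []
    for start in range(len(features) - window):
        # Games start .. start+window-1 form the history for one sample.
        history = [list(row) for row in features[start:start + window]]
        # The label is the result of the game that follows the history.
        target = outcomes[start + window]
        samples.append((history, target))
    return samples


# Hypothetical toy season: 8 games, 2 made-up features per game.
games = [[float(i), float(i) * 0.5] for i in range(8)]
results = [1, 0, 1, 1, 0, 0, 1, 0]
samples = make_windows(games, results, window=6)
```

Each sample holds six games of history and the result of the seventh, which matches the "previous six games predict the next game" data format named in the abstract. Windows would presumably be built per team and per season so that a history never spans the excluded 2020 season, but that detail is an assumption here.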
    Three time series models were trained and evaluated: the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU).
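To make the recurrent idea concrete, here is a minimal sketch of the forward pass of a simple (Elman) RNN that reads a window of games and outputs a win probability. It is not the thesis's implementation: the function name, the layer sizes, and the untrained weights are all hypothetical. LSTM and GRU follow the same loop over games but replace the plain tanh update with gated updates (input, forget, and output gates plus a cell state for LSTM; update and reset gates for GRU).

```python
import math
from typing import List, Sequence


def rnn_win_probability(
    window: Sequence[Sequence[float]],
    w_in: Sequence[Sequence[float]],
    w_rec: Sequence[Sequence[float]],
    bias: Sequence[float],
    w_out: Sequence[float],
    b_out: float,
) -> float:
    """Run a simple RNN over one window of games and return P(win) for the next game."""
    hidden: List[float] = [0.0] * len(bias)  # hidden state starts at zero
    for game in window:
        new_hidden = []
        for j in range(len(bias)):
            total = bias[j]
            # Contribution of the current game's features.
            total += sum(w_in[j][k] * game[k] for k in range(len(game)))
            # Contribution of the previous hidden state (the "memory").
            total += sum(w_rec[j][m] * hidden[m] for m in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
    # A sigmoid on the final hidden state gives a probability in (0, 1).
    logit = b_out + sum(w_out[j] * hidden[j] for j in range(len(hidden)))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical, untrained weights: 2 features per game, 3 hidden units.
w_in = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.1]]
w_rec = [[0.0, 0.1, 0.0], [0.1, 0.0, -0.1], [0.0, 0.0, 0.2]]
bias = [0.0, 0.0, 0.0]
w_out = [0.5, -0.5, 0.25]
window = [[0.3, 0.7], [0.5, 0.2], [0.1, 0.9], [0.6, 0.4], [0.8, 0.1], [0.2, 0.5]]
p_win = rnn_win_probability(window, w_in, w_rec, bias, w_out, 0.0)
```

In practice the weights would be learned by backpropagation through time using a deep learning library; the pure-Python loop only shows how information from earlier games is carried forward through the hidden state.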
    The results were compared across three dimensions: with or without feature selection, model architecture, and data format. The best configuration was an LSTM without feature selection that predicted the outcome of one game from the previous six games; it reached an accuracy of about 57% and an area under the ROC curve (AUC) of about 52%.
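The two reported metrics measure different things, which a small worked example can illustrate. The sketch below computes accuracy at a 0.5 threshold and the ROC AUC as the fraction of (win, loss) pairs that the model ranks correctly. The labels and probabilities are made up for illustration, and the function names are hypothetical rather than taken from the thesis.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], probs: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of games whose thresholded prediction matches the true outcome."""
    predictions = [1 if p >= threshold else 0 for p in probs]
    correct = sum(1 for y, y_hat in zip(labels, predictions) if y == y_hat)
    return correct / len(labels)


def roc_auc(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking (ties count as half)."""
    positives = [p for y, p in zip(labels, probs) if y == 1]
    negatives = [p for y, p in zip(labels, probs) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one win and one loss")
    ranked_correctly = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                ranked_correctly += 1.0
            elif pos == neg:
                ranked_correctly += 0.5
    return ranked_correctly / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = win) and predicted win probabilities.
labels = [1, 0, 1, 0]
probs = [0.9, 0.4, 0.35, 0.2]
acc = accuracy(labels, probs)
auc = roc_auc(labels, probs)
```

Accuracy depends on the chosen threshold, while AUC depends only on how the predicted probabilities rank wins against losses. An AUC of 50% corresponds to random ranking, so the reported AUC of about 52% is worth reading alongside the 57% accuracy.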
    Appears in collections: [Graduate Institute of Mathematics] Master's and Doctoral Theses

    Files in this item:

    File        Description  Size  Format  Views
    index.html               0Kb   HTML    44


    All items in NCUIR are protected by their original copyright.
