Using Machine Learning Methods to Predict Changes in Soil Water Content


Name	Chu Kang (康筑)	Department	Graduate Institute of Hydrological and Oceanic Sciences
Thesis title	Using Machine Learning Methods to Predict Changes in Soil Water Content (利用機器學習法預測土壤含水量的變化)
File	Full text available for viewing in the system after 2026-12-31
Abstract	The impacts of climate change are intensifying, and agriculture is among the sectors affected. Soil moisture directly affects plant growth and agricultural yield, and it also influences the stability of land ecosystems and the sustainable use of water resources. Improving the accuracy with which trends in soil water content can be predicted would therefore substantially support agricultural decision-making. This study uses the Random Forest method from machine learning to predict changes in soil water content at depths of 10 cm and 20 cm. A Random Forest model was built from historical soil moisture data from the Wufeng Agricultural Research Station in Taichung, combined with other environmental variables such as hourly rainfall and combinations of cumulative rainfall. During model training, better-performing parameter combinations, such as the number of training days and prediction days, were identified by manual exhaustive search. The prediction results for this site indicate that cumulative rainfall has the greatest influence on the model. Whether the entire data period is considered, only the rainy season is considered, or a fitted curve is used, predictions at depths of 10 cm and 20 cm are more accurate when the rainfall accumulation period is 6 to 8 days. The one exception is the 10 cm depth with rainy-season data only, for which no optimal accumulation period could be determined. For the entire data period, with the turning point of the decline taken as the optimal number of accumulation days, the MAPE (%) values are 25.18 and 5.13 at 10 cm and 20 cm, respectively. For the rainy season only, the MAPE (%) at 20 cm is 6.63. Using the fitted curve, under the condition that both the training and prediction errors are small, the prediction RMSE (%) values reach 2.37 and 2.03 at 10 cm and 20 cm, respectively.
Future research could add more meteorological variables, compare the Random Forest results with physically based hydrological models, or further explore applications during drought periods, in order to improve prediction accuracy and provide better data for agricultural decision-makers.
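The abstract relies on two computable ingredients: a rainfall total accumulated over a trailing window of several days, and the MAPE and RMSE scores used to compare predictions with observations. The following is a minimal Python sketch of both, not the thesis code. The function names, the assumption of 24 hourly records per day, and the sample values are all hypothetical, and the Random Forest fitting step itself is omitted.

```python
import math


def cumulative_rainfall(hourly_rain, window_days, steps_per_day=24):
    """Trailing rainfall total over `window_days` days at each time step.

    Assumes evenly spaced records (24 per day by default). Steps that do
    not yet have a full window behind them are returned as None.
    """
    window = window_days * steps_per_day
    totals = []
    running = 0.0
    for i, rain in enumerate(hourly_rain):
        running += rain
        if i >= window:
            # Drop the record that has just left the trailing window.
            running -= hourly_rain[i - window]
        totals.append(running if i >= window - 1 else None)
    return totals


def mape(observed, predicted):
    """Mean absolute percentage error in percent; observed values must be nonzero."""
    ratios = [abs((o - p) / o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(observed, predicted):
    """Root mean square error, in the same units as the inputs."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical data: two days of constant 1 mm/h rainfall.
    rain = [1.0] * 48
    one_day_total = cumulative_rainfall(rain, window_days=1)
    print(one_day_total[23], one_day_total[47])

    # Hypothetical observed vs. predicted soil water content values.
    observed = [10.0, 20.0]
    predicted = [9.0, 22.0]
    print(mape(observed, predicted), rmse(observed, predicted))
```

Under these assumptions, comparing accumulation periods as described in the abstract would amount to recomputing the rainfall feature for each candidate `window_days`, refitting the model, and comparing the resulting error scores.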
Keywords	★ Random Forest ★ Soil Water Content ★ Cumulative Rainfall
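Table of contents
Chapter 1. Introduction
  1.1 Research Motivation
  1.2 Research Objectives
  1.3 Research Workflow
Chapter 2. Literature Review
Chapter 3. Methodology
  3.1 Decision Trees
  3.2 Random Forest Model
    3.2.1 Fitting
    3.2.2 Model Evaluation Metrics
Chapter 4. Study Site, Data, and Parameters
  4.1 Study Site
  4.2 Raw Data from the Wufeng Site
  4.3 Data Transformation and Processing
  4.4 Random Forest Algorithm Parameter Selection
  4.5 Model Input Variables
Chapter 5. Results and Discussion
  5.1 Random Forest Simulation Results
  5.2 Effect of Model Parameters on Soil Water Content Simulation
  5.3 Effect of Model Input Variables on Soil Water Content Simulation
  5.4 Effect of Rainy-Season Data on Soil Water Content Simulation
  5.5 Model Underfitting and Overfitting
Chapter 6. Conclusions and Recommendations
References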
Advisors	Pei-Yuan Chen (陳沛芫), Chien-Chih Chen (陳建志)	Approval date	2024-08-05

Thesis/dissertation 110626006: detailed record