Modeling Temporal Persistence in Daily Pan Evaporation Using Interpretable Machine Learning
DOI:
https://doi.org/10.46488/
Keywords:
Pan evaporation, Temporal memory, Evaporation persistence, Machine learning, CatBoost
Abstract
Accurate prediction of daily pan evaporation (Ep) is critical for irrigation management and water resource planning, yet it remains challenging because of strong temporal variability and persistence. Most available models rely mainly on contemporaneous meteorological variables and do not account for the inherent memory of evaporation. This paper systematically explores the impact of temporal memory on daily Ep by explicitly separating evaporation persistence from meteorological memory within a machine learning framework. A Categorical Boosting (CatBoost) model was applied to multi-year data from a semi-arid to sub-humid area in western-central India. The model used Ep and meteorological variables lagged by 1, 3, 7, and 14 days. Model performance was evaluated using the coefficient of determination (R²), root mean square error (RMSE), and mean absolute error (MAE), with Random Forest and XGBoost used for benchmarking. Feature importance and SHAP analysis provided interpretability. The findings indicate that adding the 1-day lag of pan evaporation significantly enhances predictive performance, with RMSE dropping to 0.83 mm. This improvement indicates strong short-term persistence in day-to-day evaporation. Longer evaporation lags degrade performance, whereas lagged meteorological variables provide a modest additional improvement beyond evaporation memory. CatBoost shows the most balanced performance among the evaluated models. Overall, the findings confirm that short-term evaporation persistence dominates temporal dependence in daily Ep, with meteorological memory playing a secondary role, offering a physically interpretable and efficient framework for data-driven evaporation modeling.
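The lag construction and the three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation. The function names (`make_lagged_rows`, `rmse`, `mae`, `r2`) and the synthetic Ep series are assumptions introduced only for illustration, and no CatBoost model is trained here. The 1-day lag is used directly as a naive persistence forecast, which is one simple way to gauge short-term persistence before fitting any learner.

```python
import math

LAGS = (1, 3, 7, 14)  # lag windows in days, as listed in the abstract


def make_lagged_rows(series, lags=LAGS):
    """Build (features, targets): each feature row holds series[t - k] for k in lags."""
    max_lag = max(lags)
    features, targets = [], []
    # Start at max_lag so that every requested lag is available for day t.
    for t in range(max_lag, len(series)):
        features.append([series[t - k] for k in lags])
        targets.append(series[t])
    return features, targets


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic daily Ep values in mm, for illustration only (not the study's data).
ep = [4.0 + 2.0 * math.sin(2 * math.pi * d / 60) for d in range(120)]

X, y = make_lagged_rows(ep)

# Naive persistence forecast (an assumption of this sketch): Ep(t) = Ep(t - 1).
persistence_pred = [row[0] for row in X]

print("RMSE:", rmse(y, persistence_pred))
print("MAE:", mae(y, persistence_pred))
print("R2:", r2(y, persistence_pred))
```

In a fuller workflow, the rows of `X` would be extended with lagged meteorological variables and passed to a gradient-boosting regressor, with the same three metrics computed on held-out data.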