SELECTING TRANSFER CONDITIONS BASED ON XGBOOST TO IMPROVE WATER QUALITY PREDICTION CAPACITY OF THE LSTM MODEL
Abstract: Accurate prediction of changes in river water quality is an important basis for watershed water environment management. The data-driven deep learning models in common use need large amounts of monitoring data for training, but many rivers lack such data, so the accuracy required for water quality prediction cannot be met. This study proposes a method for selecting transfer conditions based on the extreme gradient boosting (XGBoost) model. Using a dataset of water quality parameters (water temperature, pH, dissolved oxygen, total nitrogen) from automatic river monitoring stations across China, we built a library of long short-term memory (LSTM) neural network models and improved the predictive ability of the LSTM model by optimizing the transfer learning conditions. The results show that: 1) prediction accuracy differs greatly among models trained with different source domains and transfer modes; 2) when the optimal transfer conditions are selected with the XGBoost model, the prediction error (RMSE) of the transfer model falls by 9.6% to 28.9%, a clear improvement in the prediction accuracy of the LSTM model; 3) choosing an appropriate transfer mode, using source domain data with similar properties, and increasing the amount of training data all improve the prediction accuracy of the transfer model. The modeling approach can be applied to water quality prediction for rivers with little measured data and provides technical support for refined watershed water environment management.
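As a rough illustration of the selection step and of how a relative RMSE reduction such as the 9.6% to 28.9% reported above is computed, the following Python sketch may help. It is not the authors' implementation: the station names, transfer-mode labels, and RMSE scores are hypothetical placeholders, and the scores simply stand in for whatever an XGBoost model would predict for each candidate transfer condition.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(baseline_rmse, new_rmse):
    """Percentage by which new_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


def select_best_condition(predicted_rmse_by_condition):
    """Return the (source_domain, transfer_mode) key with the lowest score."""
    return min(predicted_rmse_by_condition, key=predicted_rmse_by_condition.get)


# Hypothetical candidate transfer conditions and their predicted RMSE values.
# In the setting described above these scores would come from a trained
# XGBoost model; here they are made-up numbers for illustration only.
candidate_scores = {
    ("station_A", "fine_tune_all"): 0.52,
    ("station_A", "freeze_lstm"): 0.47,
    ("station_B", "fine_tune_all"): 0.61,
}

best = select_best_condition(candidate_scores)
print("selected condition:", best)
```

The relative reduction is always taken against a chosen baseline error, so the same transfer model can show a different percentage depending on which baseline is used.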
Key words:
- water quality prediction
- LSTM model
- transfer learning
- XGBoost model