A COMPARATIVE STUDY ON PREDICTIVE EFFECT OF PM2.5 IN BEIJING BASED ON TREE MODELS
Abstract: In urban air quality forecasting, the PM2.5 mass concentration, ρ(PM2.5), is influenced by meteorological conditions and temporal cycles. This study takes the whole of Beijing as the experimental area and analyses pollutant concentration features, temporal features, and weather features. Using hourly data from 33 air quality monitoring stations in 2019, a feature-based LightGBM (light gradient boosting machine) model for predicting PM2.5 mass concentration was built and compared with three other PM2.5 prediction models: random forest (RF), gradient boosting decision tree (GBDT), and XGBoost. The results show that LightGBM had the highest prediction accuracy for PM2.5 concentration, followed by XGBoost, with RF the lowest. The LightGBM model reached an R² of 0.9614, trained quickly, and used little memory. It outperformed the other models on all five evaluation metrics, indicating good stability and application prospects for hourly PM2.5 prediction.
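As a rough illustration of how a score such as the reported R² is computed, the sketch below evaluates one set of predictions against observations. It is not the authors' code. The abstract names only R² among its five evaluation metrics, so the inclusion of RMSE and MAE here is an assumption, and the observed and predicted values are made-up numbers rather than data from the study.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical hourly PM2.5 values in micrograms per cubic metre.
# These numbers are invented for illustration only.
observed = [35.0, 42.0, 58.0, 75.0, 60.0]
predicted = [33.0, 45.0, 55.0, 78.0, 62.0]

scores = {
    "R2": r2_score(observed, predicted),
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
}

for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Applying the same functions to each model's predictions on a shared held-out set would give a like-for-like comparison of the kind the abstract describes.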
Key words:
- periodic characteristics
- machine learning
- influencing factors of PM2.5
- LightGBM
- PM2.5 prediction