A Comparative Study on the Predictive Effect of PM<sub>2.5</sub> in Beijing Based on Tree Models

LI Zhisheng; LIANG Xiguan; JIN Yukai; ZHANG Huagang; OU Yaochun

doi:10.13205/j.hjgc.202106016

School of Civil and Transportation Engineering, Guangdong University of Technology, Guangzhou 510006, China

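Funding:

Natural Science Foundation of Guangdong Province (S2011040003755); Guangdong University of Technology Industry-University-Research Cooperation Project "Indoor Formaldehyde and PM_2.5 Control in Industrial Plants" (18HK0031)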

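Details

About the author:
LI Zhisheng (b. 1972), male, PhD, associate professor. Main research interests: built environment and indoor pollution control. chinalzs@sina.com

Corresponding author:
LI Zhisheng (b. 1972), male, PhD, associate professor. Main research interests: built environment and indoor pollution control. chinalzs@sina.com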

Publication history
- Received: 2020-07-22
- Published online: 2022-01-18

A COMPARATIVE STUDY ON PREDICTIVE EFFECT OF PM_2.5 IN BEIJING BASED ON TREE MODELS

School of Civil and Transportation Engineering, Guangdong University of Technology, Guangzhou 510006, China

Abstract

Abstract: In urban air quality forecasting, the PM_2.5 mass concentration, ρ(PM_2.5), is influenced by meteorological conditions and temporal cycles. Taking the whole of Beijing as the study area, this work analyzes pollutant concentration features, time features, and weather features, and uses hourly data from 33 air quality monitoring stations in 2019 to run PM_2.5 prediction experiments. A feature-based LightGBM (light gradient boosting machine) model for predicting PM_2.5 mass concentration was built and compared with three other PM_2.5 concentration prediction models: random forest (RF), gradient boosting decision tree (GBDT), and XGBoost. The results show that LightGBM had the highest prediction accuracy, followed by XGBoost, with RF the lowest. LightGBM reached an R² of 0.9614 and also trained faster and used less memory. It outperformed the other models on all five evaluation metrics, indicating good stability and application prospects for hourly PM_2.5 prediction.
Keywords:
- periodic characteristics
- machine learning
- influencing factors of PM_2.5
- LightGBM
- PM_2.5 prediction
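
The abstract refers to periodic time features and to scoring the models with R² and other evaluation metrics, without giving details on this page. The following is a minimal sketch of both ideas, not the authors' implementation. The function names, the sine/cosine encoding of the hour, and the choice of MAE, MSE, and RMSE alongside R² are assumptions made for illustration, and the sample values are invented rather than taken from the study.

```python
import math


def cyclic_hour_features(hour):
    """Map an hour of day (0-23) onto the unit circle.

    Assumed encoding: sine/cosine, so that 23:00 and 00:00 end up
    close together instead of 23 units apart.
    """
    angle = 2.0 * math.pi * (hour % 24) / 24.0
    return math.sin(angle), math.cos(angle)


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE, and R^2 for paired observations and predictions.

    These four metrics are an assumed selection; the abstract mentions
    five evaluation metrics but names only R^2.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R^2 is undefined when the observations have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


if __name__ == "__main__":
    # Illustrative values only (hypothetical hourly concentrations in ug/m^3).
    observed = [35.0, 42.0, 58.0, 61.0, 47.0]
    predicted = [33.0, 45.0, 55.0, 64.0, 46.0]
    print(cyclic_hour_features(23), cyclic_hour_features(0))
    print(regression_metrics(observed, predicted))
```

In a comparison like the one described, each candidate model (LightGBM, XGBoost, GBDT, RF) would be fitted on the same feature table and scored with the same metric function on held-out hours, so that differences in the scores reflect the models rather than the evaluation setup.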
