Ensemble of ensembles for fine particulate matter pollution prediction using big data analytics and IoT emission sensors : WestminsterResearch

Title	Ensemble of ensembles for fine particulate matter pollution prediction using big data analytics and IoT emission sensors
Type	Journal article
Authors	Christian Nnaemeka Egwim, Hafiz Alaka, Youlu Pan, Balogun, H., Saheed Ajayi, Abdul Hye and Oluwapelumi Oluwaseun Egunjobi
Abstract	Purpose: The study aims to develop a multilayer, highly effective ensemble of ensembles predictive model (stacking ensemble) using several hyperparameter-optimized ensemble machine learning (ML) methods (bagging and boosting ensembles) trained with high-volume data points retrieved from Internet of Things (IoT) emission sensors, together with time-corresponding meteorology and traffic data.
	Design/methodology/approach: First, the study tested the big data hypothesis theory by developing sample ensemble predictive models on different data sample sizes and comparing their results. Second, it developed a standalone model and several bagging and boosting ensemble models and compared their results. Finally, it used the best-performing bagging and boosting predictive models as input estimators to develop a novel multilayer, highly effective stacking ensemble predictive model.
	Findings: First, the results proved data size to be one of the main determinants of ensemble ML predictive power. Second, they proved that, compared with using a single algorithm, the cumulative result from ensemble ML algorithms is usually better in terms of predictive accuracy. Finally, they proved the stacking ensemble to be a better model for predicting PM2.5 concentration levels than the bagging and boosting ensemble models.
	Research limitations/implications: A limitation of this study is the trade-off between the performance of this novel model and the computational time required to train it. Whether this gap can be closed remains an open research question that future research should address. Future studies can also integrate this novel model into a personal air quality messaging system to inform the public of pollution levels and improve public access to air quality forecasts.
	Practical implications: When integrated into an air pollution monitoring system, the outcome of this study will help the public proactively identify highly polluted areas, potentially reducing pollution-associated or pollution-triggered COVID-19 (and other lung disease) deaths, complications and transmission by encouraging avoidance behavior and supporting informed lockdown decisions by government bodies.
	Originality/value: This study fills a gap in the literature by providing a justification for selecting appropriate ensemble ML algorithms for PM2.5 concentration level predictive modeling. Second, it contributes to the big data hypothesis theory, which suggests that data size is one of the most important factors in ML predictive capability. Third, it supports the premise that, when using ensemble ML algorithms, the cumulative output is usually better in terms of predictive accuracy than that of a single algorithm. Finally, it develops a novel multilayer, high-performing, hyperparameter-optimized ensemble of ensembles predictive model that can accurately predict PM2.5 concentration levels with improved model interpretability and enhanced generalizability, and it provides a novel databank of historic pollution data from IoT emission sensors that can be purchased for research, consultancy and policymaking.
Keywords	Air pollution
	big data analytics
	ensemble of ensembles
	IoT
	machine learning
Journal	Journal of Engineering, Design and Technology
Journal citation	23 (2), pp. 640-665
ISSN	1726-0531
Year	2025
Publisher	Emerald Publishing Limited
Digital Object Identifier (DOI)	https://doi.org/10.1108/JEDT-07-2022-0379
Published online	07 Nov 2023
Published in print	2025
Supplemental file	File Access Level Controlled (open metadata, closed files)
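
The stacking approach described in the abstract (bagging and boosting ensembles used as input estimators to a stacked model) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn, uses a random forest as the bagging-style ensemble and gradient boosting as the boosting ensemble, picks a ridge regression as the meta-learner, and runs on synthetic data rather than the study's sensor, meteorology and traffic data. All names, sizes and hyperparameters are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import RidgeCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real inputs (assumption): five numeric features
# and a noisy, partly non-linear target playing the role of PM2.5.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 5))
y = (
    10.0
    + 3.0 * X[:, 0]
    - 2.0 * X[:, 1]
    + X[:, 2] * X[:, 3]
    + rng.normal(scale=0.5, size=600)
)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Layer 1: one bagging-style ensemble and one boosting ensemble.
base_estimators = [
    ("bagging_rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("boosting_gb", GradientBoostingRegressor(random_state=0)),
]

# Layer 2: a meta-learner trained on cross-validated predictions
# of the layer-1 models (the "ensemble of ensembles").
stack = StackingRegressor(
    estimators=base_estimators,
    final_estimator=RidgeCV(),
    cv=5,
)

# Fit each model on the same split and compare held-out R^2.
scores = {}
for name, model in base_estimators + [("stacking", stack)]:
    model.fit(X_train, y_train)
    scores[name] = float(r2_score(y_test, model.predict(X_test)))

for name, r2 in scores.items():
    print(f"{name}: R^2 = {r2:.3f}")
```

On real data the base estimators would first be tuned (for example with a grid or randomized search), and the extra cross-validated fits inside the stacking step are one source of the training-time cost the abstract lists as a limitation.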

Related outputs

Boruta-grid-search least square support vector machine for NO2 pollution prediction using big data analytics and IoT emission sensors
Balogun, H., Alaka, H. and Egwim, C.N. 2025. Boruta-grid-search least square support vector machine for NO2 pollution prediction using big data analytics and IoT emission sensors. Applied Computing and Informatics. 21 (1/2), pp. 101-113. https://doi.org/10.1108/aci-04-2021-0092

Artificial intelligence for deconstruction: Current state, challenges, and opportunities
Balogun, H., Alaka, H., Demir, E., Egwim, C.N., Olu-Ajayi, R., Sulaimon, I. and Oseghale, R. 2024. Artificial intelligence for deconstruction: Current state, challenges, and opportunities. Automation in Construction. 166 105641. https://doi.org/10.1016/j.autcon.2024.105641

Critical factors for assessing building deconstructability: Exploratory and confirmatory factor analysis
Habeeb Balogun, Hafiz Alaka, Saheed Ajayi and Christian Nnaemeka Egwim 2024. Critical factors for assessing building deconstructability: Exploratory and confirmatory factor analysis. Cleaner Engineering and Technology. 21 100790. https://doi.org/10.1016/j.clet.2024.100790

Artificial Intelligence in the Construction Industry: A Systematic Review of the Entire Construction Value Chain Lifecycle
Christian Nnaemeka Egwim, Hafiz Alaka, Eren Demir, Habeeb Balogun, Razak Olu-Ajayi, Ismail Sulaimon, Godoyon Wusu, Wasiu Yusuf and Adegoke A. Muideen 2024. Artificial Intelligence in the Construction Industry: A Systematic Review of the Entire Construction Value Chain Lifecycle. Energies. 17 (1) 182. https://doi.org/10.3390/en17010182

Exploratory Analysis of Machine Learning Methods for Total Organic Carbon Prediction Using Well-Log Data of Kolmani Field
Longman, Fodio S., Balogun, Habeeb, Ojulari, Rasheed O., Olatomiwa, Olaniyi J., Balarabe, Husaini J., Edeh, Ifeanyichukwu S. and Joshua, Olabisi O. 2024. Exploratory Analysis of Machine Learning Methods for Total Organic Carbon Prediction Using Well-Log Data of Kolmani Field. IEEE 14th International Conference on Pattern Recognition Systems (ICPRS). London, United Kingdom 15 - 18 Jul 2024 IEEE. https://doi.org/10.1109/icprs62101.2024.10677822

Extraction of underlying factors causing construction projects delay in Nigeria
Egwim, C.N., Alaka, H., Toriola-Coker, L.O., Balogun, H., Ajayi, S. and Oseghale, R. 2023. Extraction of underlying factors causing construction projects delay in Nigeria. Journal of Engineering, Design and Technology. 21 (5), pp. 1323-1342. https://doi.org/10.1108/jedt-04-2021-0211

Systematic review of drivers influencing building deconstructability: Towards a construct-based conceptual framework
Balogun, H., Alaka, H., Egwim, C.N. and Ajayi, S. 2023. Systematic review of drivers influencing building deconstructability: Towards a construct-based conceptual framework. Waste Management and Research. 41 (3), pp. 512-530. https://doi.org/10.1177/0734242x221124078

Building energy performance prediction: A reliability analysis and evaluation of feature selection methods
Olu-Ajayi, R., Alaka, H., Sulaimon, I., Balogun, H., Wusu, G., Yusuf, W. and Adegoke, M. 2023. Building energy performance prediction: A reliability analysis and evaluation of feature selection methods. Expert Systems with Applications. 225 120109. https://doi.org/10.1016/j.eswa.2023.120109

Applied artificial intelligence for predicting construction projects delay
Christian Nnaemeka Egwim, Hafiz Alaka, Luqman Olalekan Toriola-Coker, Habeeb Balogun and Funlade Sunmola 2021. Applied artificial intelligence for predicting construction projects delay. Machine Learning with Applications. 6 100166. https://doi.org/10.1016/j.mlwa.2021.100166

Permalink - https://westminsterresearch.westminster.ac.uk/item/wq3w7/ensemble-of-ensembles-for-fine-particulate-matter-pollution-prediction-using-big-data-analytics-and-iot-emission-sensors
