Abstract | This study presents an ensemble-of-ensembles predictive model for construction project delay, a global phenomenon that continues to strangle the construction sector despite considerable mitigation efforts. First, a review of the existing body of knowledge on the factors influencing construction project delay informed a survey of experts, which supplied the quantitative data. Second, data cleaning, feature selection and engineering, hyperparameter optimization, and algorithm evaluation were carried out on the quantitative data to train ensemble machine learning algorithms (EMLA): bagging, boosting, and Naive Bayes. These were in turn used to develop hyperparameter-optimized predictive models: Decision Tree, Random Forest, Bagging, Extremely Randomized Trees, Adaptive Boosting (CART), Gradient Boosting Machine, Extreme Gradient Boosting, Bernoulli Naive Bayes, Multinomial Naive Bayes, and Gaussian Naive Bayes. Finally, a multilayer, high-performance ensemble of ensembles (stacking) predictive model was developed to maximize the overall performance of the combined EMLA. Results from the evaluation metrics, namely accuracy score, confusion matrix, precision, recall, F1 score, and Area Under the Receiver Operating Characteristic Curve (ROC AUC), showed that ensemble algorithms can improve predictive performance relative to a single algorithm in predicting construction project delay. |
---|
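
The stacking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the use of scikit-learn, the synthetic data from `make_classification` (standing in for the expert survey responses), the binary "delayed" label, the particular base learners, their untuned settings, and the logistic regression meta-learner are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumption: synthetic data replaces the survey data; y = 1 means "delayed".
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# First layer: a subset of the model families named in the abstract.
# Hyperparameters are illustrative defaults, not optimized values.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
    ("ada", AdaBoostClassifier(n_estimators=50, random_state=0)),
    ("gbm", GradientBoostingClassifier(random_state=0)),
    ("gnb", GaussianNB()),
]

# Second layer: a meta-learner trained on cross-validated predictions
# of the base learners (assumption: logistic regression).
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
y_prob = stack.predict_proba(X_test)[:, 1]

# Three of the evaluation metrics listed in the abstract.
scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print(scores)
```

Training the meta-learner on out-of-fold predictions (`cv=5`) rather than on in-sample predictions keeps the second layer from simply rewarding base learners that overfit the training data.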