| Abstract | Machine learning models are increasingly used to predict flight processes, such as the early estimation of the flight block time, that is, the gate-to-gate time (the time from leaving the departure airport until arrival at the destination). These predictions support airline operations through early identification of possible disruptions. The performance of machine learning models tends to be described by metrics that represent their overall quality, which do not capture the uncertainty of individual predictions. However, modelling and considering the uncertainties of individual processes is fundamental when integrating these models into decision support tools. This is particularly relevant in the air transport domain due to the non-linearities in delay propagation (for flights and passengers) and cost, as some events can trigger a sharp increase in these, e.g. passengers missing their connections. This article presents a generic approach to describe the level of uncertainty of each prediction based on the combined use of two models. This methodology could be applied to many transport indicators and expert systems; in this article, the target variable used to illustrate the methodology is the block time of flights. The approach combines two models: the first produces an initial estimate of the target value (with a regression); this estimate is then corrected by the outcome of a second model, which characterises (with a probabilistic classifier) the error of the first model for this estimate. The outcome of the combined models is a probability distribution of the target indicator. The performance of the models generated in this manner is studied through parametric analysis using three metrics: accuracy, uncertainty and prediction interval coverage probability (PICP). The VIKOR methodology is used to assist and streamline the decision-making process of the end user by filtering and ranking Pareto alternatives across various modelling parameters. This approach is compared with alternatives such as assuming a Gaussian distribution of error for all estimations, quantile regression modelling and bootstrapping. |
|---|---|
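
The combination step described in the abstract can be illustrated with a minimal sketch. It is not the article's implementation: it assumes that the regression has already produced a point estimate, that the classifier outputs probabilities over discrete error bins, and that the error is defined as actual minus estimated value. All function names (`combine`, `prediction_interval`, `picp`) and the bin centres are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Distribution = List[Tuple[float, float]]  # (target value, probability) pairs


def combine(point_estimate: float,
            bin_centres: Sequence[float],
            bin_probs: Sequence[float]) -> Distribution:
    """Shift the error-bin centres by the point estimate.

    Assumption: error = actual - estimate, so actual = estimate + error.
    Probabilities are renormalised so that they sum to one.
    """
    if len(bin_centres) != len(bin_probs):
        raise ValueError("bin_centres and bin_probs must have the same length")
    total = sum(bin_probs)
    if total <= 0:
        raise ValueError("bin_probs must have a positive sum")
    return [(point_estimate + c, p / total)
            for c, p in zip(bin_centres, bin_probs)]


def prediction_interval(dist: Distribution,
                        coverage: float = 0.9) -> Tuple[float, float]:
    """Central interval read from the cumulative discrete distribution."""
    tail = (1.0 - coverage) / 2.0
    ordered = sorted(dist)
    lower = None
    upper = ordered[-1][0]  # fallback if rounding keeps the sum below 1 - tail
    cumulative = 0.0
    for value, prob in ordered:
        cumulative += prob
        if lower is None and cumulative >= tail:
            lower = value
        if cumulative >= 1.0 - tail - 1e-12:
            upper = value
            break
    if lower is None:
        lower = ordered[0][0]
    return lower, upper


def picp(intervals: Sequence[Tuple[float, float]],
         actuals: Sequence[float]) -> float:
    """Fraction of actual values that fall inside their prediction interval."""
    if not actuals or len(intervals) != len(actuals):
        raise ValueError("intervals and actuals must be non-empty and aligned")
    hits = sum(1 for (lo, hi), y in zip(intervals, actuals) if lo <= y <= hi)
    return hits / len(actuals)


if __name__ == "__main__":
    # Hypothetical flight: 120 min estimated block time, five 5-minute error bins.
    dist = combine(120.0, [-10, -5, 0, 5, 10], [0.1, 0.2, 0.4, 0.2, 0.1])
    print(dist)
    print(prediction_interval(dist, coverage=0.7))
```

In this sketch the width of the interval returned by `prediction_interval` plays the role of a per-prediction uncertainty, while `picp` checks over a set of flights how often the realised value falls inside it. How the article defines its bins and its uncertainty metric is not stated in the abstract, so both remain assumptions here.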