Abstract | Concept drift in data streams can be unpredictable and complex, and it can have a significant impact on predictive models. As the severity and complexity of the drift increase, so does the likelihood of a negative effect on the model. It is therefore critical to quantify the complexity of concept drift in data streams in order to predict pattern recognition performance and a model’s theoretical lifespan. This paper presents a calculation approach that provides a score for the overall complexity of concept drift in a dataset. Additionally, it aids in predicting a model’s theoretical lifespan while measuring performance degradation and fluctuations in the complexity of concept drift. Although concept drift detection has been researched extensively, concept drift complexity as a separate phenomenon remains under-researched. This distinction is essential, especially when the complexity value is used as a baseline for the overall complexity of drift in a dataset. |
---|