Abstract | In this paper, we present a Generic Predictive Computational Model (GPCM) and apply it in a use case: forecasting the FTSE 100 index. The approach mines heterogeneous data using semantic methods (ontology), graph-based methods (knowledge graphs, graph databases) and advanced Machine Learning methods. The main focus of our research is data pre-processing aimed at a more efficient selection of input features. Each cycle of the GPCM pipeline propagates the (initially raw) data to a Graph Database structured by an ontology, and a feedback loop from the Machine Learning Engine regularly updates the features’ weights in the Graph Database. Graph Database queries output the most valuable features, which in turn serve as the input for the Machine Learning-based prediction. The result of this prediction is fed back to the Graph Database to update the weights. We report on practical experiments evaluating the effectiveness of the GPCM in forecasting the FTSE 100 index. The underlying dataset contains multiple parameters related to predicting time-series data, a task for which Long Short-Term Memory (LSTM) is known to be one of the most efficient machine learning methods. The most challenging task has been to overcome the known restriction of LSTM, which is capable of analysing only one input parameter. We solved this problem by combining several parallel LSTMs, a Concatenation Unit, which merges the LSTMs’ outputs into a time-series matrix, and a Linear Regression Unit, which produces the final result. |
---|
Keywords | semantic data analysis, graph database, ontology, stock analysis, computational model, neural network, linear regression, FTSE 100, LSTM, Protégé, Neo4j, Python |
---|
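
The prediction stage summarised in the abstract (several parallel LSTMs, a Concatenation Unit and a Linear Regression Unit) can be illustrated with a minimal forward-pass sketch. It is not the implementation used in the experiments. The class `ScalarLSTM`, the one-dimensional hidden state, the random untrained weights, the made-up feature values and the choice to regress on the last row of the matrix are all assumptions introduced here for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ScalarLSTM:
    """Single-input LSTM with a one-dimensional hidden state (illustrative only)."""

    GATES = ("input", "forget", "output", "candidate")

    def __init__(self, rng):
        # Each gate has an input weight, a recurrent weight and a bias.
        # Weights are random and untrained: this sketch shows data flow only.
        self.params = {
            g: tuple(rng.uniform(-0.5, 0.5) for _ in range(3)) for g in self.GATES
        }

    def _pre(self, gate, x, h):
        w_x, w_h, b = self.params[gate]
        return w_x * x + w_h * h + b

    def run(self, series):
        """Return the hidden state at every time step of one feature series."""
        h, c, outputs = 0.0, 0.0, []
        for x in series:
            i = sigmoid(self._pre("input", x, h))
            f = sigmoid(self._pre("forget", x, h))
            o = sigmoid(self._pre("output", x, h))
            g = math.tanh(self._pre("candidate", x, h))
            c = f * c + i * g          # cell state update
            h = o * math.tanh(c)       # hidden state (the LSTM output)
            outputs.append(h)
        return outputs


def concatenate(per_feature_outputs):
    """Merge per-feature output sequences into a matrix (rows = time steps)."""
    return [list(row) for row in zip(*per_feature_outputs)]


def linear_regression_unit(row, coefficients, intercept):
    """Weighted sum of one matrix row plus an intercept."""
    return intercept + sum(w * v for w, v in zip(coefficients, row))


def forecast(features, lstms, coefficients, intercept):
    """Run one LSTM per feature, concatenate, then regress on the last row."""
    outputs = [lstm.run(series) for lstm, series in zip(lstms, features)]
    matrix = concatenate(outputs)
    return matrix, linear_regression_unit(matrix[-1], coefficients, intercept)


# Three hypothetical, already-normalised feature series of five time steps each.
rng = random.Random(0)
features = [
    [0.10, 0.20, 0.15, 0.30, 0.25],
    [0.50, 0.45, 0.55, 0.60, 0.58],
    [-0.20, -0.10, 0.00, 0.05, 0.10],
]
lstms = [ScalarLSTM(rng) for _ in features]
matrix, prediction = forecast(features, lstms, [0.4, 0.3, 0.3], 0.0)
print(len(matrix), len(matrix[0]), prediction)
```

Each feature selected from the Graph Database gets its own LSTM, so each network sees one input parameter only. The concatenated matrix has one row per time step and one column per feature. In a trained system the LSTM weights and the regression coefficients would be fitted to data rather than set by hand.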