Stock Market Time Series Forecasting using Transformer Models
Abstract
This thesis examines the effectiveness of machine learning (ML) models in predicting stock prices, with a particular focus on ensemble methods that integrate multiple predictive models. Using historical stock price data from Yahoo Finance, it compares traditional models such as ARIMA and Linear Regression with more advanced models such as Long Short-Term Memory (LSTM) networks, Prophet, and Transformers. The goal is to determine which model gives the best stock price predictions in terms of accuracy, reliability, and computational efficiency. Predictive accuracy is measured with metrics such as Mean Absolute Error (MAE), Mean Squared Error (MSE), and Root Mean Squared Error (RMSE). The study also examines ensemble learning techniques, which combine the predictions of different models so that the strengths of one model offset the bias and variance of another. Although individual models offer particular advantages, ensemble methods are generally more accurate and robust across a wide range of market conditions. When stacked with Linear Regression, the Transformer model achieves superior prediction performance. Linear Regression and the ensemble stacked Transformer both perform well on the AAL, AAME, and AAPL stocks, with Mean Absolute Percentage Error (MAPE) values of 2.264, 2.2437, and 1.3081, respectively. These findings may inform the selection and development of closing-price prediction models for stock market investment and portfolio management. Optimizing ML models for stock price prediction requires hyperparameter tuning, so ARIMA, the Transformer, and the ensemble configurations are tuned to improve their predictive capabilities.
To maximize accuracy while minimizing computational cost, this thesis investigates several tuning strategies: grid search, Bayesian search, and random search. Properly adjusting key hyperparameters such as the learning rate, number of layers, and batch size improves prediction accuracy and makes the models both effective and efficient at capturing the complexity and volatility of real-world financial markets. Tuning also helps the models adapt to diverse and dynamic market conditions, which makes them more robust and more useful in real-world trading scenarios. This thesis contributes to the field of financial technology (fintech) by clarifying how various ML models can be used jointly to predict stock prices accurately. We found that model architectures should be chosen to suit the characteristics of the specific market, and that ensemble learning methods can improve the accuracy and reliability of financial time series forecasts. This research can therefore help investors, financial analysts, and policymakers make better-informed decisions.
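As a minimal sketch of the error metrics named in this abstract, the following Python function computes MAE, MSE, RMSE, and MAPE for a series of actual and predicted closing prices. The function name `forecast_errors` and the sample prices are hypothetical and are not taken from the thesis. MAPE is expressed in percent here, which is an assumption about how the reported values are scaled.

```python
import math


def forecast_errors(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when an actual value is zero; closing prices are positive.
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Hypothetical closing prices, for illustration only (not data from the thesis).
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [98.0, 103.0, 101.0, 107.0]
metrics = forecast_errors(actual, predicted)
```

Because MSE and RMSE square the errors, they penalize large misses more heavily than MAE, while MAPE is scale-free and so allows comparison across stocks with different price levels.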