Making Use of the Factor Zoo: An unpretentious attempt to predict asset returns using machine learning methods.
Att nyttja "the Factor Zoo": Ett opretentiöst försök för att prediktera avkastning på tillgångar genom maskininlärning.
Factor modeling for the purpose of estimating asset returns is a dynamic and ever-changing subject within finance. Recent literature has presented over 300 different factors that appear to be significant in predicting asset returns, a phenomenon that has been dubbed "the Factor Zoo". At first the subject was marked by optimism, but researchers with a more pessimistic view soon entered the debate, criticizing earlier studies mainly for their choice of methodology. One proposed solution to the issue of imperfect models is to use machine learning methods, as they can handle the extensive nature of the Factor Zoo. Feature selection is the idea of reducing the dimensionality of a data set in order to increase performance and interpretability and to lower computation time. This thesis evaluates an ensemble of feature selection models on a financial data set comprising 30 firms from the S&P 500 index and 123 firm characteristics over a ten-year period from 2011-07 to 2021-07. Several established models that perform feature selection were chosen, and comparing their performance gave confidence in the selection of the final model. The proposed final model was the Lasso, which outperformed both the other regression models and the benchmark, the Fama-French five-factor model. An analysis of the firm characteristics selected by the Lasso showed that features such as 'relative strength index', 'price/sales' and 'price rel 52 week high' were highly significant regardless of the portfolio chosen.
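To illustrate how the Lasso performs feature selection, the following is a minimal sketch and not the implementation used in the thesis. It fits a Lasso by cyclic coordinate descent on synthetic data in which only two of ten hypothetical characteristics drive the simulated return. The data, the coefficient values, the penalty `alpha = 0.1` and all function names are assumptions made purely for illustration.

```python
import random


def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values in [-alpha, alpha] become exactly 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    # Mean squared value of each feature column (the per-coordinate scaling)
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue  # a constant-zero column carries no information
            # Covariance of feature j with the residual, adding back its own contribution
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / z[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


# Synthetic example (assumption): 200 observations of 10 hypothetical characteristics,
# of which only characteristics 0 and 2 influence the simulated return.
rng = random.Random(0)
n_obs, n_features = 200, 10
X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_obs)]
true_w = [0.8, 0.0, -0.5] + [0.0] * (n_features - 3)
y = [sum(c * x for c, x in zip(true_w, row)) + rng.gauss(0.0, 0.1) for row in X]

w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("selected characteristic indices:", selected)
print("fitted coefficients:", [round(wj, 3) for wj in w])
```

The L1 penalty sets the coefficients of uninformative characteristics to exactly zero, which is what lets the Lasso act as a feature selector rather than only a regularised regression. In practice the characteristics would be standardised first and `alpha` chosen by cross-validation.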