Efficient training of interpretable, non-linear regression models
Abstract
Regression, the process of estimating functions from data, comes in many flavors. One of the most commonly used regression models is linear regression, which is computationally efficient and easy to interpret but lacks flexibility. Non-linear regression methods, such as kernel regression and artificial neural networks, tend to be much more flexible, but also harder to interpret and more difficult, and computationally expensive, to train.
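As a purely illustrative sketch (not taken from the thesis), the following Python snippet contrasts the two model classes mentioned above: closed-form linear regression versus Gaussian kernel ridge regression on data from a non-linear function. The bandwidth and penalty values are arbitrary, assumed choices.

```python
# Minimal sketch: linear regression vs. Gaussian kernel ridge regression
# on data generated from a non-linear function (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 60)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)    # non-linear ground truth plus noise

# Linear regression: ordinary least squares on the design matrix [1, x].
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
y_lin = X @ beta

# Gaussian kernel ridge regression: alpha = (K + lambda*I)^{-1} y.
sigma, lam = 0.5, 1e-2                                # assumed bandwidth and ridge penalty
K = np.exp(-(x[:, None] - x[None, :])**2 / (2 * sigma**2))
alpha = np.linalg.solve(K + lam * np.eye(x.size), y)
y_krr = K @ alpha

print("linear fit MSE vs. truth:", np.mean((y_lin - np.sin(x))**2))
print("kernel fit MSE vs. truth:", np.mean((y_krr - np.sin(x))**2))
```

The kernel model recovers the sine shape that the linear model cannot, at the cost of a matrix solve and two hyperparameters that must be selected.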
In the five papers of this thesis, different techniques for constructing regression models that combine flexibility with interpretability and computational efficiency are investigated. In Papers I and II, sparsely regularized neural networks are used to obtain flexible, yet interpretable, models for additive modeling (Paper I) and dimensionality reduction (Paper II). Sparse regression, in the form of the elastic net, is also covered in Paper III, where the focus is on increased computational efficiency through replacing explicit regularization with iterative optimization and early stopping. In Paper IV, inspired by Jacobian regularization, we propose a computationally efficient method for bandwidth selection in kernel regression with the Gaussian kernel. Kernel regression is also the topic of Paper V, where we revisit efficient regularization through early stopping by solving kernel regression iteratively. Using an iterative algorithm for kernel regression also enables changing the kernel during training, which we use to obtain a more flexible method that resembles the behavior of neural networks.
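To make the idea of replacing explicit regularization with iterative optimization and early stopping concrete, the sketch below (my own illustration, not code from the papers) runs plain gradient descent on the unpenalized kernel least-squares loss and simply stops after a fixed number of iterations; the bandwidth, step size, and iteration count are assumed, illustrative values.

```python
# Sketch: early-stopped gradient descent as implicit regularization
# for Gaussian kernel regression (illustrative only).
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gram matrix of the Gaussian kernel with bandwidth sigma."""
    return np.exp(-(a[:, None] - b[None, :])**2 / (2 * sigma**2))

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 80)
y = np.sin(2 * x) + 0.2 * rng.standard_normal(x.size)

sigma = 0.4                                  # assumed bandwidth
K = gaussian_kernel(x, x, sigma)

# Gradient descent on 0.5 * ||y - K @ alpha||^2; stopping early plays
# the role that the ridge penalty plays in explicitly regularized fits.
alpha = np.zeros_like(y)
step = 1.0 / np.linalg.norm(K, 2)**2         # conservative step size
for t in range(200):                         # the iteration count is the regularization knob
    alpha -= step * K.T @ (K @ alpha - y)

y_fit = K @ alpha
print("training MSE after early stopping:", np.mean((y_fit - y)**2))
```

Fewer iterations give a smoother, more heavily regularized fit; letting the iterations run converges toward the unregularized interpolant.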
In all five papers, the results are obtained by carefully selecting either the regularization strength or the bandwidth. In summary, this work contributes new statistical methods for combining flexibility with interpretability and computational efficiency, based on intelligent hyperparameter selection.
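For contrast with such tailored selection rules, the snippet below shows the standard baseline they are typically compared against: choosing the bandwidth and regularization strength of Gaussian kernel ridge regression by validation error over a grid. This is a generic sketch with assumed grids and helper names, not a method from the thesis.

```python
# Baseline sketch: validation-set grid search over bandwidth and ridge penalty
# for Gaussian kernel ridge regression (illustrative only).
import numpy as np

def krr_fit_predict(x_tr, y_tr, x_te, sigma, lam):
    """Fit Gaussian kernel ridge regression on (x_tr, y_tr), predict at x_te."""
    K = np.exp(-(x_tr[:, None] - x_tr[None, :])**2 / (2 * sigma**2))
    alpha = np.linalg.solve(K + lam * np.eye(x_tr.size), y_tr)
    K_te = np.exp(-(x_te[:, None] - x_tr[None, :])**2 / (2 * sigma**2))
    return K_te @ alpha

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(-3, 3, 100))
y = np.sin(2 * x) + 0.2 * rng.standard_normal(x.size)
x_tr, y_tr, x_va, y_va = x[::2], y[::2], x[1::2], y[1::2]   # even/odd train-validation split

best = min(
    ((sigma, lam) for sigma in [0.1, 0.3, 1.0, 3.0] for lam in [1e-3, 1e-2, 1e-1]),
    key=lambda p: np.mean((krr_fit_predict(x_tr, y_tr, x_va, *p) - y_va)**2),
)
print("selected (bandwidth, penalty):", best)
```

Each grid point requires a full refit, which is exactly the cost that computationally efficient hyperparameter selection aims to avoid.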
Parts of work
I. Allerbo, O., Jörnsten, R. (2022). Flexible, Non-parametric Modeling Using Regularized Neural Networks. Computational Statistics, 37(4), 2029-2047. https://doi.org/10.1007/s00180-021-01190-4
II. Allerbo, O., Jörnsten, R. (2021). Non-linear, Sparse Dimensionality Reduction via Path Lasso Penalized Autoencoders. The Journal of Machine Learning Research, 22(283), 1-28. https://jmlr.org/papers/v22/21-0203.html
III. Allerbo, O., Jonasson, J., Jörnsten, R. (2023). Elastic Gradient Descent, an Iterative Optimization Method Approximating the Solution Paths of the Elastic Net. The Journal of Machine Learning Research, 24(277), 1-53. https://jmlr.org/papers/v24/22-0119.html
IV. Allerbo, O., Jörnsten, R. (2023). Bandwidth Selection for Gaussian Kernel Ridge Regression via Jacobian Control. https://doi.org/10.48550/arXiv.2205.11956
V. Allerbo, O., Jörnsten, R. (2023). Solving Kernel Ridge Regression with Gradient-Based Optimization Methods. https://doi.org/10.48550/arXiv.2306.16838
Degree
Doctor of Philosophy
University
University of Gothenburg
Institution
Department of Mathematical Sciences
Disputation
Friday 22 September at 13:15, lecture hall Pascal, Mathematical Sciences, Chalmers tvärgata 3
Date of defence
2023-09-22
E-mail
allerbo@chalmers.se
Date
2023-06-30
Author
Allerbo, Oskar
Keywords
sparse regression
kernel regression
neural network regression
early stopping
bandwidth selection
Publication type
Doctoral thesis
ISBN
978-91-8069-337-0 (printed)
978-91-8069-338-7 (PDF)
Language
eng