"Surprises in High-Dimensional Ridgeless Least Squares Interpolation", 2019-03-19:
Interpolators, estimators that achieve zero training error, have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type.
In this paper, we study minimum ℓ2-norm ("ridgeless") interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors x_i ∈ ℝ^p are obtained by applying a linear transform to a vector of i.i.d. entries, x_i = Σ^{1/2} z_i (with z_i ∈ ℝ^p); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, x_i = φ(W z_i) (with z_i ∈ ℝ^d, W ∈ ℝ^{p×d} a matrix of i.i.d. entries, and φ an activation function acting component-wise on W z_i).
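As a concrete illustration (not the authors' code), the sketch below generates features under both models and computes the minimum-ℓ2-norm interpolator via the pseudoinverse. The diagonal covariance Σ, the tanh activation, and all dimensions are placeholder choices, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, d = 100, 200, 50  # overparametrized regime: p > n (illustrative sizes)

# Linear model: x_i = Sigma^{1/2} z_i with z_i having i.i.d. entries.
Sigma_half = np.diag(rng.uniform(0.5, 1.5, size=p))  # assumed diagonal Sigma^{1/2}
X_lin = rng.standard_normal((n, p)) @ Sigma_half

# Nonlinear model: x_i = phi(W z_i), W with i.i.d. entries, phi component-wise.
W = rng.standard_normal((p, d)) / np.sqrt(d)
Z = rng.standard_normal((n, d))
X_nonlin = np.tanh(Z @ W.T)  # tanh as an example activation

# Minimum-l2-norm interpolator: beta = X^+ y, i.e. X^T (X X^T)^{-1} y when p > n.
y = rng.standard_normal(n)
beta = np.linalg.pinv(X_lin) @ y

assert np.allclose(X_lin @ beta, y)  # zero training error: an interpolator
```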
We recover, in a precise quantitative way, several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk, and the potential benefits of overparametrization.
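To see the "double descent" shape empirically, one can sweep the overparametrization ratio p/n for the same estimator. The isotropic Gaussian features, linear ground truth, and noise level below are simplifying assumptions for illustration, not the paper's exact setting; the test risk typically peaks near p = n and decreases again as p grows.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_test, sigma = 100, 2000, 0.5  # illustrative sample sizes and noise level

for p in [20, 50, 90, 100, 110, 200, 400]:
    beta_star = rng.standard_normal(p) / np.sqrt(p)  # signal of unit-order norm
    X = rng.standard_normal((n, p))
    y = X @ beta_star + sigma * rng.standard_normal(n)

    # pinv gives the least squares solution for p < n and the
    # minimum-l2-norm interpolator for p > n.
    beta_hat = np.linalg.pinv(X) @ y

    # Out-of-sample prediction risk on fresh test points.
    X_test = rng.standard_normal((n_test, p))
    risk = np.mean((X_test @ (beta_hat - beta_star)) ** 2)
    print(f"p/n = {p / n:.1f}: test risk ~ {risk:.3f}")
```

Running the sweep shows the risk blowing up as p/n approaches 1 (the interpolation threshold) and descending again well past it, which is the qualitative phenomenon the paper characterizes precisely.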