Ridge Regression: Structure, Cross-Validation, and Sketching

Sifan Liu, Edgar Dobriban

Keywords: regression, regularization

Wed Session 4 (17:00-19:00 GMT) [Live QA] [Cal]
Wed Session 5 (20:00-22:00 GMT) [Live QA] [Cal]
Wednesday: Theory

Abstract: We study the following three fundamental problems about ridge regression: (1) what is the structure of the estimator? (2) how to correctly use cross-validation to choose the regularization parameter? and (3) how to accelerate computation without losing too much accuracy? We consider the three problems in a unified large-data linear model. We give a precise representation of ridge regression as a covariance matrix-dependent linear combination of the true parameter and the noise. We study the bias of $K$-fold cross-validation for choosing the regularization parameter, and propose a simple bias-correction. We analyze the accuracy of primal and dual sketching for ridge regression, showing they are surprisingly accurate. Our results are illustrated by simulations and by analyzing empirical data.

Similar Papers