Improved Sample Complexities for Deep Neural Networks and Robust Classification via an All-Layer Margin

Colin Wei; Tengyu Ma

Keywords: adversarial, deep learning theory, generalization

Tues Session 4 (17:00-19:00 GMT)

Tues Session 5 (20:00-22:00 GMT)

Abstract: For linear classifiers, the relationship between (normalized) output margin and generalization is captured in a clear and simple bound: a large output margin implies good generalization. Unfortunately, for deep models, this relationship is less clear: existing analyses of the output margin give complicated bounds which sometimes depend exponentially on depth. In this work, we propose to instead analyze a new notion of margin, which we call the “all-layer margin.” Our analysis reveals that the all-layer margin has a clear and direct relationship with generalization for deep models. This enables the following concrete applications of the all-layer margin: 1) by analyzing the all-layer margin, we obtain tighter generalization bounds for neural nets which depend on Jacobian and hidden layer norms and remove the exponential dependency on depth; 2) our neural net results easily translate to the adversarially robust setting, giving the first direct analysis of robust test error for deep networks; and 3) we present a theoretically inspired training algorithm for increasing the all-layer margin. Our algorithm improves both clean and adversarially robust test performance over strong baselines in practice.
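
For readers unfamiliar with the linear-classifier baseline the abstract starts from, the following is a minimal sketch of a normalized output margin. It is an illustration only, not code from the paper: it assumes binary labels in {-1, +1} and normalization by the Euclidean norm of the weight vector (the abstract does not say which norm is used), and the function names `normalized_margin` and `min_normalized_margin` are hypothetical. It does not implement the all-layer margin.

```python
import math


def normalized_margin(w, x, y):
    """Return y * <w, x> / ||w||_2 for one example.

    Assumptions (not from the source): y is -1 or +1, and the
    margin is normalized by the Euclidean norm of w.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    if norm == 0.0:
        raise ValueError("weight vector must be nonzero")
    score = sum(wi * xi for wi, xi in zip(w, x))
    return y * score / norm


def min_normalized_margin(w, data):
    """Smallest normalized margin over a dataset of (x, y) pairs."""
    return min(normalized_margin(w, x, y) for x, y in data)


# Toy data: three 2-D points with labels in {-1, +1}.
w = [3.0, 4.0]
data = [([1.0, 1.0], 1), ([-2.0, 0.0], -1), ([0.0, 0.5], 1)]

margin = min_normalized_margin(w, data)

# Rescaling w leaves the normalized margin unchanged, which is why
# the normalization matters when comparing classifiers.
scaled_margin = min_normalized_margin([10.0 * wi for wi in w], data)
```

A positive minimum margin means every training point is classified correctly; the classical linear bounds the abstract refers to improve as this value grows relative to the scale of the inputs.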
