Adversarially robust transfer learning

Ali Shafahi; Parsa Saadatpanah; Chen Zhu; Amin Ghiasi; Christoph Studer; David Jacobs; Tom Goldstein

Abstract: Transfer learning, in which a network is trained on one task and re-purposed on another, is often used to produce neural network classifiers when data is scarce or full-scale training is too costly. When the goal is to produce a model that is not only accurate but also adversarially robust, data scarcity and computational limitations become even more cumbersome. We consider robust transfer learning, in which we transfer not only performance but also robustness from a source model to a target domain. We start by observing that robust networks contain robust feature extractors. By training classifiers on top of these feature extractors, we produce new models that inherit the robustness of their parent networks. We then consider the case of "fine tuning" a network by re-training end-to-end in the target domain. When using lifelong learning strategies, this process preserves the robustness of the source network while achieving high accuracy. By using such strategies, it is possible to produce accurate and robust models with little data, and without the cost of adversarial training. Additionally, we can improve the generalization of adversarially trained models, while maintaining their robustness.
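The first strategy in the abstract, training a new classifier on top of a fixed feature extractor, can be illustrated with a minimal sketch. This is not the authors' implementation: `frozen_features` is a hypothetical stand-in for a robust source network, the toy target task is invented for illustration, and no adversarial training or robustness evaluation is performed. It only shows the mechanics of updating the classification head while the extractor's weights stay untouched.

```python
import math
import random

# Hypothetical stand-in for the feature extractor of a robust source network.
# Its weights are fixed and are never updated during transfer.
FROZEN_W = [[1.0, -0.5], [0.3, 0.8], [-0.7, 0.2]]

def frozen_features(x):
    """Map an input to features using the fixed (frozen) weights."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]

def train_head(data, epochs=200, lr=0.5):
    """Train only a logistic-regression head on top of the frozen features."""
    head_w = [0.0] * len(FROZEN_W)
    head_b = 0.0
    # Features can be computed once, because the extractor never changes.
    feats = [(frozen_features(x), y) for x, y in data]
    for _ in range(epochs):
        for f, y in feats:
            z = sum(w * fi for w, fi in zip(head_w, f)) + head_b
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # gradient of the log loss with respect to z
            head_w = [w - lr * grad * fi for w, fi in zip(head_w, f)]
            head_b -= lr * grad
    return head_w, head_b

def predict(head, x):
    """Classify an input with the frozen extractor plus the trained head."""
    head_w, head_b = head
    f = frozen_features(x)
    return 1 if sum(w * fi for w, fi in zip(head_w, f)) + head_b > 0 else 0

# Toy target task (an assumption for illustration): a small labeled dataset.
random.seed(0)
data = []
for _ in range(40):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    data.append((x, 1 if x[0] - 0.5 * x[1] > 0 else 0))

head = train_head(data)
accuracy = sum(predict(head, x) == y for x, y in data) / len(data)
print(f"training accuracy of the new head: {accuracy:.2f}")
```

In a realistic setting the frozen extractor would be the early layers of an adversarially trained network, and the robustness of the resulting model would have to be measured separately against an attack on the target domain.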

Similar Papers

Jacobian Adversarially Regularized Networks for Robustness

Alvin Chan, Yi Tay, Yew Soon Ong, Jie Fu

Robust Local Features for Improving the Generalization of Adversarial Training

Chuanbiao Song, Kun He, Jiadong Lin, Liwei Wang, John E. Hopcroft

Fast is better than free: Revisiting adversarial training

Eric Wong, Leslie Rice, J. Zico Kolter

Intriguing Properties of Adversarial Training at Scale

Cihang Xie, Alan Yuille