Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks

Alejandro Molina; Patrick Schramowski; Kristian Kersting

Keywords: robustness

Thurs Session 2 (08:00-10:00 GMT)

Thurs Session 5 (20:00-22:00 GMT)

Abstract: The performance of deep networks strongly depends on the choice of the non-linear activation function associated with each neuron. However, deciding on the best activation is non-trivial, and the choice depends on the architecture, the hyper-parameters, and even the dataset. Typically, these activations are fixed by hand before training. Here, we demonstrate how to eliminate the reliance on first picking fixed activation functions by using flexible parametric rational functions instead. The resulting Padé Activation Units (PAUs) can both approximate common activation functions and learn new ones while providing compact representations. Our empirical evidence shows that learning deep networks with PAUs end-to-end can increase predictive performance. Moreover, PAUs pave the way to approximations with provable robustness.
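
As a rough illustration of the idea of a parametric rational activation, the sketch below evaluates a ratio of two polynomials and checks that one choice of coefficients closely tracks tanh near zero. The function name `pau`, the coefficient layout, and the denominator form 1 + |b_1 x + ... + b_n x^n| (chosen here so the denominator cannot reach zero) are assumptions made for this example and are not stated in the abstract. In a network the coefficients would be trainable parameters; here they are fixed to the [3/2] Padé approximant of tanh.

```python
import math

def pau(x, a, b):
    """Evaluate a rational activation P(x) / Q(x) at a scalar x.

    a: numerator coefficients a_0..a_m, lowest degree first.
    b: denominator coefficients b_1..b_n, lowest degree first.
    Q(x) = 1 + |b_1 x + ... + b_n x^n| is an assumed pole-free form.
    """
    p = sum(a_j * x ** j for j, a_j in enumerate(a))
    q = 1.0 + abs(sum(b_k * x ** (k + 1) for k, b_k in enumerate(b)))
    return p / q

# Coefficients of the [3/2] Pade approximant of tanh around 0:
# tanh(x) ~ (x + x^3 / 15) / (1 + 2 x^2 / 5)
A_TANH = [0.0, 1.0, 0.0, 1.0 / 15.0]
B_TANH = [0.0, 2.0 / 5.0]

# Compare the rational function with tanh at a few points.
for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(f"x={x:+.1f}  pau={pau(x, A_TANH, B_TANH):+.5f}  tanh={math.tanh(x):+.5f}")
```

With `a = [0.0, 1.0]` and an empty `b`, the same function reduces to the identity, which shows how a single parametric family can cover several familiar shapes.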

Similar Papers

BinaryDuo: Reducing Gradient Mismatch in Binary Activation Network by Coupling Binary Activations

Hyungjun Kim, Kyungsu Kim, Jinseok Kim, Jae-Joon Kim

Effect of Activation Functions on the Training of Overparametrized Neural Nets

Abhishek Panigrahi, Abhishek Shetty, Navin Goyal

Piecewise linear activations substantially shape the loss surfaces of neural networks

Fengxiang He, Bohan Wang, Dacheng Tao

Enhancing Adversarial Defense by k-Winners-Take-All

Chang Xiao, Peilin Zhong, Changxi Zheng