RNNs Incrementally Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?

Anil Kag; Ziming Zhang; Venkatesh Saligrama

RNNs Incrementally Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?

Anil Kag, Ziming Zhang, Venkatesh Saligrama

Keywords: rnn

Abstract Paper Reviews Chat

Mon Session 1 (05:00-07:00 GMT) [Live QA] [Cal]

Mon Session 2 (08:00-10:00 GMT) [Live QA] [Cal]

Abstract: Recurrent neural networks (RNNs) are particularly well-suited for modeling long-term dependencies in sequential data, but are notoriously hard to train because the error backpropagated in time either vanishes or explodes at an exponential rate. While a number of works attempt to mitigate this effect through gated recurrent units, skip-connections, parametric constraints and design choices, we propose a novel incremental RNN (iRNN), where hidden state vectors keep track of incremental changes, and as such approximate state-vector increments of Rosenblatt's (1962) continuous-time RNNs. iRNN exhibits identity gradients and is able to account for long-term dependencies (LTD). We show that our method is computationally efficient overcoming overheads of many existing methods that attempt to improve RNN training, while suffering no performance degradation. We demonstrate the utility of our approach with extensive experiments and show competitive performance against standard LSTMs on LTD and other non-LTD tasks.

RNNs Incrementally Evolving on an Equilibrium Manifold: A Panacea for Vanishing and Exploding Gradients?

Anil Kag, Ziming Zhang, Venkatesh Saligrama

Similar Papers

Training Recurrent Neural Networks Online by Learning Explicit State Variables

Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White,

Improved memory in recurrent neural networks with sequential non-normal dynamics

Emin Orhan, Xaq Pitkow,

Efficient and Information-Preserving Future Frame Prediction and Beyond

Wei Yu, Yichao Lu, Steve Easterbrook, Sanja Fidler,

One-Shot Pruning of Recurrent Neural Networks by Jacobian Spectrum Evaluation

Shunshi Zhang, Bradly C. Stadie,