Functional Regularisation for Continual Learning with Gaussian Processes

Michalis K. Titsias, Jonathan Schwarz, Alexander G. de G. Matthews, Razvan Pascanu, Yee Whye Teh

Keywords: bayesian inference, continual learning, distributed, gaussian processes, incremental learning, lifelong learning

Thurs Session 1 (05:00-07:00 GMT)
Thurs Session 2 (08:00-10:00 GMT)

Abstract: We introduce a framework for Continual Learning (CL) based on Bayesian inference over the function space rather than the parameters of a deep neural network. This method, referred to as functional regularisation for Continual Learning, avoids forgetting previous tasks by constructing and memorising an approximate posterior belief over each underlying task-specific function. To achieve this we rely on a Gaussian process obtained by treating the weights of the last layer of a neural network as random and Gaussian distributed. The training algorithm then encounters tasks sequentially and constructs posterior beliefs over the task-specific functions using inducing-point sparse Gaussian process methods. At each step a new task is first learnt and then a summary is constructed consisting of (i) inducing inputs, a fixed-size subset of the task inputs selected so as to optimally represent the task, and (ii) a posterior distribution over the function values at these inputs. This summary then regularises the learning of future tasks through Kullback-Leibler regularisation terms. Our method thus unites approaches focused on (pseudo-)rehearsal with those derived from a sequential Bayesian inference perspective in a principled way, leading to strong results on standard benchmarks.
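To make the KL regularisation terms concrete, below is a minimal NumPy sketch, not the authors' implementation: the names `feature_fn`, `frcl_regulariser`, and `memories`, the jitter constant, and the direction of the KL are illustrative assumptions based on one plausible reading of the abstract. Since the last-layer weights w are Gaussian, the function values at stored inducing inputs Z are also Gaussian, f(Z) = Phi(Z) w ~ N(Phi mu_w, Phi Sigma_w Phi^T), and each past task contributes a KL penalty between this induced distribution and the memorised summary.

```python
import numpy as np

def gaussian_kl(mu_q, cov_q, mu_p, cov_p):
    """KL( N(mu_q, cov_q) || N(mu_p, cov_p) ) between full-covariance Gaussians."""
    k = mu_q.shape[0]
    cov_p_inv = np.linalg.inv(cov_p)
    diff = mu_p - mu_q
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (np.trace(cov_p_inv @ cov_q) + diff @ cov_p_inv @ diff - k
                  + logdet_p - logdet_q)

def frcl_regulariser(feature_fn, w_mean, w_cov, memories):
    """Sum of KL terms tying the current function to each stored task summary.

    feature_fn    : shared network body mapping inputs to last-layer features
    w_mean, w_cov : current Gaussian belief over the last-layer weights
    memories      : list of (Z, mu_mem, cov_mem) summaries from past tasks
    """
    reg = 0.0
    for Z, mu_mem, cov_mem in memories:
        Phi = feature_fn(Z)                    # (m, d) features at the inducing inputs
        mu_f = Phi @ w_mean                    # induced mean of f(Z) = Phi w
        cov_f = Phi @ w_cov @ Phi.T            # induced covariance of f(Z)
        cov_f = cov_f + 1e-6 * np.eye(len(Z))  # jitter for numerical stability
        reg += gaussian_kl(mu_f, cov_f, mu_mem, cov_mem)
    return reg

# Toy usage with a stand-in network body and a single stored task summary.
rng = np.random.default_rng(0)
feature_fn = lambda X: np.tanh(X @ rng.normal(size=(2, 5)))
Z = rng.normal(size=(4, 2))                    # 4 inducing inputs in 2-D
mu_mem, cov_mem = np.zeros(4), np.eye(4)       # memorised posterior at Z
print(frcl_regulariser(feature_fn, rng.normal(size=5),
                       np.eye(5), [(Z, mu_mem, cov_mem)]))
```

In the full method such a penalty, one term per past task, would be added to the variational objective while training on the current task, so that the network body is free to change only insofar as it preserves the memorised beliefs over f(Z) for earlier tasks; the paper should be consulted for the exact objective.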

Similar Papers

Continual Learning with Bayesian Neural Networks for Non-Stationary Data
Richard Kurle, Botond Cseke, Alexej Klushyn, Patrick van der Smagt, Stephan Günnemann
Continual Learning with Adaptive Weights (CLAW)
Tameem Adel, Han Zhao, Richard E. Turner
Empirical Bayes Transductive Meta-Learning with Synthetic Gradients
Shell Xu Hu, Pablo Moreno, Yang Xiao, Xi Shen, Guillaume Obozinski, Neil Lawrence, Andreas Damianou
A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning
Soochan Lee, Junsoo Ha, Dongsu Zhang, Gunhee Kim