Jethro's Braindump

Surrogate Gradient Learning In Spiking Neural Networks


Spiking Neural Networks enable power-efficient network models, which have become increasingly important in embedded and automotive applications. The power efficiency stems from dispensing with expensive floating-point computations.

Surrogate gradient methods overcome the difficulties associated with the discontinuous non-linearity. Rather than changing the neuronal model (Smoothed Spiking Neural Networks), surrogate gradients are introduced to allow for numerical optimisation.
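A minimal sketch of the idea, assuming a Heaviside spike non-linearity with threshold `theta` and the derivative of a fast sigmoid as the surrogate (function names and the sharpness parameter `beta` are illustrative, not from the paper):

```python
import numpy as np

def spike_forward(v, theta=1.0):
    """Forward pass: the non-differentiable Heaviside spike function."""
    return (v >= theta).astype(float)

def spike_surrogate_grad(v, theta=1.0, beta=10.0):
    """Backward pass: derivative of a fast sigmoid, used as a surrogate
    for the true derivative, which is zero almost everywhere."""
    return 1.0 / (1.0 + beta * np.abs(v - theta)) ** 2

v = np.array([0.0, 0.9, 1.0, 1.1, 2.0])
spikes = spike_forward(v)        # hard 0/1 spikes in the forward pass
grads = spike_surrogate_grad(v)  # smooth, non-zero values for backprop
```

The forward pass keeps the exact spiking behaviour; only the backward pass is replaced, so the surrogate never changes what the network computes, only how its parameters are optimised.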

Surrogate gradients can also reduce the memory-access overhead of the learning process. For example, a global loss can be replaced by a number of local loss functions. Surrogate gradient methods also allow for end-to-end training without specifying a coding scheme in the hidden layers.

Many different surrogate functions are in use, and all reportedly achieve some success (Neftci, Mostafa, and Zenke, n.d.). All of them are non-linear and monotonically increasing towards the firing threshold. This suggests that the precise shape of the surrogate is not crucial to the success of the method.
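To illustrate the shared shape, here is a hedged sketch comparing the derivatives of three commonly used surrogates (fast sigmoid, sigmoid, and a piecewise-linear "triangle"); the function names and scale parameter `beta` are my own choices for illustration:

```python
import numpy as np

# Each function takes x = v - theta, the membrane potential measured
# relative to the firing threshold, and returns a surrogate derivative.
def fast_sigmoid_grad(x, beta=10.0):
    return 1.0 / (1.0 + beta * np.abs(x)) ** 2

def sigmoid_grad(x, beta=10.0):
    s = 1.0 / (1.0 + np.exp(-beta * x))
    return beta * s * (1.0 - s)

def triangle_grad(x, beta=1.0):
    return np.maximum(0.0, 1.0 - beta * np.abs(x))

x = np.linspace(-2.0, 2.0, 401)
curves = {f.__name__: f(x) for f in
          (fast_sigmoid_grad, sigmoid_grad, triangle_grad)}
```

Despite differing functional forms, each curve is non-negative and peaks at the firing threshold (x = 0), which is consistent with the observation that the details of the surrogate matter little.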


Neftci, Emre O., Hesham Mostafa, and Friedemann Zenke. n.d. “Surrogate Gradient Learning in Spiking Neural Networks.”
