Speeding Up the Training of Deep Neural Networks
BytePS (Jiang et al., OSDI 2020)
BytePS provides a unifying framework that subsumes both the all-reduce and parameter-server architectures and shows its communication strategy is optimal. It also optimizes intra-machine communication and introduces a "Summation Service," which accelerates DNN training by running gradient summation on CPUs (a cheap, bandwidth-bound operation) while keeping the parameter updates (the optimizer step) on GPUs.
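The CPU/GPU split above can be sketched as follows. This is a minimal, hypothetical simulation (plain NumPy, not the real BytePS API): a "Summation Service" function stands in for the CPU-side gradient sum, and a worker-side function stands in for the GPU optimizer step; all names, the learning rate, and the SGD update rule are illustrative assumptions.

```python
import numpy as np

def summation_service(worker_grads):
    """CPU side (simulated): element-wise sum of gradients from all workers.

    Summation is cheap and bandwidth-bound, so BytePS places it on CPUs.
    """
    return np.sum(worker_grads, axis=0)

def worker_update(params, summed_grad, num_workers, lr=0.1):
    """GPU side (simulated): average the summed gradient, apply an SGD step.

    The optimizer step stays on the GPU workers in the BytePS design.
    """
    return params - lr * summed_grad / num_workers

# Toy run with 3 workers and a 4-element parameter tensor.
params = np.zeros(4)
grads = [np.ones(4) * (i + 1) for i in range(3)]  # worker i sends all (i+1)s
summed = summation_service(grads)                  # CPU sums to [6, 6, 6, 6]
params = worker_update(params, summed, num_workers=3)
print(params)  # -> [-0.2 -0.2 -0.2 -0.2]
```

The point of the split is that summation has low arithmetic intensity, so offloading it to CPUs frees GPU cycles for the compute-heavy optimizer and forward/backward passes.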