GPipe is a scalable pipeline parallelism library developed by Google Brain that enables efficient training of large, memory-intensive models (Huang et al., 2018). Pipeline parallelism makes it possible to train such networks quickly even when they are too large to fit on a single accelerator.
In GPipe, a neural network composed of sequential layers is partitioned into stages, each placed on its own accelerator. Pipeline parallelism then divides each input mini-batch into smaller micro-batches, enabling different accelerators to work on different micro-batches simultaneously. This is especially useful in large-batch training.
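To make the scheduling concrete, here is a minimal, framework-free sketch of a GPipe-style forward pass: the network is a list of stage functions, the mini-batch is split into micro-batches, and at clock tick t stage s works on micro-batch t - s. All names (`pipeline_forward`, `split_into_micro_batches`, etc.) are illustrative and not part of any GPipe API; real implementations run the stages concurrently on separate accelerators rather than in a loop.

```python
# Illustrative sketch of GPipe-style pipeline parallelism in one process.
# Stages are plain functions standing in for per-accelerator partitions.

def split_into_micro_batches(mini_batch, num_micro_batches):
    """Split a mini-batch (a list of examples) into equal micro-batches."""
    size = len(mini_batch) // num_micro_batches
    return [mini_batch[i * size:(i + 1) * size]
            for i in range(num_micro_batches)]

def pipeline_forward(stages, mini_batch, num_micro_batches):
    """Run a forward pass with a GPipe-style schedule.

    At clock tick t, stage s handles micro-batch t - s, so up to
    len(stages) micro-batches are in flight at once (sequential here,
    concurrent on real hardware).
    """
    micro_batches = split_into_micro_batches(mini_batch, num_micro_batches)
    num_stages = len(stages)
    # outputs[m] holds micro-batch m's activations after its latest stage.
    outputs = list(micro_batches)
    for tick in range(num_micro_batches + num_stages - 1):
        for s in range(num_stages):
            m = tick - s  # micro-batch this stage handles at this tick
            if 0 <= m < num_micro_batches:
                outputs[m] = stages[s](outputs[m])
    # Recombine, as if the whole mini-batch had passed through at once.
    return [y for micro in outputs for y in micro]

# Example: a 4-stage "network" where each stage adds 1 to every element.
stages = [lambda xs: [x + 1 for x in xs]] * 4
print(pipeline_forward(stages, list(range(8)), num_micro_batches=4))
# -> [4, 5, 6, 7, 8, 9, 10, 11]
```

Because each stage is busy on a different micro-batch during most ticks, the idle "bubble" shrinks as the number of micro-batches grows relative to the number of stages, which is why micro-batching pays off for large mini-batches.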