GPipe is a scalable pipeline parallelism library published by Google Brain that enables efficient training of large, memory-intensive models (Huang et al. 2018). Pipeline parallelism is one technique for fast neural network training.
In GPipe, a neural network made up of sequential layers is partitioned across accelerators, with each accelerator holding a consecutive group of layers. Each input mini-batch is then divided into smaller micro-batches, so that different accelerators can work on different micro-batches simultaneously. This is especially useful in large batch training.
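
The micro-batching idea can be illustrated with a minimal sketch in plain Python. It is not the GPipe API: the function names `split_into_micro_batches` and `forward_schedule` are hypothetical, the mini-batch is assumed to divide evenly into micro-batches, and only the forward pass is modelled (the backward pass and gradient update are left out). The sketch shows which pipeline stage, meaning which accelerator, would process which micro-batch at each clock tick.

```python
from typing import List, Sequence, Tuple


def split_into_micro_batches(mini_batch: Sequence, num_micro_batches: int) -> List[Sequence]:
    """Split a mini-batch into equally sized micro-batches (hypothetical helper)."""
    if len(mini_batch) % num_micro_batches != 0:
        raise ValueError("mini-batch size must be divisible by num_micro_batches")
    size = len(mini_batch) // num_micro_batches
    return [mini_batch[i * size:(i + 1) * size] for i in range(num_micro_batches)]


def forward_schedule(num_stages: int, num_micro_batches: int) -> List[List[Tuple[int, int]]]:
    """Return, for each clock tick, the (stage, micro_batch) pairs that run concurrently.

    Assumption: every stage takes one tick per micro-batch, so stage s can
    start micro-batch m at tick s + m, once stage s - 1 has finished it.
    """
    ticks = []
    for t in range(num_stages + num_micro_batches - 1):
        active = [
            (stage, t - stage)
            for stage in range(num_stages)
            if 0 <= t - stage < num_micro_batches
        ]
        ticks.append(active)
    return ticks


if __name__ == "__main__":
    # A mini-batch of 8 samples split into 4 micro-batches of 2 samples each.
    print(split_into_micro_batches(list(range(8)), num_micro_batches=4))

    # Three stages (accelerators) processing four micro-batches.
    for tick, active in enumerate(forward_schedule(num_stages=3, num_micro_batches=4)):
        print(tick, active)
```

With 3 stages and 4 micro-batches, the schedule spans 3 + 4 - 1 = 6 ticks, and all three stages are busy only during the middle ticks. Under the same one-tick-per-stage assumption, using more micro-batches per mini-batch shrinks the share of time that stages spend idle while the pipeline fills and drains.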