---
layout: single
title: "Training a Trillion Parameters with Pipeline Parallelism"
excerpt: ""
categories: news
new_post: true
---
DeepSpeed includes new support for pipeline parallelism! DeepSpeed's training engine provides hybrid 3D parallelism, which combines data, model, and pipeline parallelism, for training models with over a trillion parameters. In addition to scaling to the extreme, we have demonstrated that hybrid parallelism accelerates training on clusters with low-bandwidth networks by up to 7x.
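To make the idea of 3D parallelism concrete, here is a minimal sketch of how a flat worker rank could be placed on a grid whose three axes are the pipeline, data, and model parallel degrees. This is an illustration only and does not use DeepSpeed's API: the `Coord` type, the `rank_to_coord` function, and the axis ordering (model index varying fastest) are all assumptions made for this example.

```python
from typing import NamedTuple


class Coord(NamedTuple):
    """Hypothetical position of one worker in a 3D parallel grid."""
    pipe: int   # pipeline stage index
    data: int   # data-parallel replica index
    model: int  # model-parallel (tensor slicing) index


def rank_to_coord(rank: int, pipe_degree: int, data_degree: int,
                  model_degree: int) -> Coord:
    """Map a flat rank to grid coordinates.

    Assumed layout for illustration: the model index varies fastest,
    then the data index, then the pipeline index.
    """
    world_size = pipe_degree * data_degree * model_degree
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} outside world of size {world_size}")
    model = rank % model_degree
    data = (rank // model_degree) % data_degree
    pipe = rank // (model_degree * data_degree)
    return Coord(pipe=pipe, data=data, model=model)


if __name__ == "__main__":
    # 4 pipeline stages x 2 data replicas x 2 model slices = 16 workers.
    for rank in range(16):
        print(rank, rank_to_coord(rank, 4, 2, 2))
```

The point of the sketch is that the total worker count is the product of the three degrees, so each kind of parallelism can stay modest while the combined job still spans many devices.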