Author | SHA1 | Message | Date |
---|---|---|---|
Jeff Rasley | 41db1c2f03 | ZeRO-Offload release (#391) | 4 years ago |
Jeff Rasley | abe2204ddd | Support fp32 grad clipping and fix max_grad_norm confusion (#232) | 4 years ago |
Jeff Rasley | f2ac7eafd5 | ZeRO-2 (#217) | 4 years ago |
Olatunji Ruwase | 6d60206586 | Support legacy optimizer fusion as config option (#75) | 4 years ago |
Elton Zheng | 98f5131bb6 | add test model Megatron_GPT2 | 4 years ago |