Yejing-Lai 8ea995ee1f enable yuan autotp & add conv tp (#5428) 4 months ago
containers d9e12d3a68 Fix attention mask handling in the Hybrid Engine Bloom flow (#5101) 7 months ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
auto_tp.py 8ea995ee1f enable yuan autotp & add conv tp (#5428) 4 months ago
auto_tp_model_utils.py c20f6fa4e0 support baichuan model: (#4721) 10 months ago
fusedqkv_utils.py 8ea995ee1f enable yuan autotp & add conv tp (#5428) 4 months ago
inject.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
layers.py 8ea995ee1f enable yuan autotp & add conv tp (#5428) 4 months ago
load_checkpoint.py 567f97b264 load linear layer weight with given dtype (#4044) 8 months ago
module_quantize.py 430510bfce Checks for user injection policy (#3052) 1 year ago
policy.py 0a61d5d664 Hybrid Engine Refactor and Llama Inference Support (#3425) 1 year ago
replace_module.py 8ea995ee1f enable yuan autotp & add conv tp (#5428) 4 months ago
replace_policy.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago
tp_shard.py 3dd7ccff81 enable phi3_mini autotp (#5501) 5 months ago
utils.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago