Elsa Granger 4fc2c8e7d5 Fix llama meta tensor loading in AutoTP and kernel injected inference (#3608) 1 year ago
containers 4fc2c8e7d5 Fix llama meta tensor loading in AutoTP and kernel injected inference (#3608) 1 year ago
__init__.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
auto_tp.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago
auto_tp_model_utils.py 5e16eb2c93 enable autoTP for mpt in huggingface model hub without trust_remote_code (#4062) 1 year ago
fusedqkv_utils.py 042115c80b Fix fused qkv sizing for bloom (#4161) 1 year ago
inject.py b361c72761 Update DeepSpeed copyright license to Apache 2.0 (#3111) 1 year ago
layers.py ad661b8e35 Remove print of weight parameter in RMS norm (#4031) 1 year ago
load_checkpoint.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago
module_quantize.py 430510bfce Checks for user injection policy (#3052) 1 year ago
policy.py 0a61d5d664 Hybrid Engine Refactor and Llama Inference Support (#3425) 1 year ago
replace_module.py b9d719a6d3 Pass base_dir to model files can be loaded for auto-tp/meta-tensor. (#4348) 1 year ago
replace_policy.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago
utils.py 468882fb68 Add the policy to run llama model from the official repo (#4313) 1 year ago