| File | Commit | Message | Last updated |
|------|--------|---------|--------------|
| containers | 4fc2c8e7d5 | Fix llama meta tensor loading in AutoTP and kernel injected inference (#3608) | 1 year ago |
| __init__.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago |
| auto_tp.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago |
| auto_tp_model_utils.py | 5e16eb2c93 | enable autoTP for mpt in huggingface model hub without trust_remote_code (#4062) | 1 year ago |
| fusedqkv_utils.py | 042115c80b | Fix fused qkv sizing for bloom (#4161) | 1 year ago |
| inject.py | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago |
| layers.py | ad661b8e35 | Remove print of weight parameter in RMS norm (#4031) | 1 year ago |
| load_checkpoint.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago |
| module_quantize.py | 430510bfce | Checks for user injection policy (#3052) | 1 year ago |
| policy.py | 0a61d5d664 | Hybrid Engine Refactor and Llama Inference Support (#3425) | 1 year ago |
| replace_module.py | b9d719a6d3 | Pass base_dir to model files can be loaded for auto-tp/meta-tensor. (#4348) | 1 year ago |
| replace_policy.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago |
| utils.py | 468882fb68 | Add the policy to run llama model from the official repo (#4313) | 1 year ago |