Ammar Ahmad Awan
|
b9d719a6d3
Pass base_dir to model files can be loaded for auto-tp/meta-tensor. (#4348)
|
1 year ago |
Satpal Singh Rathore
|
430510bfce
Checks for user injection policy (#3052)
|
1 year ago |
Dino Chen
|
0712e29920
add meta onDevice support for LLAMA2 (#4147)
|
1 year ago |
digger yu
|
4cde5da88e
fix typo: change polciies to policies (#4090)
|
1 year ago |
Lev Kurilenko
|
1ba4098918
Fix Stable Diffusion Injection (#4078)
|
1 year ago |
Molly Smith
|
94c7233a8b
Refactor autoTP inference for HE (#4040)
|
1 year ago |
mzl
|
6b877d2dbc
autoTP for fused qkv weight (#3844)
|
1 year ago |
Wang, Yi
|
0bafeac491
enable autoTP for MPT (#3861)
|
1 year ago |
Wang, Yi
|
76953a37b7
fix opt-350m shard loading issue in AutoTP (#3600)
|
1 year ago |
Dino Chen
|
f3943cf910
add llama2 autoTP support in replace_module (#4022)
|
1 year ago |
digger yu
|
ce535945e6
fix: change ==NONE to is (#3923)
|
1 year ago |
Reza Yazdani
|
f3c93b056d
Add FALCON Auto-TP Support (#3640)
|
1 year ago |
Ma, Guokai
|
1f72082fc0
[CPU] Support Intel CPU inference (#3041)
|
1 year ago |
Wang, Yi
|
b31b46c0d1
fix regression in shard checkpoint loading in AutoTP Path caused by qkv_copy() is deleted and add UT case for shard checkpoint loading in AutoTP (#3457)
|
1 year ago |
Wang, Yi
|
d10b8ca011
add sharded checkpoint loading for AutoTP path to reduce the peak mem… (#3102)
|
1 year ago |
Connor Holmes
|
0a61d5d664
Hybrid Engine Refactor and Llama Inference Support (#3425)
|
1 year ago |
Reza Yazdani
|
3e8564645d
Add HE support for the rest of model containers (#3191)
|
1 year ago |
Molly Smith
|
496a9a3a62
Diffusers 0.15.0 bug fix (#3345)
|
1 year ago |
Olatunji Ruwase
|
47f9f13bd3
DeepSpeed Chat (#3186)
|
1 year ago |
Michael Wyatt
|
b361c72761
Update DeepSpeed copyright license to Apache 2.0 (#3111)
|
1 year ago |
Jeff Rasley
|
91d63e0228
update formatter version and style settings (#3098)
|
1 year ago |
Molly Smith
|
9ea0fdc2ce
Assert mp_size is factor of model dimensions (#2891)
|
1 year ago |
Heyang Qin
|
dc01cee5ca
using container when loading inference checkpoints (#2875)
|
1 year ago |
Jeff Rasley
|
da84e60d98
add missing license info to top of all source code (#2889)
|
1 year ago |
Lev Kurilenko
|
fd1449c766
Port Reza's INT8-quantization fix to container architecture (#2725)
|
1 year ago |
Molly Smith
|
46784cb58e
Fix auto TP for duplicate modules with different gems (#2784)
|
1 year ago |
Lev Kurilenko
|
10f3c301a0
Add container load checkpoint error reporting + refactor (#2792)
|
1 year ago |
Lev Kurilenko
|
0a73e6e613
Container param cleanup + remove qkv_merging (#2780)
|
1 year ago |
Reza Yazdani
|
9f41ffe4a6
Reset KV-cache at the beginning of text-generation (#2669)
|
1 year ago |
Ma, Guokai
|
98cc35b6a8
Abstract accelerator (step 3) (#2677)
|
1 year ago |