Ammar Ahmad Awan
|
867da307d0
Inference Refactor (replace_with_policy, model_implementations) (#2554)
|
1 year ago |
Reza Yazdani
|
95d9a1b6c3
Fix Opt injection (#2541)
|
1 year ago |
Jeff Rasley
|
d9b788d773
tweaks to ds-attn, distilbert policy, and mup (#2649)
|
1 year ago |
Michael Wyatt
|
ccb8eb81fb
Add checkpoint sharding unit tests (#2561)
|
1 year ago |
Lev Kurilenko
|
731965db33
Fix MegatronLayerPolicy to have megatron_v2=True (#2579)
|
1 year ago |
Reza Yazdani
|
35b350b28c
Fix quantized-inference & Add generic support of checkpoint loading (#2547)
|
1 year ago |
Connor Holmes
|
e7e7595502
Stable Diffusion Enhancements (#2491)
|
1 year ago |
Jeff Rasley
|
ec13da6ba7
add SD injection policy (#2381)
|
2 years ago |
Ammar Ahmad Awan
|
993264388d
Inference profiling updates/fixes (#2348) (#2349)
|
2 years ago |
Arash Bakhtiari
|
fae896ef60
Make OPT policy backward compatible with pre-OPT transformers versions (#2254)
|
2 years ago |
Arash Bakhtiari
|
8b2a63717a
Add support of OPT models (#2205)
|
2 years ago |
Reza Yazdani
|
556f005152
Fix random token-generation issue + MP-checkpoint loading/saving (#2132)
|
2 years ago |
Alex Hedges
|
316c4a43e0
Add flake8 to pre-commit checks (#2051)
|
2 years ago |
Michael Wyatt
|
ee7ea3b805
use HF NeoX (#2087)
|
2 years ago |
Reza Yazdani
|
aa88137b8d
Add Inference support for running the BigScience-BLOOM Architecture (#2083)
|
2 years ago |
Reza Yazdani
|
8164ea9e6d
Fixing several bugs in the inference-api and the kernels (#1951)
|
2 years ago |
Jeff Rasley
|
b4fcd98ff0
Inference PP changes for neox (#1899)
|
2 years ago |
Samyam Rajbhandari
|
c13457b756
Supporting multiple modules injection with a single policy when they have identical architectures (#1869)
|
2 years ago |
Reza Yazdani
|
94de0229fb
Fix inference api & add more description on inference engine tutorial (#1711)
|
2 years ago |
Jeff Rasley
|
e46d808a1b
MoE inference + PR-MoE model support (#1705)
|
2 years ago |
Reza Yazdani
|
289c3f9ba4
GPT-J inference support (#1670)
|
2 years ago |
Jeff Rasley
|
a10e4811fe
force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598)
|
2 years ago |
Hyunwoong Ko
|
429cbc89af
Fix bugs about non-contiguous tensor broadcasting (#1168)
|
3 years ago |
Jeff Rasley
|
96eb5b12e3
delay imports for replace policies and fix missing req (#1100)
|
3 years ago |
Reza Yazdani
|
ed3de0c21b
Quantization + inference release (#1091)
|
3 years ago |