Jeff Rasley | 9878c95836 | edits | 2 years ago
Reza Yazdani | 33549ebf7f | fix formatting | 2 years ago
Reza Yazdani | b41fa35f41 | fix some minor issues | 2 years ago
Reza Yazdani | e48cf49cdd | add moe-inference tutorial | 2 years ago
Jeff Rasley | e46d808a1b | MoE inference + PR-MoE model support (#1705) | 2 years ago
Jeff Rasley | 3293cf72a0 | [ZeRO] Default disable elastic ckpt in stage 1+2 and reduce CPU memory overhead during ckpt load (#1525) | 2 years ago
Jeff Rasley | e4cf40d617 | force clear stashed tensors (#1698) | 2 years ago
liamcli | fead387f78 | support module and no python args for launcher (#1690) | 2 years ago
Jeff Rasley | a85dce0728 | add -lcurand to fix torch-nightly issue w. JIT (#1688) | 2 years ago
Jeff Rasley | 3a4cb04243 | [docs] switch to transparent dark logo | 2 years ago
Reza Yazdani | 762e697a03 | fix the half-precision version of rotary_pos_emb kernel (#1683) | 2 years ago
Reza Yazdani | 289c3f9ba4 | GPT-J inference support (#1670) | 2 years ago
Jeff Rasley | 7e857aab9a | [docs] add gh-dark-mode logo | 2 years ago
Jeff Rasley | 9c5cf3a5d4 | [docs] add light-mode logo | 2 years ago
Jeff Rasley | 2422ec4885 | add segfault guard for cpu-adam/adagrad (#1681) | 2 years ago
Olatunji Ruwase | cef116f82c | Copy grads to cpu in z1-offload (#1679) | 2 years ago
Jeff Rasley | c2735996c0 | [docs] add logo (#1676) | 2 years ago
Conglong Li | aca647991f | update results about public Pile dataset (#1675) | 2 years ago
Olatunji Ruwase | 4354c3cc67 | Fix largest param numel calculation (#1623) | 2 years ago
Victor | 74493b2bee | support CPU Adam and Adagrad on Windows with SDK 10.0.22000 (#1634) | 2 years ago
Jeff Rasley | b6f0ac97ae | bump to 0.5.10 | 2 years ago
Manuel R. Ciosici | d0ab722427 | Various small documentation text improvements (#1665) | 2 years ago
Reza Yazdani | 559c4ce11a | Convert the fp16_params to group of parameters (#1651) | 2 years ago
Manuel R. Ciosici | 40ce131caa | Replace calls to print() with calls to logger (#1664) | 2 years ago
Stas Bekman | 317400eafc | [save_fp16_model] return status (#1663) | 2 years ago
Jeff Rasley | d93d924a77 | follow-up to #1652, resolved a100-80gb issue (#1655) | 2 years ago
Jeff Rasley | cbd68dc480 | add backup cpu-arch detection if py-cpuinfo fails (#1652) | 2 years ago
Conglong Li | 752319c782 | New feature contribution guideline (#1646) | 2 years ago
Alex Hedges | 8bbf081ad8 | Add torchvision to requirements-dev.txt (#1642) | 2 years ago
Minjia Zhang | f2c433f03e | Updating autotuner readme file to add hyperparameter adjustment suggestions (#1641) | 2 years ago