Stas Bekman
|
fa8d6c0a54
[build] support cuda-11.5 (#1558)
|
2 years ago |
Olatunji Ruwase
|
7567c76c05
Update offload parameter names (#1536)
|
2 years ago |
Cheng Li
|
9caa74e577
Autotuning (#1554)
|
2 years ago |
Manuel R. Ciosici
|
b7cc7c8ea7
Add documentation for bfloat16 (git commit 648f7bfa5009484b822064d0c28d377da6dd71a0) (#1516)
|
2 years ago |
Olatunji Ruwase
|
488105ebd2
Fix zinf none swapper (#1550)
|
2 years ago |
Baizhou Huang
|
76847f42cf
Add warmup_type arguments in WarmupLR and WarmupDecayLR (#1530)
|
2 years ago |
Reza Yazdani
|
3ed77304df
Fix sparse attention for small block-sizes (#1545)
|
2 years ago |
Reza Yazdani
|
9ce00a2171
Tensor-Parallelism general support (#1512)
|
2 years ago |
Conglong Li
|
b16dd943a4
backward compatibility (#1549)
|
2 years ago |
Jeff Rasley
|
fa9d3e8452
bump to 0.5.7
|
2 years ago |
Jeff Rasley
|
2665c8b149
Fix 1bit extra issue (#1542)
|
2 years ago |
Olatunji Ruwase
|
bd3ebddf36
Use cuda tensors for allgather (#1548)
|
2 years ago |
Reza Yazdani
|
af443f63f4
CPU-Adam: Fix compile Issue (#1537)
|
2 years ago |
Chunyang Wen
|
f0122007df
Modify inference engine (#1520)
|
3 years ago |
Jeff Rasley
|
0af15b985d
[unit tests] allow unique port for tests
|
3 years ago |
Chunyang Wen
|
93c71831c7
fstr for multnode_runner (#1532)
|
3 years ago |
Stas Bekman
|
76f2b5e51d
[docs] fix 404 (#1531)
|
3 years ago |
Nathan Frey
|
2c62d657a4
typo in profiler.py (#1527)
|
3 years ago |
alexandremuzio
|
2887349cd4
Adding Tutel to MoE layer (#1528)
|
3 years ago |
Chunyang Wen
|
cf1f16016f
Use fstr in launcher (#1521)
|
3 years ago |
Cheng Li
|
f9b378012e
make conv flops counting general for 1,2,3d (#1518)
|
3 years ago |
Jeff Rasley
|
426dd2b5e4
bump to 0.5.6
|
3 years ago |
Alex Hedges
|
91defd7cfe
Prevent creation of local temp directory (#1494)
|
3 years ago |
Chunyang Wen
|
df5b0884c7
Unify use f str (#1511)
|
3 years ago |
Stas Bekman
|
bf1725bb57
[code readability] pipe (#1510)
|
3 years ago |
Chunyang Wen
|
85ce85dd5f
Remove redundant pass (#1509)
|
3 years ago |
Rana Ali Amjad
|
648f7bfa50
Bfloat16 zero2 (#1398)
|
3 years ago |
Reza Yazdani
|
2c5bba6dc1
Transformer kernel - fix unit test (#1503)
|
3 years ago |
Zhen Zhang
|
c0eeb69dfb
ZeRO3, improved parameter all-gather operation (#1188)
|
3 years ago |
Conglong Li
|
7f5a3addb6
update CL doc (#1506)
|
3 years ago |