Reza Yazdani
|
259936a76c
Fix cpu-adam AVX performance (#1637)
|
2 years ago |
Cheng Li
|
082f392a93
Add tensor methods in flops counting and separate macs and flops (#1591)
|
2 years ago |
Jeff Rasley
|
7f58853c2e
[testing] 3x faster unit tests (#1636)
|
2 years ago |
Jeff Rasley
|
1d295ff5f8
Refactor ZeRO naming to reduce confusion (#1607)
|
2 years ago |
Gary Miguel
|
07887f6630
sharded_moe: make top1gating ONNX-exportable (#1578)
|
2 years ago |
Victor
|
64c2946a23
use py-cpuinfo to detect SIMD_WIDTH in platform-independent way (#1616)
|
2 years ago |
Ammar Ahmad Awan
|
feb6afb049
remove print (#1626)
|
2 years ago |
Conglong Li
|
c6ace162c4
MoE for NLG tutorial (#1633)
|
2 years ago |
Jeff Rasley
|
88d26e0b8d
[docs] update readme (#1632)
|
2 years ago |
Jeff Rasley
|
df9e064d6b
[docs] add MoE NLG announcement to news
|
2 years ago |
Jeff Rasley
|
3ffeaa4999
MoE for NLG announcement (#1628)
|
2 years ago |
Olatunji Ruwase
|
91e15593ea
Control ds_report output (#1622)
|
2 years ago |
Jeff Rasley
|
3488b8cdd3
[readme] remove stats badge until PyPI is fixed
|
2 years ago |
Pierce Stegman
|
cda7c71895
Sparse Attention: Fix Triton errors (#1608)
|
2 years ago |
Jeff Rasley
|
4b854a37cb
[zero-3] set default device during zero.Init (#1605)
|
2 years ago |
Reza Yazdani
|
8e891aa568
Transformer kernel/fix layer norm (#1587)
|
2 years ago |
Alex Hedges
|
fc2f378ece
Improve pre-commit hooks (#1602)
|
2 years ago |
Jeff Rasley
|
8159c1bc5b
bump to 0.5.9
|
2 years ago |
Alex Hedges
|
9aa288d745
Remove unused import of ssl.OP_ENABLE_MIDDLEBOX_COMPAT (#1601)
|
2 years ago |
Jeff Rasley
|
a10e4811fe
force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598)
|
2 years ago |
Paige Wang
|
c3f1d82b07
Update engine.py (#1596)
|
2 years ago |
Stas Bekman
|
7a132a9f4b
port OVERFLOW log to ZeRO-2 (#1593)
|
2 years ago |
Mikhail Druzhinin
|
d14baad940
allreduce_always_fp16 (#1487)
|
2 years ago |
Jeff Rasley
|
52c7889b01
allow external control of gradient accumulation boundary (#1588)
|
2 years ago |
eltonzheng
|
51d42ab9ec
fix partition activations issue when mp=2 and pp=2 (#1589)
|
2 years ago |
Mikhail Druzhinin
|
499800caa8
Fix return code on error (#1540)
|
2 years ago |
Wenhao Hu
|
a637cc2cd5
Enable AVX256 on AMD CPU (#1360)
|
2 years ago |
alexandremuzio
|
1bc13fe83f
Removing `ImportError` from tutel import try/except (#1583)
|
2 years ago |
Chunyang Wen
|
e2b39ded9f
Replace brute force and add log (#1560)
|
2 years ago |
Manuel R. Ciosici
|
e1b4aa8f3b
Add documentation for TensorBoard logging (#1577)
|
2 years ago |