Reza Yazdani
|
8e891aa568
Transformer kernel/fix layer norm (#1587)
|
2 years ago |
Alex Hedges
|
fc2f378ece
Improve pre-commit hooks (#1602)
|
2 years ago |
Jeff Rasley
|
8159c1bc5b
bump to 0.5.9
|
2 years ago |
Alex Hedges
|
9aa288d745
Remove unused import of ssl.OP_ENABLE_MIDDLEBOX_COMPAT (#1601)
|
2 years ago |
Jeff Rasley
|
a10e4811fe
force set lf instead of crlf (https://github.com/pre-commit/pre-commit-hooks#mixed-line-ending) (#1598)
|
2 years ago |
Paige Wang
|
c3f1d82b07
Update engine.py (#1596)
|
2 years ago |
Stas Bekman
|
7a132a9f4b
port OVERFLOW log to ZeRO-2 (#1593)
|
2 years ago |
Mikhail Druzhinin
|
d14baad940
allreduce_always_fp16 (#1487)
|
2 years ago |
Jeff Rasley
|
52c7889b01
allow external control of gradient accumulation boundary (#1588)
|
2 years ago |
eltonzheng
|
51d42ab9ec
fix partition activations issue when mp=2 and pp=2 (#1589)
|
2 years ago |
Mikhail Druzhinin
|
499800caa8
Fix return code on error (#1540)
|
2 years ago |
Wenhao Hu
|
a637cc2cd5
Enable AVX256 on AMD CPU (#1360)
|
2 years ago |
alexandremuzio
|
1bc13fe83f
Removing `ImportError` from tutel import try/except (#1583)
|
2 years ago |
Chunyang Wen
|
e2b39ded9f
Replace brute force and add log (#1560)
|
2 years ago |
Manuel R. Ciosici
|
e1b4aa8f3b
Add documentation for TensorBoard logging (#1577)
|
2 years ago |
Stas Bekman
|
bcf2bdde89
remove debug prints (#1585)
|
2 years ago |
Jeff Rasley
|
8220674d86
bump to 0.5.8
|
2 years ago |
Jeff Rasley
|
a8a17f234a
Several fixes for our read-the-docs build (#1579)
|
2 years ago |
Jeff Rasley
|
2332cb31a7
Enables ZeRO-3 inference (#1514)
|
2 years ago |
Stas Bekman
|
74baf5bbb9
[CI] transformers@master has been fixed (#1573)
|
2 years ago |
Jeff Rasley
|
236890d6f3
switch bin files to use python3 instead of python (#1185)
|
2 years ago |
James Reed
|
fafc827d64
Render docs for pipe.ProcessTopology (#1505)
|
2 years ago |
Jeff Rasley
|
a90497ecff
Remove hard tensorboardX requirement (#1571)
|
2 years ago |
Stas Bekman
|
e3c2d7b16f
[launcher/runner] respect CUDA_VISIBLE_DEVICES for a single node (#960)
|
2 years ago |
Jeff Rasley
|
938449e34a
[autotuning] guard tabulate package import (#1569)
|
2 years ago |
Aswin John Mathews
|
4a0b1032cb
Enforce nccl/rccl alignment of start location of each shard (#1564)
|
2 years ago |
Jeff Rasley
|
4625add689
bump DSE commit
|
2 years ago |
Jeff Rasley
|
da7bff40d0
set hf hash (#1568)
|
2 years ago |
Mikhail Druzhinin
|
4bf4ab7ac5
Fix partial recovery of sparse_tensor_module_names and dynamically check if gradient data is sparse (#1562)
|
2 years ago |
Cheng Li
|
bda3d0e6b9
Add autotuning news post (#1565)
|
2 years ago |