Quentin Anthony | 0411a9f871 | Expose Consecutive Hysteresis to Users (#3553) | 1 year ago
Olatunji Ruwase | 47f9f13bd3 | DeepSpeed Chat (#3186) | 1 year ago
Michael Wyatt | b361c72761 | Update DeepSpeed copyright license to Apache 2.0 (#3111) | 1 year ago
Jeff Rasley | 91d63e0228 | update formatter version and style settings (#3098) | 1 year ago
Quentin Anthony | ac2c9ffae4 | Improve loss overflow logs (#3008) | 1 year ago
Olatunji Ruwase | 80d8fcbdb3 | Improve overflow handling (#2944) | 1 year ago
Quentin Anthony | 44085856a8 | Add loss scale guard to avoid inf loop (#1958) | 2 years ago
Stas Bekman | 29853c3eed | less scary overflow notice (#833) | 3 years ago
Jeff Rasley | e5bbc2e559 | Sparse attn + ops/runtime refactor + v0.3.0 (#343) | 4 years ago