TongXU
|
d8ed3ce445
remove the print line in _einsum_flops_compute function (#1885)
|
2 years ago |
Stas Bekman
|
c487372257
add now required `-lcurand` to solve `undefined symbol: curandCreateGenerator` (#1879)
|
2 years ago |
Jeff Rasley
|
0149bd4fc2
[docs] fix dead links (#1877)
|
2 years ago |
Samyam Rajbhandari
|
c13457b756
Supporting multiple modules injection with a single policy when they have identical architectures (#1869)
|
2 years ago |
Jeff Rasley
|
4a356d0896
[docs] add moe paper (#1875)
|
2 years ago |
Jeff Rasley
|
b9e1529678
[docs] add amd blog to website (#1874)
|
2 years ago |
Olatunji Ruwase
|
d6db651052
Fix broken links (#1873)
|
2 years ago |
matherit
|
b47e25bf95
Add support for AWS SageMaker. (#1868)
|
2 years ago |
Blaine Rogers
|
73b5d9833a
Fix setup.py crash when torch is not installed. (#1866)
|
2 years ago |
Jeff Rasley
|
398f06035a
bump to 0.6.2
|
2 years ago |
Samyam Rajbhandari
|
ebbcfd5273
qkv_out can be a single tensor or a list. Handling these cases separately. (#1850)
|
2 years ago |
Karthikeyan Singaravelan
|
c7af747ce0
Import ABC from collections.abc for Python 3.10 compatibility. (#1851)
|
2 years ago |
Jeff Rasley
|
208d45bbf7
fix dead MoQ link (#1855)
|
2 years ago |
Sayed Hadi Hashemi
|
b61d7199c4
Update config-json.md (#1853)
|
2 years ago |
Ammar Ahmad Awan
|
788e1c40e8
deepscale --> deepspeed in prints. (#1854)
|
2 years ago |
Jeff Rasley
|
0eb2c763b4
add amd release blog (#1848)
|
2 years ago |
Michael Wyatt
|
2e1847d6c8
Add concurrency policy to CI workflow (#1844)
|
2 years ago |
Olatunji Ruwase
|
b84edef23f
Track only trainable parameters (#1780)
|
2 years ago |
shjwudp
|
1e61c7a860
fix: Fix undefined variable in _create_expert_data_and_model_parallel and make it easier to understand (#1826)
|
2 years ago |
shjwudp
|
5fb4256a7a
fix: fix undefined variable in MoE top2gating (#1827)
|
2 years ago |
Ramya Ramineni
|
b4e8f18c27
THCGeneral.h header file is deprecated (#1842)
|
2 years ago |
Jeff Rasley
|
28434c0026
[ZeRO-1] fix bug w. cpu-offload + > 1 GPU (#1841)
|
2 years ago |
Stas Bekman
|
18ea8b7904
[build] support cuda-11.6 (#1836)
|
2 years ago |
Michael Wyatt
|
41d90830e2
Split github action unit tests by platform/GPU (#1828)
|
2 years ago |
Jeff Rasley
|
a773996d97
[launcher] validate passwordless-ssh works when using hostfile launching (#1832)
|
2 years ago |
Olatunji Ruwase
|
89b8c8872b
Reduce specified gradients (#1818)
|
2 years ago |
Cheng Li
|
8a35daf061
Tune webpage width (#1829)
|
2 years ago |
Jeff Rasley
|
9bc296e15b
[website] remove extra news header
|
2 years ago |
Cheng Li
|
908d616072
Website posts and tutorial improvements (#1799)
|
2 years ago |
Ramya Ramineni
|
7bcb4fabeb
Enable CG headers on ROCm (#1821)
|
2 years ago |