Commit History

| Author | SHA1 | Message | Commit Date |
| --- | --- | --- | --- |
| Joe Mayer | 18713c6838 | Updating API docs (#2586) | 1 year ago |
| Siddharth Singh | 5fe9d61065 | Tensor parallelism for Mixture of Experts (#2074) | 2 years ago |
| Alex Hedges | 316c4a43e0 | Add flake8 to pre-commit checks (#2051) | 2 years ago |
| Jianfeng Liu | b4513f6310 | fix softmax dim of Residual MoE in moe/layer.py (#2110) | 2 years ago |
| Karim Foda | 735406e536 | fix import errors (#2026) | 2 years ago |
| Ammar Ahmad Awan | 36ad3119d5 | DeepSpeed comm backend v1 (#1985) | 2 years ago |
| shjwudp | 1e61c7a860 | fix: Fix undefined variable in _create_expert_data_and_model_parallel and make it easier to understand (#1826) | 2 years ago |
| Ammar Ahmad Awan | c0af6d90f7 | Refactor MoE and Groups API to simplify model creation and mangement (#1798) | 2 years ago |
| Jeff Rasley | e46d808a1b | MoE inference + PR-MoE model support (#1705) | 2 years ago |
| alexandremuzio | 2887349cd4 | Adding Tutel to MoE layer (#1528) | 3 years ago |
| Ammar Ahmad Awan | 56635d5b6c | enable/disable moe token dropping. (#1492) | 3 years ago |
| Ammar Ahmad Awan | 9f5939d2a7 | Remove dropout as client code can do it independently. (#1354) | 3 years ago |
| Jeff Rasley | 9cb64a1fc5 | MoE read the docs update (#1312) | 3 years ago |
| Ammar Ahmad Awan | f28432441b | DeepSpeed MoE (#1310) | 3 years ago |