Commit History

Author SHA1 Message Commit Date
  Sven Mika d5bfb7b7da [RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) 2 years ago
  Artur Niederfahrenhorst d07e50e957 [RLlib] Replay buffer API (cleanups; docstrings; renames; move into `rllib/execution/buffers` dir) (#20552) 2 years ago
  Sven Mika cf21c634a3 [RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982) 3 years ago
  Sven Mika 2d24ef0d32 [RLlib] Add all simple learning tests as `framework=tf2`. (#19273) 3 years ago
  Sven Mika 1f0646f658 [RLlib] Issue 18418: SAC w/ dict space not working. (#19101) 3 years ago
  Sven Mika b4300dd532 [RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937) 3 years ago
  Sven Mika ed85f59194 [RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879) 3 years ago
  Sven Mika e3e6ed7aaa [RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358) 3 years ago
  Sven Mika 599e589481 [RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 3 years ago
  Sven Mika 4888d7c9af [RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999) 3 years ago
  Sven Mika 924f11cd45 [RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371) 3 years ago
  Sven Mika 8a844ff840 [RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use `ray.get_gpu_ids()` (b/c no GPUs assigned by ray). (#17444) 3 years ago
  Julius Frost d7a5ec1830 [RLlib] SAC tuple observation space fix (#17356) 3 years ago
  Sven Mika 90b21ce27e [RLlib] De-flake 3 test cases; Fix `config.simple_optimizer` and `SampleBatch.is_training` warnings. (#17321) 3 years ago
  Sven Mika 5a313ba3d6 [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 3 years ago
  Sven Mika bc09e75b78 [RLlib] Fix 3 flakey test cases. (#15785) 3 years ago
  Sven Mika bb8a286cbc [RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684) 3 years ago
  Sven Mika cecfc3b43b [RLlib] Multi-GPU support for Torch algorithms. (#14709) 3 years ago
  Sven Mika 9c5a0cfd7a [RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386) 3 years ago
  Raphael CHEN 93d4244d9c [RLlib] Correctly get bytes size of SampleBatch (#14801) 3 years ago
  Sven Mika ef944bc5f0 [RLlib] Re-enable placement group support for RLlib. (#14384) 3 years ago
  Richard Liaw a2d2275ee1 Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360) 3 years ago
  Sven Mika 6cd0cd3bd9 [RLlib + Tune] Add placement group support to RLlib. (#14289) 3 years ago
  Sven Mika 37c7daa3c0 [RLlib] DDPG: Support simplex action space. (#14011) 3 years ago
  Sven Mika 52c94b7ee9 [RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) 3 years ago
  Sven Mika b7dbbfbf41 [RLlib] Issue 11591: SAC loss does not use PR-weights in critic loss term. (#12394) 3 years ago
  Sven Mika 62c7ab5182 [RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 4 years ago
  Sven Mika 291c172d83 [RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909) 4 years ago
  Sven Mika d9f1874e34 [RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609) 4 years ago
  Sven Mika f5e2cda68a [RLlib] SAC: log_alpha not being learnt when on GPU. (#11298) 4 years ago