Sven Mika
|
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652)
|
2 years ago |
Artur Niederfahrenhorst
|
d07e50e957
[RLlib] Replay buffer API (cleanups; docstrings; renames; move into `rllib/execution/buffers` dir) (#20552)
|
2 years ago |
Sven Mika
|
cf21c634a3
[RLlib] Fix deprecated warning for torch_ops.py (soft-replaced by torch_utils.py). (#19982)
|
3 years ago |
Sven Mika
|
2d24ef0d32
[RLlib] Add all simple learning tests as `framework=tf2`. (#19273)
|
3 years ago |
Sven Mika
|
1f0646f658
[RLlib] Issue 18418: SAC w/ dict space not working. (#19101)
|
3 years ago |
Sven Mika
|
b4300dd532
[RLlib] Issue 18812: Torch multi-GPU stats not protected against race conditions. (#18937)
|
3 years ago |
Sven Mika
|
ed85f59194
[RLlib] Unify all RLlib Trainer.train() -> results[info][learner][policy ID][learner_stats] and add structure tests. (#18879)
|
3 years ago |
Sven Mika
|
e3e6ed7aaa
[RLlib] Issues 17844, 18034: Fix n-step > 1 bug. (#18358)
|
3 years ago |
Sven Mika
|
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065)
|
3 years ago |
Sven Mika
|
4888d7c9af
[RLlib] Replay buffers: Add config option to store contents in checkpoints. (#17999)
|
3 years ago |
Sven Mika
|
924f11cd45
[RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371)
|
3 years ago |
Sven Mika
|
8a844ff840
[RLlib] Issues: 17397, 17425, 16715, 17174. When on driver, Torch|TFPolicy should not use `ray.get_gpu_ids()` (b/c no GPUs assigned by ray). (#17444)
|
3 years ago |
Julius Frost
|
d7a5ec1830
[RLlib] SAC tuple observation space fix (#17356)
|
3 years ago |
Sven Mika
|
90b21ce27e
[RLlib] De-flake 3 test cases; Fix `config.simple_optimizer` and `SampleBatch.is_training` warnings. (#17321)
|
3 years ago |
Sven Mika
|
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)
|
3 years ago |
Sven Mika
|
bc09e75b78
[RLlib] Fix 3 flakey test cases. (#15785)
|
3 years ago |
Sven Mika
|
bb8a286cbc
[RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684)
|
3 years ago |
Sven Mika
|
cecfc3b43b
[RLlib] Multi-GPU support for Torch algorithms. (#14709)
|
3 years ago |
Sven Mika
|
9c5a0cfd7a
[RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386)
|
3 years ago |
Raphael CHEN
|
93d4244d9c
[RLlib] Correctly get bytes size of SampleBatch (#14801)
|
3 years ago |
Sven Mika
|
ef944bc5f0
[RLlib] Re-enable placement group support for RLlib. (#14384)
|
3 years ago |
Richard Liaw
|
a2d2275ee1
Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360)
|
3 years ago |
Sven Mika
|
6cd0cd3bd9
[RLlib + Tune] Add placement group support to RLlib. (#14289)
|
3 years ago |
Sven Mika
|
37c7daa3c0
[RLlib] DDPG: Support simplex action space. (#14011)
|
3 years ago |
Sven Mika
|
52c94b7ee9
[RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522)
|
3 years ago |
Sven Mika
|
b7dbbfbf41
[RLlib] Issue 11591: SAC loss does not use PR-weights in critic loss term. (#12394)
|
3 years ago |
Sven Mika
|
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
4 years ago |
Sven Mika
|
291c172d83
[RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909)
|
4 years ago |
Sven Mika
|
d9f1874e34
[RLlib] Minor fixes (torch GPU bugs + some cleanup). (#11609)
|
4 years ago |
Sven Mika
|
f5e2cda68a
[RLlib] SAC: log_alpha not being learnt when on GPU. (#11298)
|
4 years ago |