Sven Mika | d5bfb7b7da | [RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) | 2 years ago
Sven Mika | c4636c7c05 | [RLlib] Issue 21633: SimpleQ should not use a prio. replay buffer. (#21665) | 2 years ago
Jun Gong | 7517aefe05 | [RLlib] Bring back BC and Marwil learning tests. (#21574) | 2 years ago
Sven Mika | 90c6b10498 | [RLlib] Decentralized multi-agent learning; PR #01 (#21421) | 2 years ago
Sven Mika | 188324c5c7 | [RLlib] Issue 21552: `unsquash_action` and `clip_action` (when None) cause wrong actions computed by `Trainer.compute_single_action`. (#21553) | 2 years ago
Matti Picus | ec6a33b736 | [tune] fixes to allow tune/tests/test_commands.py to run on windows (#21342) | 2 years ago
Sven Mika | f94bd99ce4 | [RLlib] Issue 21044: Improve error message for "multiagent" dict checks. (#21448) | 2 years ago
Sven Mika | 92f030331e | [RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420) | 2 years ago
Sven Mika | 853d10871c | [RLlib] Issue 18499: PGTrainer with training_iteration fn does not support multi-GPU. (#21376) | 2 years ago
Sven Mika | 9e6b871739 | [RLlib] Better utils for flattening complex inputs and enable prev-actions for LSTM/attention for complex action spaces. (#21330) | 2 years ago
Sven Mika | 62dbf26394 | [RLlib] POC: Run PGTrainer w/o the distr. exec API (Trainer's new training_iteration method). (#20984) | 2 years ago
Jun Gong | 767f78eaf8 | [RLlib] Always attach latest eval metrics. (#21011) | 2 years ago
Sven Mika | db058d0fb3 | [RLlib] Rename `metrics_smoothing_episodes` into `metrics_num_episodes_for_smoothing` for clarity. (#20983) | 2 years ago
Sven Mika | 596c8e2772 | [RLlib] Experimental no-flatten option for actions/prev-actions. (#20918) | 2 years ago
Sven Mika | f814c2af89 | [RLlib; Docs] Docs API reference pages: `rllib/execution`, `rllib/evaluation`, `rllib/models`, `rllib/offline`. (#20538) | 2 years ago
Sven Mika | b4790900f5 | [RLlib] Sub-class `Trainer` (instead of `build_trainer()`): All remaining classes; soft-deprecate `build_trainer`. (#20725) | 2 years ago
Sven Mika | 60b2219d72 | [RLlib] Allow for evaluation to run by `timesteps` (alternative to `episodes`) and add auto-setting to make sure train doesn't ever have to wait for eval (e.g. long episodes) to finish. (#20757) | 2 years ago
Sven Mika | 9e38f6f613 | [RLlib] Trainer sub-class DDPG/TD3/APEX-DDPG (instead of `build_trainer`). (#20636) | 2 years ago
Sven Mika | 3d2e27485b | [RLlib] Trainer sub-class DQN/SimpleQ/APEX-DQN/R2D2 (instead of using `build_trainer`). (#20633) | 2 years ago
Sven Mika | c07d8c4c22 | [RLlib] Trainer sub-class A2C/A3C (instead of `build_trainer`). (#20635) | 2 years ago
Sven Mika | 49cd7ea6f9 | [RLlib] Trainer sub-class PPO/DDPPO (instead of `build_trainer()`). (#20571) | 2 years ago
Sven Mika | 9d2fe5756c | [RLlib] Trainer sub-class for APPO (instead of using `build_trainer()`). (#20424) | 2 years ago
Artur Niederfahrenhorst | d07e50e957 | [RLlib] Replay buffer API (cleanups; docstrings; renames; move into `rllib/execution/buffers` dir) (#20552) | 2 years ago
Sven Mika | 56619b955e | [RLlib; Documentation] Some docstring cleanups; Rename RemoteVectorEnv into RemoteBaseEnv for clarity. (#20250) | 2 years ago
Avnish Narayan | dc17f0a241 | Add error messages for missing tf and torch imports (#20205) | 2 years ago
Kai Fricke | 3e6ba5d6d2 | Revert "Revert [RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`." (#20285) | 2 years ago
Kai Fricke | 246787cdd9 | Revert "[RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055)" (#20284) | 2 years ago
Sven Mika | 6f85af435f | [RLlib] POC: `PGTrainer` class that works by sub-classing, not `trainer_template.py`. (#20055) | 2 years ago
Sven Mika | ebd56b57db | [RLlib; documentation] "RLlib in 60sec" overhaul. (#20215) | 2 years ago
Kai Fricke | 9c2b8c8501 | [tune] Deprecate DurableTrainable (#19880) | 2 years ago