Avnish Narayan | 43210e0190 | [RLlib] Change placement group strategy for learner (#36929) | 1 year ago
Sven Mika | e14c9b1da5 | [RLlib] Remove `vtrace_drop_last_ts` option and add proper vf bootstrapping to IMPALA and APPO. (#36013) | 1 year ago
Sven Mika | 827ab91741 | [RLlib] Replace remaining mentions of "trainer" by "algorithm". (#36557) | 1 year ago
Sven Mika | f1f714c69e | [RLlib] Learner API enhancements and cleanups (prep. for DreamerV3). (#35877) | 1 year ago
Avnish Narayan | 7d52c2fa69 | [RLlib] Make learner group with non blocking update return multiple results (#35858) | 1 year ago
Avnish Narayan | b81b1be339 | [RLlib] Make resource requests for multi gpu learners not request cpu IMPALA (#35679) | 1 year ago
Sven Mika | 0fd06ad6fd | [RLlib] Learner API (+DreamerV3 prep): `Learner.register_metrics` API, cleanup, etc.. (#35573) | 1 year ago
Sven Mika | 384ad04987 | [RLlib] APPO+new-stack (Atari benchmark) - Preparatory PR 04 - LearnerAPI changes/tf-tracing fixes. (#34959) | 1 year ago
Sven Mika | 78b58a959a | [RLlib] Learner API: Policies using RLModules (for sampler only) do not need loss/stats/mixins. (#34445) | 1 year ago
Sven Mika | adfdbbdfa2 | [RLlib] APPO+new-stack (Atari benchmark) - Preparatory PR 03 - PyTorch. (#34779) | 1 year ago
Sven Mika | e399fb8037 | [RLlib] APPO+new-stack (Atari benchmark) - Preparatory PR 02. (#34777) | 1 year ago
Sven Mika | 25a5bcb269 | [RLlib] Learner API: Fix and unify grad-clipping configs and behaviors. (#34464) | 1 year ago
Artur Niederfahrenhorst | 8a80839843 | [RLlib] Fix test backward compatibility test for RL Modules (#33857) | 1 year ago
Avnish Narayan | 10d12f8220 | [RLlib] APPO TF with RLModule and Learner API (#33310) | 1 year ago
Artur Niederfahrenhorst | f4a2044f0f | [RLlib] Implement Impala Torch in RLModule / Learner stack. (#33271) | 1 year ago
Avnish Narayan | 46bd490b88 | [RLlib] Move Learner Hp assignment to validate (#33392) | 1 year ago
Avnish Narayan | f697e118b8 | [RLlib] Add option for running multiple sgd iters for impala learner api (#33316) | 1 year ago
Artur Niederfahrenhorst | 8a9a176a24 | [RLlib] Remove all default config objects and rllib/agents (#33242) | 1 year ago
Sven Mika | ac2230fd28 | [RLlib] Forum 9306: No env on local worker (IMPALA). (#32359) | 1 year ago
kourosh hakhamaneshi | 28d4eda64d | [RLlib] RLModule -- remove from_model_config (#33102) | 1 year ago
kourosh hakhamaneshi | a892241ca7 | [RLlib] Fix test_repro_ppo learner when learner API is enabled (#32832) | 1 year ago
Avnish Narayan | aa274974df | [RLlib] IMPALA trainer tf (#32713) | 1 year ago
Artur Niederfahrenhorst | cacc982216 | [RLlib] Add sample timer to all algorithms' `training_step()` methods (where it's simple to add). (#32475) | 1 year ago
Sven Mika | 0cb8070450 | [RLlib] AlgorithmConfig objects supported by all (internally used) `Algorithm.default_resource_request()` methods. (#31958) | 1 year ago
Jun Gong | 4b8917084e | [RLlib] Fix worker state restoration. (#31644) | 1 year ago
Sven Mika | e0e75a81c6 | [RLlib] Issue 31174: Move all checks into AlgorithmConfig.validate() (even simple ones) to avoid errors when using tune hyperopt objects. (#31396) | 1 year ago
Sven Mika | a9667e7b9d | [RLlib] Fix flakey 100-policies LRU cache test. (#30823) | 1 year ago
Jun Gong | 76cb42c578 | [RLlib] Fault tolerant and elastic WorkerSets used across RLlib's algorithms (for sampling and evaluation). (#30118) | 1 year ago
Max Pumperla | 23f460b0fe | [RLlib] AlgorithmConfig docs (#29796) | 1 year ago
Sven Mika | 756321145d | [RLlib] Add metrics to IMPALA/APPO/PPO (prototype) to measure off-policy'ness for performed updates. (#29983) | 1 year ago