Jun Gong | 2317c693cf | [RLlib] Use SampleBrach instead of input dict whenever possible (#20746) | 2 years ago
gjoliver | 99a0088233 | [RLlib] Unify the way we create local replay buffer for all agents (#19627) | 3 years ago
gjoliver | c3c42278e4 | [RLlib] clean up all the SampleBatch['is_training'] deprecation warnings (#19652) | 3 years ago
Sven Mika | 1f0646f658 | [RLlib] Issue 18418: SAC w/ dict space not working. (#19101) | 3 years ago
Sven Mika | 924f11cd45 | [RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371) | 3 years ago
Julius Frost | d7a5ec1830 | [RLlib] SAC tuple observation space fix (#17356) | 3 years ago
Sven Mika | eb0038612f | [RLlib] Extend on_learn_on_batch callback to allow for custom metrics to be added. (#13584) | 3 years ago
Sven Mika | 52c94b7ee9 | [RLlib] Allow SAC to use custom models as Q- or policy nets and deprecate "state-preprocessor" for image spaces. (#13522) | 3 years ago
Sven Mika | 8726521604 | [RLlib] JAXPolicy prep PR #2 (move get_activation_fn (backward-compatibly), minor fixes and preparations). (#13091) | 3 years ago
Sven Mika | 291c172d83 | [RLlib] Support Simplex action spaces for SAC (torch and tf). (#11909) | 4 years ago
Sven Mika | f5e2cda68a | [RLlib] SAC: log_alpha not being learnt when on GPU. (#11298) | 4 years ago
Sven Mika | 805dad3bc4 | [RLlib] SAC algo cleanup. (#10825) | 4 years ago
Sven Mika | 0ba7472da9 | [Testing] Fix LINT/sphinx errors. (#8874) | 4 years ago
Sven Mika | 428516056a | [RLlib] SAC Torch (incl. Atari learning) (#7984) | 4 years ago