Sven Mika
|
d5bfb7b7da
[RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652)
|
2 years ago |
Sven Mika
|
92f030331e
[RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420)
|
2 years ago |
Sven Mika
|
bec719d823
[RLlib] Trainer sub-class IMPALA (instead of using `build_trainer()`). (#20570)
|
2 years ago |
Sven Mika
|
49cd7ea6f9
[RLlib] Trainer sub-class PPO/DDPPO (instead of `build_trainer()`). (#20571)
|
2 years ago |
Sven Mika
|
e6ae08f416
[RLlib] Optionally don't drop last ts in v-trace calculations (APPO and IMPALA). (#19601)
|
3 years ago |
gjoliver
|
99a0088233
[RLlib] Unify the way we create local replay buffer for all agents (#19627)
|
3 years ago |
gjoliver
|
9226f9bddc
[RLlib] Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. (#19264)
|
3 years ago |
Sven Mika
|
698b4eeed3
[RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669)
|
3 years ago |
Sven Mika
|
3803e796ff
[RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. (#18540)
|
3 years ago |
Sven Mika
|
ea4a22249c
[RLlib] Add simple action-masking example script/env/model (tf and torch). (#18494)
|
3 years ago |
Sven Mika
|
599e589481
[RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065)
|
3 years ago |
Sven Mika
|
5a313ba3d6
[RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169)
|
3 years ago |
Sven Mika
|
169ddabae7
[RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429)
|
3 years ago |
Sven Mika
|
bdda73e2dd
[RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421)
|
3 years ago |
Sven Mika
|
c90de315e5
[RLlib] APEX returns incorrect default resources (PlacementGroupFactory) colocated missing replay actors. (#15295)
|
3 years ago |
Sven Mika
|
ef944bc5f0
[RLlib] Re-enable placement group support for RLlib. (#14384)
|
3 years ago |
Eric Liang
|
9db000ff2c
Auto report object store memory usage; remove some deprecated code (#14260)
|
3 years ago |
Richard Liaw
|
a2d2275ee1
Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360)
|
3 years ago |
Sven Mika
|
6cd0cd3bd9
[RLlib + Tune] Add placement group support to RLlib. (#14289)
|
3 years ago |
Sven Mika
|
2e3655e8a9
[RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238)
|
3 years ago |
Sven Mika
|
e40b14d255
[RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420)
|
3 years ago |
Sven Mika
|
19c8033df2
[RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366)
|
3 years ago |
Sven Mika
|
62c7ab5182
[RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747)
|
4 years ago |
Eric Liang
|
5acd3e66dd
[rllib] Fix torch TD error, IMPALA LR updates (#9477)
|
4 years ago |
Sven Mika
|
2746fc0476
[RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520)
|
4 years ago |
Eric Liang
|
9a83908c46
[rllib] Deprecate policy optimizers (#8345)
|
4 years ago |
Eric Liang
|
9d012626e5
[rllib] Distributed exec workflow for impala (#8321)
|
4 years ago |
Sven Mika
|
166bb5d690
[RLlib] IMPALA PyTorch (#8287)
|
4 years ago |
Sven Mika
|
499ad5fbe4
[RLlib] PyTorch version of APPO. (#8120)
|
4 years ago |
Eric Liang
|
dd70720578
[rllib] Rename sample_batch_size => rollout_fragment_length (#7503)
|
4 years ago |