Commit History

Author SHA1 Message Date
  Sven Mika d5bfb7b7da [RLlib] Preparatory PR for multi-agent multi-GPU learner (alpha-star style) #03 (#21652) 2 years ago
  Sven Mika 92f030331e [RLlib] Initial code/comment cleanups in preparation for decentralized multi-agent learner. (#21420) 2 years ago
  Sven Mika bec719d823 [RLlib] Trainer sub-class IMPALA (instead of using `build_trainer()`). (#20570) 2 years ago
  Sven Mika 49cd7ea6f9 [RLlib] Trainer sub-class PPO/DDPPO (instead of `build_trainer()`). (#20571) 2 years ago
  Sven Mika e6ae08f416 [RLlib] Optionally don't drop last ts in v-trace calculations (APPO and IMPALA). (#19601) 3 years ago
  gjoliver 99a0088233 [RLlib] Unify the way we create local replay buffer for all agents (#19627) 3 years ago
  gjoliver 9226f9bddc [RLlib] Report timesteps_this_iter to Tune, so it can track/checkpoint/restore total timesteps trained. (#19264) 3 years ago
  Sven Mika 698b4eeed3 [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) 3 years ago
  Sven Mika 3803e796ff [RLlib] Multi-GPU learner thread (IMPALA) error messages/comments/code-cleanup. (#18540) 3 years ago
  Sven Mika ea4a22249c [RLlib] Add simple action-masking example script/env/model (tf and torch). (#18494) 3 years ago
  Sven Mika 599e589481 [RLlib] Move existing fake multi-GPU learning tests into separate buildkite job. (#18065) 3 years ago
  Sven Mika 5a313ba3d6 [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) 3 years ago
  Sven Mika 169ddabae7 [RLlib] Issue 15973: Trainer.with_updates(validate_config=...) behaves confusingly. (#16429) 3 years ago
  Sven Mika bdda73e2dd [RLlib] Torch multi-GPU bug fixes (discussion 1755). (#15421) 3 years ago
  Sven Mika c90de315e5 [RLlib] APEX returns incorrect default resources (PlacementGroupFactory) colocated missing replay actors. (#15295) 3 years ago
  Sven Mika ef944bc5f0 [RLlib] Re-enable placement group support for RLlib. (#14384) 3 years ago
  Eric Liang 9db000ff2c Auto report object store memory usage; remove some deprecated code (#14260) 3 years ago
  Richard Liaw a2d2275ee1 Revert "[RLlib + Tune] Add placement group support to RLlib. (#14289)" (#14360) 3 years ago
  Sven Mika 6cd0cd3bd9 [RLlib + Tune] Add placement group support to RLlib. (#14289) 3 years ago
  Sven Mika 2e3655e8a9 [RLlib] Issue 9071 A3C w/ RNN not working due to VF assuming no RNN. (#13238) 3 years ago
  Sven Mika e40b14d255 [RLlib] Batch-size for truncate_episode batch_mode should be configurable in agent-steps (rather than env-steps), if needed. (#12420) 3 years ago
  Sven Mika 19c8033df2 [RLlib] Fix most remaining RLlib algos for running with trajectory view API. (#12366) 3 years ago
  Sven Mika 62c7ab5182 [RLlib] Trajectory view API: Enable by default for PPO, IMPALA, PG, A3C (tf and torch). (#11747) 4 years ago
  Eric Liang 5acd3e66dd [rllib] Fix torch TD error, IMPALA LR updates (#9477) 4 years ago
  Sven Mika 2746fc0476 [RLlib] Auto-framework, retire `use_pytorch` in favor of `framework=...` (#8520) 4 years ago
  Eric Liang 9a83908c46 [rllib] Deprecate policy optimizers (#8345) 4 years ago
  Eric Liang 9d012626e5 [rllib] Distributed exec workflow for impala (#8321) 4 years ago
  Sven Mika 166bb5d690 [RLlib] IMPALA PyTorch (#8287) 4 years ago
  Sven Mika 499ad5fbe4 [RLlib] PyTorch version of APPO. (#8120) 4 years ago
  Eric Liang dd70720578 [rllib] Rename sample_batch_size => rollout_fragment_length (#7503) 4 years ago