Commit History

| Author | SHA1 | Message | Date |
|---|---|---|---|
| Sven Mika | a931076f59 | [RLlib] Tf2 + eager-tracing same speed as framework=tf; Add more test coverage for tf2+tracing. (#19981) | 3 years ago |
| Sven Mika | 0b308719f8 | [RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) | 3 years ago |
| Sven Mika | f2cb2ed203 | [RLlib; Docs overhaul] Docstring cleanup: Policies, policy_templates. (#19759) | 3 years ago |
| Sven Mika | b213565783 | [RLlib] Fix failing test cases: Soft-deprecate ModelV2.from_batch (in favor of ModelV2.__call__). (#19693) | 3 years ago |
| Sven Mika | 61a1274619 | [RLlib] No Preprocessors (part 2). (#18468) | 3 years ago |
| Sven Mika | 698b4eeed3 | [RLlib] POC: Separate losses for APPO/IMPALA. Enable TFPolicy to handle multiple optimizers/losses (like TorchPolicy). (#18669) | 3 years ago |
| Sven Mika | 9883505e84 | [RLlib] Add [LSTM=True + multi-GPU]-tests to nightly RLlib testing suite (for all algos supporting RNNs, except R2D2, RNNSAC, and DDPPO). (#18017) | 3 years ago |
| Sven Mika | 494ddd98c1 | [RLlib] Replace "seq_lens" w/ SampleBatch.SEQ_LENS. (#17928) | 3 years ago |
| Sven Mika | a428f10ebe | [RLlib] Add multi-GPU learning tests to nightly. (#17778) | 3 years ago |
| Sven Mika | 924f11cd45 | [RLlib] Torch algos use now-framework-agnostic MultiGPUTrainOneStep execution op (~33% speedup for PPO-torch + GPU). (#17371) | 3 years ago |
| Sven Mika | 90b21ce27e | [RLlib] De-flake 3 test cases; Fix `config.simple_optimizer` and `SampleBatch.is_training` warnings. (#17321) | 3 years ago |
| Chris Bamford | 29768a7c01 | [RLLib] (P1 regression) Fixing view requirements in compute actions (#15856) | 3 years ago |
| Sven Mika | 5a313ba3d6 | [RLlib] Refactor: All tf static graph code should reside inside Policy class. (#17169) | 3 years ago |
| Sven Mika | 18d173b172 | [RLlib] Implement policy_maps (multi-agent case) in RolloutWorkers as LRU caches. (#17031) | 3 years ago |
| Sven Mika | be6db06485 | [RLlib] Re-do: Trainer: Support add and delete Policies. (#16569) | 3 years ago |
| Sven Mika | e973b726c2 | [RLlib] Support native tf.keras.Models (part 2) - Default keras models for Vision/RNN/Attention. (#15273) | 3 years ago |
| Sven Mika | bb8a286cbc | [RLlib] Support native tf.keras.Model (milestone toward obsoleting ModelV2 class). (#14684) | 3 years ago |
| Sven Mika | 9c5a0cfd7a | [RLlib] Issue 14385: `Policy.compute_actions_from_input_dict` does not properly track accessed fields for Policy's view requirements. (#14386) | 3 years ago |
| Sven Mika | 4f66309e19 | [RLlib] Redo issue 14533 tf enable eager exec (#14984) | 3 years ago |
| SangBin Cho | fa5f961d5e | Revert "[RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737)" (#14918) | 3 years ago |
| Sven Mika | 3e389d5812 | [RLlib] Issue 14533: `tf.enable_eager_execution()` must be called at beginning. (#14737) | 3 years ago |
| Sven Mika | 04bc0a9828 | [RLlib] Remove all non-trajectory view API code. (#14860) | 3 years ago |
| Sven Mika | 69202c6a7d | [RLlib] Obsolete usage tracking dict via sample batch. (#13065) | 3 years ago |
| Sven Mika | ee4b6e7e3b | [RLlib] Unity3D example broken due to change in ML-Agents API. Attention-net prev-n-a/r. Attention-wrapper works with images. (#14569) | 3 years ago |
| Sven Mika | 732197e23a | [RLlib] Multi-GPU for tf-DQN/PG/A2C. (#13393) | 3 years ago |
| Sven Mika | 8000258333 | [RLlib] R2D2 Implementation. (#13933) | 3 years ago |
| Sven Mika | 4db86404ad | [RLlib] Issue #13507: Fix MB-MPO CartPole Env's reward function as well as MB-MPO running into a traj. view API related issue. (#14037) | 3 years ago |
| Sven Mika | d7301a51f4 | [RLlib]: Trajectory View API: Keep env infos (e.g. for postprocessing callbacks), no matter what. (#13555) | 3 years ago |
| Sven Mika | 391cdfae8c | [RLlib] Trajectory view API docs. (#12718) | 3 years ago |
| Sven Mika | b2bcab711d | [RLlib] Attention Nets: tf (#12753) | 3 years ago |