Commit History

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Sven Mika | 2d24ef0d32 | [RLlib] Add all simple learning tests as `framework=tf2`. (#19273) | 3 years ago |
| Sven Mika | 0b308719f8 | [RLlib; Docs overhaul] Docstring cleanup: rllib/utils (#19829) | 3 years ago |
| Sven Mika | 59f796edf3 | [RLlib] Fix crash when using StochasticSampling exploration (most PG-style algos) w/ tf and numpy > 1.19.5 (#18366) | 3 years ago |
| Sven Mika | 3013d9b341 | [RLlib] Fix "Cannot convert a symbolic Tensor (default_policy/strided_slice_3:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported" (#17587) | 3 years ago |
| Sven Mika | d0014cd351 | [RLlib] Policies get/set_state fixes and enhancements. (#16354) | 3 years ago |
| Sven Mika | 8ea1bc5ff9 | [RLlib] Allow for more than 2^31 policy timesteps. (#11301) | 4 years ago |
| Sven Mika | 199e5d0f75 | [RLlib] Exploration class type annotations. (#11251) | 4 years ago |
| Sven Mika | ce96b03b07 | [RLlib] MB-MPO cleanup (comments, docstrings, type annotations). (#11033) | 4 years ago |
| Sven Mika | c17169dc11 | [RLlib] Fix all example scripts to run on GPUs. (#11105) | 4 years ago |
| Barak Michener | 8e76796fd0 | ci: Redo `format.sh --all` script & backfill lint fixes (#9956) | 4 years ago |
| Michael Luo | 4d7bd8c892 | [RLlib] Implementation of "Model-based Meta Policy Optimization" (MB MPO) (#9409) | 4 years ago |
| Sven Mika | ff9c1dac88 | [RLlib] Issue 9667 DDPG Torch bugs and enhancements. (#9680) | 4 years ago |
| Sven Mika | fcdf410ae1 | [RLlib] Tf2.x native. (#8752) | 4 years ago |
| Sven Mika | 4da0e542d5 | [RLlib] DDPG and SAC eager support (preparation for tf2.x) (#9204) | 4 years ago |
| Sven Mika | 43043ee4d5 | [RLlib] Tf2x preparation; part 2 (upgrading `try_import_tf()`). (#9136) | 4 years ago |
| Sven Mika | 6c2b9a4cfa | [RLlib] Remove tf.py_function from all Schedule classes (not differentiable and causes other bugs in MA setups). (#8304) | 4 years ago |
| Sven Mika | 76e1a4df9e | Fix TD3 torch via GaussianNoise torch bug. (#8276) | 4 years ago |
| Sven Mika | 428516056a | [RLlib] SAC Torch (incl. Atari learning) (#7984) | 4 years ago |
| Sven Mika | 1b31c11806 | [RLlib] DDPG re-factor to fit into RLlib's functional algorithm builder API. (#7934) | 4 years ago |
| Sven Mika | e153e3179f | [RLlib] Exploration API: Policy changes needed for forward pass noisifications. (#7798) | 4 years ago |
| Eric Liang | 596b39e36a | [rllib] Make timestep a required arg for exploration classes (#7380) | 4 years ago |
| Sven Mika | 83e06cd30a | [RLlib] DDPG refactor and Exploration API action noise classes. (#7314) | 4 years ago |