.. _utils-reference-docs:

RLlib Utilities
===============

This page lists the utilities available in RLlib.

Exploration API
---------------

Exploration is crucial in RL for enabling a learning agent to find new, potentially high-reward states by reaching unexplored areas of the environment.

RLlib has several built-in exploration components that
the different algorithms use. You can also customize an algorithm's exploration
behavior by subclassing the ``Exploration`` base class and implementing
your own logic.
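
As a rough illustration of what an exploration component does, the sketch below
implements epsilon-greedy action selection with a linearly annealed epsilon in
plain Python. ``EpsilonGreedySketch`` and its parameters are hypothetical names
chosen for this example. The class doesn't subclass ``Exploration``, uses a
simplified method signature, and isn't RLlib's implementation.

.. code-block:: python

   import random


   class EpsilonGreedySketch:
       """Hypothetical, framework-free epsilon-greedy exploration."""

       def __init__(self, initial_epsilon=1.0, final_epsilon=0.05,
                    epsilon_timesteps=10_000, seed=None):
           self.initial_epsilon = initial_epsilon
           self.final_epsilon = final_epsilon
           self.epsilon_timesteps = epsilon_timesteps
           self._rng = random.Random(seed)

       def epsilon(self, timestep):
           # Linearly anneal epsilon, then hold it at the final value.
           fraction = min(timestep / self.epsilon_timesteps, 1.0)
           return self.initial_epsilon + fraction * (
               self.final_epsilon - self.initial_epsilon
           )

       def get_exploration_action(self, q_values, timestep, explore=True):
           # With probability epsilon pick a random action, else the greedy one.
           if explore and self._rng.random() < self.epsilon(timestep):
               return self._rng.randrange(len(q_values))
           return max(range(len(q_values)), key=lambda i: q_values[i])

Calling ``get_exploration_action`` with ``explore=False`` always returns the
greedy action, which is typically the behavior you want during evaluation.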

Built-in Exploration components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.exploration

.. autosummary::
   :toctree: doc/

   ~exploration.Exploration
   ~random.Random
   ~stochastic_sampling.StochasticSampling
   ~epsilon_greedy.EpsilonGreedy
   ~gaussian_noise.GaussianNoise
   ~ornstein_uhlenbeck_noise.OrnsteinUhlenbeckNoise
   ~random_encoder.RE3
   ~curiosity.Curiosity
   ~parameter_noise.ParameterNoise


Inference
~~~~~~~~~

.. autosummary::
   :toctree: doc/

   ~exploration.Exploration.get_exploration_action

Callback hooks
~~~~~~~~~~~~~~

.. autosummary::
   :toctree: doc/

   ~exploration.Exploration.before_compute_actions
   ~exploration.Exploration.on_episode_start
   ~exploration.Exploration.on_episode_end
   ~exploration.Exploration.postprocess_trajectory


Setting and getting states
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autosummary::
   :toctree: doc/

   ~exploration.Exploration.get_state
   ~exploration.Exploration.set_state
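
The state methods make it possible to checkpoint an exploration component and
restore it later. The sketch below shows that round trip with a hypothetical
``CountingExploration`` class. The class, its ``step`` method, and the dict
layout are assumptions made for this example, not RLlib's state format.

.. code-block:: python

   class CountingExploration:
       """Hypothetical component that remembers the last timestep it saw."""

       def __init__(self):
           self.last_timestep = 0

       def step(self, timestep):
           self.last_timestep = timestep

       def get_state(self):
           # Return a plain dict so the state is easy to serialize.
           return {"last_timestep": self.last_timestep}

       def set_state(self, state):
           # Restore the values previously returned by get_state().
           self.last_timestep = state["last_timestep"]


   source = CountingExploration()
   source.step(1_500)

   # Copy the state into a fresh instance, as a checkpoint restore would.
   restored = CountingExploration()
   restored.set_state(source.get_state())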


Scheduler API
-------------

A scheduler computes the value of a variable (in Python, PyTorch, or
TensorFlow) as a function of an int64 timestep input. The computed values are
usually of type float32.
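
To make the timestep-to-value mapping concrete, here is a minimal plain-Python
sketch of a linear and a piecewise-linear schedule. The function names and
parameter names are chosen for this example and the code isn't RLlib's
implementation of ``LinearSchedule`` or ``PiecewiseSchedule``.

.. code-block:: python

   def linear_schedule(t, schedule_timesteps, initial_p=1.0, final_p=0.0):
       """Anneal from initial_p to final_p, then hold final_p."""
       fraction = min(t / schedule_timesteps, 1.0)
       return initial_p + fraction * (final_p - initial_p)


   def piecewise_schedule(t, endpoints, outside_value):
       """Linearly interpolate between sorted (timestep, value) endpoints."""
       for (t0, v0), (t1, v1) in zip(endpoints[:-1], endpoints[1:]):
           if t0 <= t < t1:
               alpha = (t - t0) / (t1 - t0)
               return v0 + alpha * (v1 - v0)
       # t lies outside every interval.
       return outside_value

For example, ``linear_schedule(50, 100)`` is halfway between ``initial_p`` and
``final_p``, and ``piecewise_schedule`` falls back to ``outside_value`` for any
timestep that isn't covered by the endpoints.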


Built-in Scheduler components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.schedules

.. autosummary::
   :toctree: doc/

   ~schedule.Schedule
   ~constant_schedule.ConstantSchedule
   ~linear_schedule.LinearSchedule
   ~piecewise_schedule.PiecewiseSchedule
   ~exponential_schedule.ExponentialSchedule
   ~polynomial_schedule.PolynomialSchedule

Methods
~~~~~~~

.. autosummary::
   :toctree: doc/

   ~schedule.Schedule.value
   ~schedule.Schedule.__call__


.. _train-ops-docs:

Training Operations Utilities
-----------------------------

.. currentmodule:: ray.rllib.execution.train_ops

.. autosummary::
   :toctree: doc/

   ~multi_gpu_train_one_step
   ~train_one_step


Framework Utilities
-------------------

Import utilities
~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.framework

.. autosummary::
   :toctree: doc/

   ~try_import_torch
   ~try_import_tf
   ~try_import_tfp
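
These helpers let code depend on a deep learning framework only when that
framework is installed. The sketch below shows the general optional-import
pattern with a hypothetical ``try_import`` helper. It illustrates the idea only
and doesn't reproduce the signatures or return values of the functions listed
above.

.. code-block:: python

   import importlib


   def try_import(module_name, error=False):
       """Return the imported module, or None if it isn't installed."""
       try:
           return importlib.import_module(module_name)
       except ImportError:
           if error:
               # The caller asked for a hard failure.
               raise
           return None


   json_module = try_import("json")  # Part of the standard library.
   missing = try_import("surely_not_an_installed_module_xyz")

Callers can then branch on ``missing is None`` instead of wrapping every use of
the optional dependency in its own ``try`` block.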


TensorFlow utilities
~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.tf_utils

.. autosummary::
   :toctree: doc/

   ~explained_variance
   ~flatten_inputs_to_1d_tensor
   ~get_gpu_devices
   ~get_placeholder
   ~huber_loss
   ~l2_loss
   ~make_tf_callable
   ~minimize_and_clip
   ~one_hot
   ~reduce_mean_ignore_inf
   ~scope_vars
   ~warn_if_infinite_kl_divergence
   ~zero_logps_from_actions


Torch utilities
~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.torch_utils


.. autosummary::
   :toctree: doc/

   ~apply_grad_clipping
   ~concat_multi_gpu_td_errors
   ~convert_to_torch_tensor
   ~explained_variance
   ~flatten_inputs_to_1d_tensor
   ~get_device
   ~global_norm
   ~huber_loss
   ~l2_loss
   ~minimize_and_clip
   ~one_hot
   ~reduce_mean_ignore_inf
   ~sequence_mask
   ~warn_if_infinite_kl_divergence
   ~set_torch_seed
   ~softmax_cross_entropy_with_logits


NumPy utilities
~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.numpy

.. autosummary::
   :toctree: doc/

   ~aligned_array
   ~concat_aligned
   ~convert_to_numpy
   ~fc
   ~flatten_inputs_to_1d_tensor
   ~make_action_immutable
   ~huber_loss
   ~l2_loss
   ~lstm
   ~one_hot
   ~relu
   ~sigmoid
   ~softmax
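
Several names, such as ``huber_loss`` and ``one_hot``, appear in more than one
of the framework lists above. As a reminder of the math behind a few of them,
here is a scalar, plain-Python sketch using the standard textbook definitions.
The signatures are simplified assumptions for this example. The RLlib functions
operate on arrays or tensors and may differ in arguments and defaults.

.. code-block:: python

   import math


   def huber_loss(x, delta=1.0):
       """Quadratic near zero, linear for |x| >= delta."""
       if abs(x) < delta:
           return 0.5 * x * x
       return delta * (abs(x) - 0.5 * delta)


   def one_hot(index, depth):
       """Return a list of length depth with a 1.0 at position index."""
       return [1.0 if i == index else 0.0 for i in range(depth)]


   def softmax(logits):
       """Normalize logits into probabilities that sum to 1."""
       # Subtract the maximum for numerical stability.
       shift = max(logits)
       exps = [math.exp(v - shift) for v in logits]
       total = sum(exps)
       return [e / total for e in exps]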