.. _utils-reference-docs:

RLlib Utilities
===============

Here is a list of all the utilities available in RLlib.

Exploration API
---------------

Exploration is crucial in reinforcement learning (RL): it lets a learning agent find new,
potentially high-reward states by reaching unexplored areas of the environment.

RLlib has several built-in exploration components that
the different algorithms use. You can also customize an algorithm's exploration
behavior by subclassing the ``Exploration`` base class and implementing
your own logic.

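
To make the idea of an exploration component concrete, here is a minimal, standard-library-only sketch of epsilon-greedy action selection. It is not RLlib's implementation: ``EpsilonGreedySketch`` is a hypothetical class, its constructor arguments are illustrative, and its ``get_exploration_action`` takes a plain list of Q-values, which is a simplified signature assumed for this example.

```python
import random


class EpsilonGreedySketch:
    """Hypothetical stand-in for illustration; not RLlib's EpsilonGreedy."""

    def __init__(self, initial_epsilon=1.0, final_epsilon=0.05,
                 epsilon_timesteps=100, seed=None):
        self.initial_epsilon = initial_epsilon
        self.final_epsilon = final_epsilon
        self.epsilon_timesteps = epsilon_timesteps
        # Private RNG so that results are reproducible for a given seed.
        self.rng = random.Random(seed)

    def epsilon(self, timestep):
        # Anneal linearly from initial_epsilon to final_epsilon, then hold.
        fraction = min(max(timestep / self.epsilon_timesteps, 0.0), 1.0)
        return self.initial_epsilon + fraction * (
            self.final_epsilon - self.initial_epsilon
        )

    def get_exploration_action(self, q_values, timestep, explore=True):
        # Index of the highest Q-value (the greedy action).
        greedy = max(range(len(q_values)), key=lambda i: q_values[i])
        # With probability epsilon, pick a uniformly random action instead.
        if explore and self.rng.random() < self.epsilon(timestep):
            return self.rng.randrange(len(q_values))
        return greedy


explorer = EpsilonGreedySketch(seed=0)
greedy_action = explorer.get_exploration_action(
    [0.1, 0.9, 0.3], timestep=0, explore=False
)
```

With ``explore=False`` the sketch always returns the greedy action. With ``explore=True`` it becomes less random as the timestep grows.
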
Built-in Exploration components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.exploration

.. autosummary::
    :toctree: doc/

    ~exploration.Exploration
    ~random.Random
    ~stochastic_sampling.StochasticSampling
    ~epsilon_greedy.EpsilonGreedy
    ~gaussian_noise.GaussianNoise
    ~ornstein_uhlenbeck_noise.OrnsteinUhlenbeckNoise
    ~random_encoder.RE3
    ~curiosity.Curiosity
    ~parameter_noise.ParameterNoise

Inference
~~~~~~~~~

.. autosummary::
    :toctree: doc/

    ~exploration.Exploration.get_exploration_action

Callback hooks
~~~~~~~~~~~~~~

.. autosummary::
    :toctree: doc/

    ~exploration.Exploration.before_compute_actions
    ~exploration.Exploration.on_episode_start
    ~exploration.Exploration.on_episode_end
    ~exploration.Exploration.postprocess_trajectory

Setting and getting states
~~~~~~~~~~~~~~~~~~~~~~~~~~

.. autosummary::
    :toctree: doc/

    ~exploration.Exploration.get_state
    ~exploration.Exploration.set_state

Scheduler API
-------------

A scheduler computes the value of a variable (in Python, PyTorch, or
TensorFlow) as a function of an ``int64`` timestep input. The computed values are
usually of type ``float32``.

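
As a rough illustration of what "a value computed from a timestep" means, the following standard-library sketch implements a linear and a piecewise-linear schedule as plain functions. The function names, parameter names, and the plain Python ``int``/``float`` types are assumptions made for this example; the sketch does not reproduce the RLlib classes listed in this section.

```python
def linear_schedule_sketch(t, schedule_timesteps, initial_p=1.0, final_p=0.0):
    """Interpolate linearly from initial_p to final_p, then hold final_p."""
    fraction = min(max(t / schedule_timesteps, 0.0), 1.0)
    return initial_p + fraction * (final_p - initial_p)


def piecewise_schedule_sketch(t, endpoints, outside_value=None):
    """Interpolate linearly between consecutive (timestep, value) endpoints.

    `endpoints` must be sorted by timestep. For a timestep outside every
    segment, return `outside_value`, or raise if none was given.
    """
    for (left_t, left_v), (right_t, right_v) in zip(endpoints[:-1], endpoints[1:]):
        if left_t <= t < right_t:
            alpha = (t - left_t) / (right_t - left_t)
            return left_v + alpha * (right_v - left_v)
    if outside_value is None:
        raise ValueError(f"timestep {t} is outside all segments")
    return outside_value


# Halfway through a 100-step decay from 1.0 to 0.0.
midpoint = linear_schedule_sketch(50, schedule_timesteps=100)

# Hold 1.0 for 10 steps, then decay to 0.1 over the next 10 steps.
endpoints = [(0, 1.0), (10, 1.0), (20, 0.1)]
late_value = piecewise_schedule_sketch(25, endpoints, outside_value=0.1)
```

A common use for such a function is to anneal a learning rate or an exploration epsilon over the course of training.
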
Built-in Scheduler components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.schedules

.. autosummary::
    :toctree: doc/

    ~schedule.Schedule
    ~constant_schedule.ConstantSchedule
    ~linear_schedule.LinearSchedule
    ~piecewise_schedule.PiecewiseSchedule
    ~exponential_schedule.ExponentialSchedule
    ~polynomial_schedule.PolynomialSchedule

Methods
~~~~~~~

.. autosummary::
    :toctree: doc/

    ~schedule.Schedule.value
    ~schedule.Schedule.__call__

.. _train-ops-docs:

Training Operations Utilities
-----------------------------

.. currentmodule:: ray.rllib.execution.train_ops

.. autosummary::
    :toctree: doc/

    ~multi_gpu_train_one_step
    ~train_one_step

Framework Utilities
-------------------

Import utilities
~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.framework

.. autosummary::
    :toctree: doc/

    ~try_import_torch
    ~try_import_tf
    ~try_import_tfp

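
The names of the import helpers above suggest an optional-import pattern, where code runs whether or not a given deep learning framework is installed. The sketch below shows that general pattern using only the standard library. ``try_import_module`` and its ``error`` flag are hypothetical and introduced for this example; the sketch makes no claim about the signatures or return values of the RLlib functions.

```python
import importlib


def try_import_module(module_name, error=False):
    """Return the imported module, or None if it is not installed.

    If `error` is True, re-raise the ImportError instead of returning None.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        if error:
            raise
        return None


# A module that ships with Python is found ...
json_module = try_import_module("json")
# ... while a missing one yields None instead of an exception.
missing_module = try_import_module("surely_not_an_installed_module_xyz")

if missing_module is None:
    backend = "fallback"
else:
    backend = "optional"
```

Callers then branch on ``None`` once, at import time, rather than wrapping every use of the optional dependency in ``try``/``except``.
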
TensorFlow utilities
~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.tf_utils

.. autosummary::
    :toctree: doc/

    ~explained_variance
    ~flatten_inputs_to_1d_tensor
    ~get_gpu_devices
    ~get_placeholder
    ~huber_loss
    ~l2_loss
    ~make_tf_callable
    ~minimize_and_clip
    ~one_hot
    ~reduce_mean_ignore_inf
    ~scope_vars
    ~warn_if_infinite_kl_divergence
    ~zero_logps_from_actions

Torch utilities
~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.torch_utils

.. autosummary::
    :toctree: doc/

    ~apply_grad_clipping
    ~concat_multi_gpu_td_errors
    ~convert_to_torch_tensor
    ~explained_variance
    ~flatten_inputs_to_1d_tensor
    ~get_device
    ~global_norm
    ~huber_loss
    ~l2_loss
    ~minimize_and_clip
    ~one_hot
    ~reduce_mean_ignore_inf
    ~sequence_mask
    ~warn_if_infinite_kl_divergence
    ~set_torch_seed
    ~softmax_cross_entropy_with_logits

Numpy utilities
~~~~~~~~~~~~~~~

.. currentmodule:: ray.rllib.utils.numpy

.. autosummary::
    :toctree: doc/

    ~aligned_array
    ~concat_aligned
    ~convert_to_numpy
    ~fc
    ~flatten_inputs_to_1d_tensor
    ~make_action_immutable
    ~huber_loss
    ~l2_loss
    ~lstm
    ~one_hot
    ~relu
    ~sigmoid
    ~softmax
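
Several helper names recur across the TensorFlow, Torch, and NumPy lists (``huber_loss``, ``one_hot``, ``softmax``). For readers unfamiliar with the underlying math, here is a plain-Python sketch of the textbook definitions. The ``_sketch`` functions are hypothetical, operate on scalars and lists rather than arrays or tensors, and are assumed to match only the standard formulas, not the exact signatures or defaults of the RLlib functions.

```python
import math


def huber_loss_sketch(x, delta=1.0):
    """Quadratic for |x| < delta, linear beyond it."""
    if abs(x) < delta:
        return 0.5 * x * x
    return delta * (abs(x) - 0.5 * delta)


def one_hot_sketch(index, depth):
    """Return a list of length `depth` with 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(depth)]


def softmax_sketch(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


small_error = huber_loss_sketch(0.5)   # quadratic region
large_error = huber_loss_sketch(3.0)   # linear region
encoded = one_hot_sketch(2, depth=4)
probabilities = softmax_sketch([1.0, 2.0, 3.0])
```

The Huber loss behaves like a squared error for small residuals and like an absolute error for large ones, which is why it is often preferred when a few large errors would otherwise dominate training.
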