123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178 |
- .. algorithm-reference-docs:
- Algorithms
- ==========
- The :py:class:`~ray.rllib.algorithms.algorithm.Algorithm` class is the highest-level API in RLlib responsible for **WHEN** and **WHAT** of RL algorithms. Things like **WHEN** should we sample the algorithm, **WHEN** should we perform a neural network update, and so on. The **HOW** will be delegated to components such as :py:class:`~ray.rllib.evaluation.rollout_worker.RolloutWorker`, etc.. It is the main entry point for RLlib users to interact with RLlib's algorithms.
- It allows you to train and evaluate policies, save an experiment's progress and restore from
- a prior saved experiment when continuing an RL run.
- :py:class:`~ray.rllib.algorithms.algorithm.Algorithm` is a sub-class
- of :py:class:`~ray.tune.trainable.Trainable`
- and thus fully supports distributed hyperparameter tuning for RL.
- .. https://docs.google.com/drawings/d/1J0nfBMZ8cBff34e-nSPJZMM1jKOuUL11zFJm6CmWtJU/edit
- .. figure:: ../images/trainer_class_overview.svg
- :align: left
- **A typical RLlib Algorithm object:** Algorhtms are normally comprised of
- N :py:class:`~ray.rllib.evaluation.rollout_worker.RolloutWorker` that
- orchestrated via a :py:class:`~ray.rllib.evaluation.worker_set.WorkerSet` object.
- Each worker own its own a set of :py:class:`~ray.rllib.policy.policy.Policy` objects and their NN models per worker, plus a :py:class:`~ray.rllib.env.base_env.BaseEnv` instance per worker.
- .. _algo-config-api:
- Algorithm Configuration API
- ----------------------------
- The :py:class:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig` class represents
- the primary way of configuring and building an :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`.
- You don't use ``AlgorithmConfig`` directly in practice, but rather use its algorithm-specific
- implementations such as :py:class:`~ray.rllib.algorithms.ppo.ppo.PPOConfig`, which each come
- with their own set of arguments to their respective ``.training()`` method.
- .. currentmodule:: ray.rllib.algorithms.algorithm_config
- Constructor
- ~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~AlgorithmConfig
- Public methods
- ~~~~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~AlgorithmConfig.build
- ~AlgorithmConfig.freeze
- ~AlgorithmConfig.copy
- ~AlgorithmConfig.validate
- Configuration methods
- ~~~~~~~~~~~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~AlgorithmConfig.callbacks
- ~AlgorithmConfig.debugging
- ~AlgorithmConfig.environment
- ~AlgorithmConfig.evaluation
- ~AlgorithmConfig.experimental
- ~AlgorithmConfig.fault_tolerance
- ~AlgorithmConfig.framework
- ~AlgorithmConfig.multi_agent
- ~AlgorithmConfig.offline_data
- ~AlgorithmConfig.python_environment
- ~AlgorithmConfig.reporting
- ~AlgorithmConfig.resources
- ~AlgorithmConfig.rl_module
- ~AlgorithmConfig.rollouts
- ~AlgorithmConfig.training
- Getter methods
- ~~~~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~AlgorithmConfig.get_default_learner_class
- ~AlgorithmConfig.get_default_rl_module_spec
- ~AlgorithmConfig.get_evaluation_config_object
- ~AlgorithmConfig.get_marl_module_spec
- ~AlgorithmConfig.get_multi_agent_setup
- ~AlgorithmConfig.get_rollout_fragment_length
- Miscellaneous methods
- ~~~~~~~~~~~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~AlgorithmConfig.validate_train_batch_size_vs_rollout_fragment_length
- Building Custom Algorithm Classes
- ---------------------------------
- .. warning::
- As of Ray >= 1.9, it is no longer recommended to use the `build_trainer()` utility
- function for creating custom Algorithm sub-classes.
- Instead, follow the simple guidelines here for directly sub-classing from
- :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`.
- In order to create a custom Algorithm, sub-class the
- :py:class:`~ray.rllib.algorithms.algorithm.Algorithm` class
- and override one or more of its methods. Those are in particular:
- * :py:meth:`~ray.rllib.algorithms.algorithm.Algorithm.setup`
- * :py:meth:`~ray.rllib.algorithms.algorithm.Algorithm.get_default_config`
- * :py:meth:`~ray.rllib.algorithms.algorithm.Algorithm.get_default_policy_class`
- * :py:meth:`~ray.rllib.algorithms.algorithm.Algorithm.training_step`
- `See here for an example on how to override Algorithm <https://github.com/ray-project/ray/blob/master/rllib/algorithms/ppo/ppo.py>`_.
- .. _rllib-algorithm-api:
- Algorithm API
- -------------
- .. currentmodule:: ray.rllib.algorithms.algorithm
- Constructor
- ~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~Algorithm
- Inference and Evaluation
- ~~~~~~~~~~~~~~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~Algorithm.compute_actions
- ~Algorithm.compute_single_action
- ~Algorithm.evaluate
- Saving and Restoring
- ~~~~~~~~~~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~Algorithm.from_checkpoint
- ~Algorithm.from_state
- ~Algorithm.get_weights
- ~Algorithm.set_weights
- ~Algorithm.export_model
- ~Algorithm.export_policy_checkpoint
- ~Algorithm.export_policy_model
- ~Algorithm.import_policy_model_from_h5
- ~Algorithm.restore
- ~Algorithm.restore_from_object
- ~Algorithm.restore_workers
- ~Algorithm.save
- ~Algorithm.save_checkpoint
- ~Algorithm.save_to_object
- Training
- ~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~Algorithm.train
- ~Algorithm.training_step
- Multi Agent
- ~~~~~~~~~~~
- .. autosummary::
- :toctree: doc/
- ~Algorithm.add_policy
- ~Algorithm.remove_policy
|