  1. .. include:: /_includes/rllib/announcement.rst
  2. .. include:: /_includes/rllib/we_are_hiring.rst
  3. .. include:: /_includes/rllib/rlmodules_rollout.rst
  4. .. note:: Interacting with Catalogs mainly covers advanced use cases.
  5. Catalog (Alpha)
  6. ===============
  7. Catalogs are where `RLModules <rllib-rlmodule.html>`__ primarily get their models and action distributions from.
  8. Each :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule` has its own default
  9. :py:class:`~ray.rllib.core.models.catalog.Catalog`. For example,
  10. :py:class:`~ray.rllib.algorithms.ppo.ppo_torch_rl_module.PPOTorchRLModule` has the
  11. :py:class:`~ray.rllib.algorithms.ppo.ppo_catalog.PPOCatalog`.
  12. You can override Catalogs’ methods to alter the behavior of existing RLModules.
  13. This makes Catalogs a means of configuration for RLModules.
  14. You interact with Catalogs when making deeper customizations to which :py:class:`~ray.rllib.core.models.Model` and :py:class:`~ray.rllib.models.distributions.Distribution` RLlib creates by default.
  15. .. note::
  16. If you simply want to modify a :py:class:`~ray.rllib.core.models.Model` by changing its default values,
  17. have a look at the ``model config dict``:
  18. .. dropdown:: **``MODEL_DEFAULTS`` dict**
  19. :animate: fade-in-slide-down
  20. This dict (or an overriding sub-set) is part of :py:class:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig`
  21. and therefore also part of any algorithm-specific config.
  22. You can override its values and pass it to an :py:class:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig`
  23. to change the behavior of RLlib's default models.
  24. .. literalinclude:: ../../../rllib/models/catalog.py
  25. :language: python
  26. :start-after: __sphinx_doc_begin__
  27. :end-before: __sphinx_doc_end__
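As a rough illustration of overriding only a subset of the defaults, the sketch below merges user overrides into a small stand-in dict. ``TOY_MODEL_DEFAULTS`` and ``merge_model_config`` are hypothetical names invented for this example. They are not RLlib's actual ``MODEL_DEFAULTS`` or an RLlib helper, and the three keys are only assumed to resemble typical entries.

```python
# Stand-in for a defaults dict; NOT RLlib's real MODEL_DEFAULTS.
TOY_MODEL_DEFAULTS = {
    "fcnet_hiddens": [256, 256],
    "fcnet_activation": "tanh",
    "use_lstm": False,
}


def merge_model_config(overrides):
    """Return a new config: the defaults updated with the given overrides."""
    unknown = set(overrides) - set(TOY_MODEL_DEFAULTS)
    if unknown:
        # Reject typos early instead of silently ignoring them.
        raise KeyError(f"Unknown model config keys: {sorted(unknown)}")
    # Build a new dict so the defaults themselves stay untouched.
    return {**TOY_MODEL_DEFAULTS, **overrides}


model_config = merge_model_config({"fcnet_hiddens": [64, 64]})
print(model_config)
```

Only the overridden key changes; every other key keeps its default value.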
  28. While Catalogs have a base class :py:class:`~ray.rllib.core.models.catalog.Catalog`, you mostly interact with
  29. Algorithm-specific Catalogs.
  30. Therefore, this doc also includes examples around PPO from which you can extrapolate to other algorithms.
  31. A prerequisite for this user guide is a rough understanding of `RLModules <rllib-rlmodule.html>`__.
  32. This user guide covers the following topics:
  33. - Basic usage
  34. - What are Catalogs
  35. - Inject your custom models into RLModules
  36. - Inject your custom action distributions into RLModules
  37. .. - Extend RLlib’s selection of Models and Distributions with your own
  38. .. - Write a Catalog from scratch
  39. Catalog and AlgorithmConfig
  40. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  41. Since Catalogs effectively control what ``models`` and ``distributions`` RLlib uses under the hood,
  42. they are also part of RLlib’s configurations. As the primary entry point for configuring RLlib,
  43. :py:class:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig` is the place where you can configure the
  44. Catalogs of the RLModules that are created.
  45. You set the ``catalog class`` by going through the :py:class:`~ray.rllib.core.rl_module.rl_module.SingleAgentRLModuleSpec`
  46. or :py:class:`~ray.rllib.core.rl_module.marl_module.MultiAgentRLModuleSpec` of an AlgorithmConfig.
  47. For example, in heterogeneous multi-agent cases, you modify the MultiAgentRLModuleSpec.
  48. .. image:: images/catalog/catalog_rlmspecs_diagram.svg
  49. :align: center
  50. The following example shows how to configure the Catalog of an :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule`
  51. created by PPO.
  52. .. literalinclude:: doc_code/catalog_guide.py
  53. :language: python
  54. :start-after: __sphinx_doc_algo_configs_begin__
  55. :end-before: __sphinx_doc_algo_configs_end__
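The mechanics behind this kind of configuration can be sketched without RLlib: a spec object carries a catalog class, and the module instantiates that class when it is built. All names below (``ToyCatalog``, ``WideCatalog``, ``ToyModule``, ``ToyModuleSpec``) are hypothetical and exist only for this illustration. They are not RLlib classes, and the encoder is a plain string rather than a neural network.

```python
from dataclasses import dataclass, field


class ToyCatalog:
    """Hypothetical catalog: decides which encoder a module gets."""

    def __init__(self, observation_dim, num_actions, model_config):
        self.observation_dim = observation_dim
        self.num_actions = num_actions
        self.model_config = model_config

    def build_encoder(self):
        # A real catalog would return a network; a description string suffices here.
        layers = [self.observation_dim, *self.model_config["hiddens"]]
        return f"MLP{layers}"


class WideCatalog(ToyCatalog):
    """Custom catalog that doubles every hidden layer width."""

    def build_encoder(self):
        hiddens = [2 * h for h in self.model_config["hiddens"]]
        return f"MLP{[self.observation_dim, *hiddens]}"


class ToyModule:
    """Hypothetical module: instantiates the catalog class it receives."""

    def __init__(self, catalog_class, observation_dim, num_actions, model_config):
        catalog = catalog_class(observation_dim, num_actions, model_config)
        self.encoder = catalog.build_encoder()


@dataclass
class ToyModuleSpec:
    """Hypothetical spec: bundles everything needed to construct a module."""

    module_class: type
    catalog_class: type = ToyCatalog
    observation_dim: int = 4
    num_actions: int = 2
    model_config: dict = field(default_factory=lambda: {"hiddens": [32, 32]})

    def build(self):
        return self.module_class(
            self.catalog_class,
            self.observation_dim,
            self.num_actions,
            self.model_config,
        )


default_module = ToyModuleSpec(module_class=ToyModule).build()
custom_module = ToyModuleSpec(module_class=ToyModule, catalog_class=WideCatalog).build()
print(default_module.encoder)
print(custom_module.encoder)
```

Swapping the catalog class in the spec changes what the module builds, while the module class itself stays the same.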
  56. Basic usage
  57. ~~~~~~~~~~~
  58. The following three examples illustrate three basic usage patterns of Catalogs.
  59. The first example showcases the general API for interacting with Catalogs.
  60. .. literalinclude:: doc_code/catalog_guide.py
  61. :language: python
  62. :start-after: __sphinx_doc_basic_interaction_begin__
  63. :end-before: __sphinx_doc_basic_interaction_end__
  64. The second example showcases how to use the :py:class:`~ray.rllib.algorithms.ppo.ppo_catalog.PPOCatalog`
  65. to create a ``model`` and an ``action distribution``.
  66. This is more similar to what RLlib does internally.
  67. .. dropdown:: **Use catalog-generated models**
  68. :animate: fade-in-slide-down
  69. .. literalinclude:: doc_code/catalog_guide.py
  70. :language: python
  71. :start-after: __sphinx_doc_ppo_models_begin__
  72. :end-before: __sphinx_doc_ppo_models_end__
  73. The third example showcases how to use the base :py:class:`~ray.rllib.core.models.catalog.Catalog`
  74. to create an ``encoder`` and an ``action distribution``.
  75. In addition, the example builds a ``head network`` by hand that fits these two, to show how you can combine RLlib's
  76. :py:class:`~ray.rllib.core.models.base.ModelConfig` API and Catalog.
  77. Extending Catalog to also build this head is how :py:class:`~ray.rllib.core.models.catalog.Catalog` is meant to be
  78. extended, which we cover later in this guide.
  79. .. dropdown:: **Customize a policy head**
  80. :animate: fade-in-slide-down
  81. .. literalinclude:: doc_code/catalog_guide.py
  82. :language: python
  83. :start-after: __sphinx_doc_modelsworkflow_begin__
  84. :end-before: __sphinx_doc_modelsworkflow_end__
  85. What are Catalogs
  86. ~~~~~~~~~~~~~~~~~
  87. Catalogs have two primary roles: Choosing the right :py:class:`~ray.rllib.core.models.Model` and choosing the right :py:class:`~ray.rllib.models.distributions.Distribution`.
  88. By default, all catalogs implement decision trees that decide model architecture based on a combination of input configurations.
  89. These mainly include the ``observation space`` and ``action space`` of the :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule`, the ``model config dict`` and the ``deep learning framework backend``.
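To make the idea of such a decision tree concrete, here is a minimal sketch in plain Python. The functions ``choose_encoder_type`` and ``choose_distribution_type`` and their return values are hypothetical simplifications for this guide. The actual rules inside RLlib's catalogs are more detailed and may differ.

```python
def choose_encoder_type(observation_shape, model_config):
    """Pick an encoder family from the observation shape and the model config."""
    if len(observation_shape) == 1:
        base = "mlp"  # flat vector observations
    elif len(observation_shape) == 3:
        base = "cnn"  # image-like observations
    else:
        raise ValueError(f"No default encoder for shape {observation_shape}")
    # A config flag can wrap the base encoder in a recurrent one.
    if model_config.get("use_lstm", False):
        return f"lstm({base})"
    return base


def choose_distribution_type(action_space_kind):
    """Pick a distribution family from the kind of action space."""
    if action_space_kind == "discrete":
        return "categorical"
    if action_space_kind == "continuous":
        return "diagonal_gaussian"
    raise ValueError(f"No default distribution for {action_space_kind!r}")


print(choose_encoder_type((4,), {}))
print(choose_encoder_type((84, 84, 3), {"use_lstm": True}))
print(choose_distribution_type("discrete"))
```

The same inputs always lead to the same choice, which is why a Catalog can act as a form of configuration.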
  90. The following diagram shows the breakdown of the information flow towards ``models`` and ``distributions`` within an :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule`.
  91. An :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule` creates an instance of the Catalog class it receives as part of its constructor.
  92. It then creates its internal ``models`` and ``distributions`` with the help of this Catalog.
  93. .. note::
  94. You can also modify :py:class:`~ray.rllib.core.models.Model` or :py:class:`~ray.rllib.models.distributions.Distribution` in an :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule` directly by overriding the RLModule's constructor!
  95. .. image:: images/catalog/catalog_and_rlm_diagram.svg
  96. :align: center
  97. The following diagram shows a concrete case in more detail.
  98. .. dropdown:: **Example of catalog in a PPORLModule**
  99. :animate: fade-in-slide-down
  100. The :py:class:`~ray.rllib.algorithms.ppo.ppo_catalog.PPOCatalog` is fed an ``observation space``, ``action space``,
  101. a ``model config dict`` and the ``view requirements`` of the :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule`.
  102. The model config dict and the view requirements are only of interest in special cases, such as
  103. recurrent networks or attention networks. A PPORLModule has four components that are created by the
  104. :py:class:`~ray.rllib.algorithms.ppo.ppo_catalog.PPOCatalog`:
  105. ``Encoder``, ``value function head``, ``policy head``, and ``action distribution``.
  106. .. image:: images/catalog/ppo_catalog_and_rlm_diagram.svg
  107. :align: center
  108. Inject your custom model or action distributions into RLModules
  109. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  110. You can make a :py:class:`~ray.rllib.core.models.catalog.Catalog` build custom ``models`` by overriding the Catalog’s methods used by RLModules to build ``models``.
  111. Have a look at these lines from the constructor of the :py:class:`~ray.rllib.algorithms.ppo.ppo_torch_rl_module.PPOTorchRLModule` to see how Catalogs are being used by an :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule`:
  112. .. literalinclude:: ../../../rllib/algorithms/ppo/ppo_rl_module.py
  113. :language: python
  114. :start-after: __sphinx_doc_begin__
  115. :end-before: __sphinx_doc_end__
  116. Consequently, in order to build a custom :py:class:`~ray.rllib.core.models.Model` compatible with a PPORLModule,
  117. you can override methods by inheriting from :py:class:`~ray.rllib.algorithms.ppo.ppo_catalog.PPOCatalog`
  118. or write a :py:class:`~ray.rllib.core.models.catalog.Catalog` that implements them from scratch.
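The inheritance route can be sketched as follows. ``ToyPPOCatalog``, ``LinearPolicyHeadCatalog``, and ``build_components`` are hypothetical stand-ins written for this guide. The components are plain dicts rather than networks, and the method names are chosen for this example, not taken from a specific RLlib version.

```python
class ToyPPOCatalog:
    """Hypothetical stand-in for an algorithm-specific catalog."""

    def __init__(self, latent_dim, num_actions):
        self.latent_dim = latent_dim
        self.num_actions = num_actions

    def build_encoder(self):
        return {"type": "mlp", "out_dim": self.latent_dim}

    def build_pi_head(self):
        return {"type": "mlp", "in_dim": self.latent_dim, "out_dim": self.num_actions}

    def build_vf_head(self):
        return {"type": "mlp", "in_dim": self.latent_dim, "out_dim": 1}


class LinearPolicyHeadCatalog(ToyPPOCatalog):
    """Overrides only the policy head; everything else is inherited."""

    def build_pi_head(self):
        head = super().build_pi_head()
        head["type"] = "linear"  # swap in the custom model here
        return head


def build_components(catalog):
    """Mimics a module constructor asking the catalog for each component."""
    return {
        "encoder": catalog.build_encoder(),
        "pi": catalog.build_pi_head(),
        "vf": catalog.build_vf_head(),
    }


components = build_components(LinearPolicyHeadCatalog(latent_dim=64, num_actions=3))
print(components["pi"])
print(components["vf"])
```

Because the code that consumes the catalog only calls its build methods, overriding a single method is enough to replace a single component.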
  119. The following example showcases such modifications.
  120. It demonstrates two things:
  121. - How to write a custom :py:class:`~ray.rllib.models.distributions.Distribution`
  122. - How to inject a custom action distribution into a :py:class:`~ray.rllib.core.models.catalog.Catalog`
  123. .. literalinclude:: ../../../rllib/examples/catalog/custom_action_distribution.py
  124. :language: python
  125. :start-after: __sphinx_doc_begin__
  126. :end-before: __sphinx_doc_end__
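The same two steps can be tried out in a dependency-free sketch. The classes below (``ToyCategorical``, ``HighTemperatureCategorical``, ``ToyCatalog``, ``ExploringCatalog``) are hypothetical and use only the Python standard library. They illustrate the pattern and make no claim about the contents of the example file or the exact RLlib ``Distribution`` interface.

```python
import math
import random


class ToyCategorical:
    """Hypothetical categorical distribution over discrete actions."""

    temperature = 1.0

    def __init__(self, logits):
        scaled = [logit / self.temperature for logit in logits]
        # Subtract the maximum before exponentiating for numerical stability.
        highest = max(scaled)
        exps = [math.exp(s - highest) for s in scaled]
        total = sum(exps)
        self.probs = [e / total for e in exps]

    def sample(self, rng=random):
        # Draw one action index according to the probabilities.
        return rng.choices(range(len(self.probs)), weights=self.probs, k=1)[0]

    def logp(self, action):
        return math.log(self.probs[action])


class HighTemperatureCategorical(ToyCategorical):
    """Custom distribution: flatter probabilities, hence more exploration."""

    temperature = 10.0


class ToyCatalog:
    """Hypothetical catalog that returns the default distribution class."""

    def get_action_dist_cls(self):
        return ToyCategorical


class ExploringCatalog(ToyCatalog):
    """Injects the custom distribution by overriding a single method."""

    def get_action_dist_cls(self):
        return HighTemperatureCategorical


logits = [2.0, 0.0, -2.0]
default_dist = ToyCatalog().get_action_dist_cls()(logits)
custom_dist = ExploringCatalog().get_action_dist_cls()(logits)
print(max(default_dist.probs), max(custom_dist.probs))
```

With the higher temperature, the largest probability shrinks, so sampled actions are spread more evenly across the action space.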
  127. Notable TODOs
  128. -------------
  129. - Add cross references to Model and Distribution API docs
  130. - Add example that shows how to inject own model
  131. - Add more instructions on how to write a catalog from scratch
  132. - Add section "Extend RLlib’s selection of Models and Distributions with your own"
  133. - Add section "Write a Catalog from scratch"