.. _ray-multiprocessing:

Distributed multiprocessing.Pool
================================

.. _`issue on GitHub`: https://github.com/ray-project/ray/issues

Ray supports running distributed Python programs with the `multiprocessing.Pool API`_
using `Ray Actors <actors.html>`__ instead of local processes. This makes it easy
to scale existing applications that use ``multiprocessing.Pool`` from a single node
to a cluster.

.. _`multiprocessing.Pool API`: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

Quickstart
----------

To get started, first `install Ray <installation.html>`__, then use
``ray.util.multiprocessing.Pool`` in place of ``multiprocessing.Pool``.
This will start a local Ray cluster the first time you create a ``Pool`` and
distribute your tasks across it. See the `Run on a Cluster`_ section below for
instructions to run on a multi-node Ray cluster instead.

.. code-block:: python

  from ray.util.multiprocessing import Pool

  def f(index):
      return index

  pool = Pool()
  for result in pool.map(f, range(100)):
      print(result)

The full ``multiprocessing.Pool`` API is currently supported. Please see the
`multiprocessing documentation`_ for details.

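As a rough illustration of what that API surface looks like, the sketch below exercises a few standard ``Pool`` methods (``map``, ``imap_unordered``, ``apply_async``, ``starmap``). It deliberately uses the standard library's ``multiprocessing.pool.ThreadPool`` as a stand-in so that it runs without Ray installed. The helper names ``square`` and ``run_pool_methods`` are hypothetical, and the assumption is that passing a ``ray.util.multiprocessing.Pool`` instance instead behaves the same way.

```python
from multiprocessing.pool import ThreadPool  # stdlib stand-in exposing the Pool API


def square(x):
    return x * x


def run_pool_methods(pool):
    """Exercise several standard Pool methods on any Pool-compatible object."""
    # map: blocks until done and returns results in input order.
    mapped = pool.map(square, range(5))

    # imap_unordered: yields results as they finish, in arbitrary order,
    # so they are sorted here to get a deterministic list.
    unordered = sorted(pool.imap_unordered(square, range(5)))

    # apply_async: submits a single call and returns an AsyncResult handle.
    async_result = pool.apply_async(square, (7,))
    single = async_result.get(timeout=30)

    # starmap: unpacks each tuple into positional arguments.
    powers = pool.starmap(pow, [(2, 3), (3, 2)])

    return mapped, unordered, single, powers


# Stand-in pool so the sketch runs without Ray; to run on Ray, the assumption
# is that you would pass ray.util.multiprocessing.Pool() here instead.
with ThreadPool(processes=2) as pool:
    results = run_pool_methods(pool)

print(results)
```

Writing the logic against a pool object passed in as an argument, rather than a specific ``Pool`` class, keeps the same function usable with either implementation.
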
.. warning::

  The ``context`` argument in the ``Pool`` constructor is ignored when using Ray.

.. _`multiprocessing documentation`: https://docs.python.org/3/library/multiprocessing.html#module-multiprocessing.pool

Run on a Cluster
----------------

This section assumes that you have a running Ray cluster. To start a Ray cluster,
please refer to the `cluster setup <cluster/index.html>`__ instructions.

To connect a ``Pool`` to a running Ray cluster, you can specify the address of the
head node in one of two ways:

- By setting the ``RAY_ADDRESS`` environment variable.
- By passing the ``ray_address`` keyword argument to the ``Pool`` constructor.

.. code-block:: python

  from ray.util.multiprocessing import Pool

  # Starts a new local Ray cluster.
  pool = Pool()

  # Connects to a running Ray cluster, with the current node as the head node.
  # Alternatively, set the environment variable RAY_ADDRESS="auto".
  pool = Pool(ray_address="auto")

  # Connects to a running Ray cluster, with a remote node as the head node.
  # Alternatively, set the environment variable RAY_ADDRESS="<ip_address>:<port>".
  pool = Pool(ray_address="<ip_address>:<port>")

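One way to let the same script run both locally and against a cluster is to take the address from a command-line flag and only pass ``ray_address`` when it is given. The sketch below is a minimal illustration of that pattern. The ``--ray-address`` flag and the ``parse_pool_kwargs`` helper are hypothetical names chosen for this example and are not part of Ray.

```python
import argparse


def parse_pool_kwargs(argv=None):
    """Return keyword arguments for Pool based on a hypothetical CLI flag.

    With no flag, an empty dict is returned so that Pool() falls back to its
    default behavior (RAY_ADDRESS if set, otherwise a new local cluster).
    """
    parser = argparse.ArgumentParser(description="Run a Pool workload.")
    parser.add_argument(
        "--ray-address",
        default=None,
        help='Head node address, e.g. "auto" or "<ip_address>:<port>".',
    )
    args = parser.parse_args(argv)
    return {"ray_address": args.ray_address} if args.ray_address else {}


# Intended usage (requires Ray, so it is left commented out here):
#
#   from ray.util.multiprocessing import Pool
#   pool = Pool(**parse_pool_kwargs())
```

Running the script with ``--ray-address auto`` would then connect to a cluster on the current node, while running it with no flag leaves the choice to ``Pool``.
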
You can also start Ray manually by calling ``ray.init()`` (with any of its supported
configuration options) before creating a ``Pool``.