mars-on-ray.rst 2.0 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778
  1. .. _mars-on-ray:
  2. Using Mars on Ray
  3. =================
  4. .. _`issue on GitHub`: https://github.com/mars-project/mars/issues
  5. `Mars`_ is a tensor-based unified framework for large-scale data computation which scales Numpy, Pandas and Scikit-learn.
  6. Mars on Ray makes it easy to scale your programs with a Ray cluster. Currently Mars on Ray supports both Ray actors
  7. and tasks as execution backend. The task will be scheduled by mars scheduler if Ray actors is used. This mode can reuse
  8. all mars scheduler optimizations. If ray tasks mode is used, all tasks will be scheduled by ray, which can reuse failover and
  9. pipeline capabilities provided by ray futures.
  10. .. _`Mars`: https://mars-project.readthedocs.io/en/latest/
  11. Installation
  12. -------------
  13. You can simply install Mars via pip:
  14. .. code-block:: bash
  15. pip install pymars>=0.8.3
  16. Getting started
  17. ----------------
  18. It's easy to run Mars jobs on a Ray cluster.
  19. Starting a new Mars on Ray runtime locally via:
  20. .. code-block:: python
  21. import ray
  22. ray.init()
  23. import mars
  24. mars.new_ray_session()
  25. import mars.tensor as mt
  26. mt.random.RandomState(0).rand(1000_0000, 5).sum().execute()
  27. Or connecting to a Mars on Ray runtime which is already initialized:
  28. .. code-block:: python
  29. import mars
  30. mars.new_ray_session('http://<web_ip>:<ui_port>')
  31. # perform computation
  32. Interact with Dataset:
  33. .. code-block:: python
  34. import mars.tensor as mt
  35. import mars.dataframe as md
  36. df = md.DataFrame(
  37. mt.random.rand(1000_0000, 4),
  38. columns=list('abcd'))
  39. # Convert mars dataframe to ray dataset
  40. import ray
  41. # ds = md.to_ray_dataset(df)
  42. ds = ray.data.from_mars(df)
  43. print(ds.schema(), ds.count())
  44. ds.filter(lambda row: row["a"] > 0.5).show(5)
  45. # Convert ray dataset to mars dataframe
  46. # df2 = md.read_ray_dataset(ds)
  47. df2 = ds.to_mars()
  48. print(df2.head(5).execute())
  49. Refer to _`Mars on Ray`: https://mars-project.readthedocs.io/en/latest/installation/ray.html#mars-ray for more information.