Debugging
=========

Starting processes in a debugger
--------------------------------

When processes are crashing, it is often useful to start them in a debugger.
Ray currently allows processes to be started in the following:

- valgrind
- the valgrind profiler
- the perftools profiler
- gdb
- tmux

To use any of these tools, make sure that they are installed on your machine
first (``gdb`` and ``valgrind`` are known to have issues on macOS). You can
then launch a subset of Ray processes in a tool by setting the environment
variable ``RAY_{PROCESS_NAME}_{DEBUGGER}=1``. For instance, to start the raylet
in ``valgrind``, set the environment variable ``RAY_RAYLET_VALGRIND=1``.

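The naming scheme above can be sketched in Python. This is only an
illustration: ``debugger_env`` is a hypothetical helper (it is not part of
Ray), and it assumes that the process and tool names are simply upper-cased,
as in ``RAY_RAYLET_VALGRIND``.

.. code-block:: python

  import os


  def debugger_env(process_name, *debuggers, base_env=None):
      """Return a copy of the environment with RAY_{PROCESS_NAME}_{DEBUGGER}=1 set.

      Hypothetical helper for illustration; not part of Ray.
      """
      env = dict(os.environ if base_env is None else base_env)
      for debugger in debuggers:
          env["RAY_{}_{}".format(process_name.upper(), debugger.upper())] = "1"
      return env


  # Start from an empty environment so that the result is easy to read.
  print(debugger_env("raylet", "valgrind", base_env={}))
  # {'RAY_RAYLET_VALGRIND': '1'}

The returned dictionary could then be passed as the ``env`` argument of
``subprocess.run`` when launching the script that starts Ray.
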
To start a process inside ``gdb``, the process must also be started inside
``tmux``. So if you want to start the raylet in ``gdb``, you would start your
Python script with the following:

.. code-block:: bash

  RAY_RAYLET_GDB=1 RAY_RAYLET_TMUX=1 python

You can then list the ``tmux`` sessions with ``tmux ls`` and attach to the
appropriate one.

You can also get a core dump of the ``raylet`` process, which is especially
useful when filing `issues`_. The process to obtain a core dump is OS-specific,
but usually involves running ``ulimit -c unlimited`` before starting Ray to
allow core dump files to be written.

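On Unix-like systems, a similar effect can be sketched from inside the Python
script using the standard ``resource`` module. This is a minimal sketch, not a
documented Ray feature: ``enable_core_dumps`` is a hypothetical helper, it can
only raise the soft limit as far as the existing hard limit, and it assumes
that the Ray processes are started as children of this script (so that they
inherit the limit). Where core files are written still depends on the OS.

.. code-block:: python

  import resource


  def enable_core_dumps():
      """Raise the soft core-file size limit to the hard limit for this process.

      Hypothetical helper for illustration; not part of Ray.
      """
      _, hard = resource.getrlimit(resource.RLIMIT_CORE)
      resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
      return resource.getrlimit(resource.RLIMIT_CORE)


  # Call this before ray.init() so that child processes inherit the new limit.
  soft, hard = enable_core_dumps()
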
Inspecting Redis shards
-----------------------

To inspect Redis, you can use the global state API. The easiest way to do this
is to start or connect to a Ray cluster with ``ray.init()``, then query the API
like so:

.. code-block:: python

  import ray

  ray.init()
  ray.nodes()
  # Returns current information about the nodes in the cluster, such as:
  # [{'ClientID': '2a9d2b34ad24a37ed54e4fcd32bf19f915742f5b',
  #   'IsInsertion': True,
  #   'NodeManagerAddress': '1.2.3.4',
  #   'NodeManagerPort': 43280,
  #   'ObjectManagerPort': 38062,
  #   'ObjectStoreSocketName': '/tmp/ray/session_2019-01-21_16-28-05_4216/sockets/plasma_store',
  #   'RayletSocketName': '/tmp/ray/session_2019-01-21_16-28-05_4216/sockets/raylet',
  #   'Resources': {'CPU': 8.0, 'GPU': 1.0}}]

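The list returned by ``ray.nodes()`` is plain Python data, so it can be
summarized with ordinary code. The sketch below adds up the ``Resources``
entries across nodes. ``total_resources`` is a hypothetical helper, and the
sample data (including the second node) is made up for illustration; it only
mirrors the fields shown in the output above.

.. code-block:: python

  from collections import defaultdict


  def total_resources(nodes):
      """Sum the 'Resources' dictionaries of all node entries.

      Hypothetical helper for illustration; not part of Ray.
      """
      totals = defaultdict(float)
      for node in nodes:
          for name, quantity in node.get("Resources", {}).items():
              totals[name] += quantity
      return dict(totals)


  # Made-up sample data in the same shape as the ray.nodes() output.
  sample_nodes = [
      {"NodeManagerAddress": "1.2.3.4", "Resources": {"CPU": 8.0, "GPU": 1.0}},
      {"NodeManagerAddress": "1.2.3.5", "Resources": {"CPU": 4.0}},
  ]
  print(total_resources(sample_nodes))
  # {'CPU': 12.0, 'GPU': 1.0}
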
To inspect the primary Redis shard manually, you can also query it with
commands like the following:

.. code-block:: python

  r_primary = ray.worker.global_worker.redis_client
  r_primary.keys("*")

To inspect other Redis shards, you will need to create a new Redis client.
For example (assuming the relevant IP address is ``127.0.0.1`` and the
relevant port is ``1234``), you can do this as follows:

.. code-block:: python

  import redis

  r = redis.StrictRedis(host='127.0.0.1', port=1234)

You can find a list of the relevant IP addresses and ports by running:

.. code-block:: python

  r_primary.lrange('RedisShards', 0, -1)

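The two steps above can be combined. The sketch below assumes that each entry
in the ``RedisShards`` list is a ``host:port`` string (possibly returned as
bytes); check the actual output on your cluster before relying on this. Both
helpers are hypothetical and not part of Ray.

.. code-block:: python

  def parse_shard_address(entry):
      """Split a 'host:port' entry (str or bytes) into (host, port).

      Assumes the 'host:port' format; hypothetical helper, not part of Ray.
      """
      if isinstance(entry, bytes):
          entry = entry.decode("utf-8")
      host, port = entry.rsplit(":", 1)
      return host, int(port)


  def connect_to_shards(entries):
      """Create one Redis client per shard entry."""
      import redis  # Imported lazily so that parsing works without redis-py.

      clients = []
      for entry in entries:
          host, port = parse_shard_address(entry)
          clients.append(redis.StrictRedis(host=host, port=port))
      return clients


  print(parse_shard_address(b"127.0.0.1:1234"))
  # ('127.0.0.1', 1234)

With a running cluster, ``connect_to_shards(r_primary.lrange('RedisShards', 0, -1))``
would then return one client per shard.
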
.. _backend-logging:

Backend logging
---------------

The ``raylet`` process logs detailed information about events like task
execution and object transfers between nodes. To set the logging level at
runtime, you can set the ``RAY_BACKEND_LOG_LEVEL`` environment variable before
starting Ray. For example, you can do:

.. code-block:: shell

  export RAY_BACKEND_LOG_LEVEL=debug
  ray start

This will print any ``RAY_LOG(DEBUG)`` lines in the source code to the
``raylet.err`` file, which you can find in :ref:`temp-dir-log-files`.
If it worked, you should see as the first line in ``raylet.err``:

.. code-block:: shell

  logging.cc:270: Set ray log level from environment variable RAY_BACKEND_LOG_LEVEL to -1

(-1 is defined as ``RayLogLevel::DEBUG`` in ``logging.h``.)

.. literalinclude:: /../../src/ray/util/logging.h
  :language: C
  :lines: 52,54

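The confirmation line can also be checked programmatically. The sketch below
extracts the numeric level from a line in the format shown above.
``parse_backend_log_level`` is a hypothetical helper, and the pattern assumes
that the message text matches the example exactly; it may differ between Ray
versions.

.. code-block:: python

  import re

  # Pattern based on the example line from raylet.err shown above.
  _LOG_LEVEL_PATTERN = re.compile(
      r"Set ray log level from environment variable "
      r"RAY_BACKEND_LOG_LEVEL to (-?\d+)"
  )


  def parse_backend_log_level(line):
      """Return the numeric log level reported in a log line, or None."""
      match = _LOG_LEVEL_PATTERN.search(line)
      return int(match.group(1)) if match else None


  first_line = (
      "logging.cc:270: Set ray log level from environment variable "
      "RAY_BACKEND_LOG_LEVEL to -1"
  )
  print(parse_backend_log_level(first_line))
  # -1
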
.. _`issues`: https://github.com/ray-project/ray/issues