Yi Cheng a09ae65823 [serve] Make serve agent not blocking when GCS is down. (#27526) (#27674) 2 years ago
build aa8a7dcb48 [Docker] Add Cuda 11.6 support (#26695) 2 years ago
env 419ba8efd3 [serve] Integrate and Document Bring-Your-Own Gradio Applications (#2… (#27560) 2 years ago
k8s c4a259828b [kuberay] Update KubeRay operator commit, turn autoscaler RPC drain back on (#27077) 2 years ago
lint 4bf33efd5c [air] Add annotation for Tune module. (#27060) (#27210) 2 years ago
pipeline 7f908c4086 Revert "[ci] fix determine_tests_to_run.py by finding merge base (#26790)" (#26799) 2 years ago
run 8c70f02652 [build] Fix the `install-bazel.sh` (#25251) 2 years ago
README.md e6a458a31e [CI] Create zip of ray `session_latest/logs` dir on test failure and upload to buildkite via `/artifact-mount` (#23783) 2 years ago
ci.sh a09ae65823 [serve] Make serve agent not blocking when GCS is down. (#27526) (#27674) 2 years ago
keep_alive ad8e35b919 [ray] Update cpp to std14 (#14441) 3 years ago
remote-watch.py 7f1bacc7dc [CI] Format Python code with Black (#21975) 2 years ago
repro-ci-requirements.txt 794a81028b [ci] add repro-ci-requirements.txt (#26951) 2 years ago
repro-ci.py b0e1cfbcaa [ci] repro-ci.py: Use `Name` tag instead of `repo_name` (#26035) 2 years ago
suppress_output 56d2cf6479 Shellcheck rewrites (#9597) 4 years ago

README.md

CI process

This document is a work-in-progress. Please double-check file/function/etc. names for changes, as this document may be out of sync.

Dependencies

All dependencies (e.g. apt, pip) should be installed in install_dependencies(), following the same pattern as those that already exist.

Once a dependency is added or removed, please ensure that shell environment variables are persisted appropriately, as CI systems differ on when ~/.bashrc et al. are reloaded, if at all. (Such startup files are also not necessarily idempotent, so reloading them may change the environment again.)
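One way to persist a variable without relying on when (or how often) a startup file is reloaded is sketched below. The helper name persist_export, the variable EXAMPLE_TOOL_HOME, and the temporary profile file are all hypothetical and are not taken from ci/ci.sh; a real setup would presumably target ~/.bashrc or whatever file the CI system actually reads.

```shell
# Hypothetical helper (not part of ci/ci.sh): record an export line in a
# profile file only if that exact line is not already there, and also export
# the variable in the current shell in case the profile is never reloaded.
persist_export() {
  local name="$1" value="$2" profile="$3"
  local line="export ${name}=\"${value}\""
  touch "${profile}"
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if ! grep -qxF -- "${line}" "${profile}"; then
    printf '%s\n' "${line}" >> "${profile}"
  fi
  export "${name}=${value}"
}

# Demo against a temporary file standing in for ~/.bashrc.
demo_profile="$(mktemp)"
persist_export EXAMPLE_TOOL_HOME "/opt/example-tool" "${demo_profile}"
persist_export EXAMPLE_TOOL_HOME "/opt/example-tool" "${demo_profile}"
line_count="$(grep -c 'EXAMPLE_TOOL_HOME' "${demo_profile}")"
echo "lines written: ${line_count}"
```

Calling the helper twice leaves a single line in the file, so reloading the file any number of times yields the same environment.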

Bazel, environment variables, and caching

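Any environment variables passed to Bazel actions (e.g. PATH) should be set idempotently, i.e. they should end up with the same value no matter how many times the setup code runs, so that builds hit the Bazel cache.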

If a different PATH gets passed to a Bazel action, Bazel will not hit the cache, and you might trigger a full rebuild when you really expect an incremental (or no-op) build for an operation (say pip install -e . after bazel build //...).
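A minimal sketch of keeping PATH stable across repeated setup runs follows. The function name prepend_path_once and the directory /opt/example-tool/bin are hypothetical; the point is only that a second call leaves PATH byte-for-byte unchanged.

```shell
# Hypothetical helper: prepend a directory to PATH only if it is not already
# present, so that running the setup code twice yields the same PATH.
prepend_path_once() {
  local dir="$1"
  case ":${PATH}:" in
    *":${dir}:"*) ;;                 # already present: leave PATH untouched
    *) PATH="${dir}:${PATH}" ;;      # absent: prepend exactly once
  esac
  export PATH
}

prepend_path_once "/opt/example-tool/bin"
path_after_first="${PATH}"
prepend_path_once "/opt/example-tool/bin"
path_after_second="${PATH}"

if [ "${path_after_first}" = "${path_after_second}" ]; then
  echo "PATH is stable across repeated calls"
fi
```

A naive PATH="${dir}:${PATH}" in the same place would grow PATH on every run, which is the kind of difference that can invalidate cached actions.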

Invocation

The CI system (such as Travis) must source (not execute) ci/ci.sh and pass the action(s) to execute. The script either handles the work or dispatches it to other script(s) as it deems appropriate. This helps ensure any environment setup/teardown is handled appropriately.
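The difference between sourcing and executing can be seen in the small, self-contained sketch below. It uses a throwaway script and a made-up variable DEMO_CI_VAR rather than ci/ci.sh itself, and assumes bash is available to run the child process.

```shell
# A throwaway script that modifies the environment (a stand-in for CI setup).
demo_script="$(mktemp)"
cat > "${demo_script}" <<'EOF'
export DEMO_CI_VAR="set-by-script"
EOF

unset DEMO_CI_VAR

# Executed: the script runs in a child process, so its export is lost on exit.
bash "${demo_script}"
after_execute="${DEMO_CI_VAR:-unset}"

# Sourced (`.` is the POSIX spelling of `source`): the script runs in the
# current shell, so its export is visible to everything that follows.
. "${demo_script}"
after_source="${DEMO_CI_VAR:-unset}"

echo "after execute: ${after_execute}"
echo "after source:  ${after_source}"
rm -f "${demo_script}"
```

This prints "unset" for the executed case and "set-by-script" for the sourced case, which is why environment setup done by an executed entrypoint would not reach later CI steps.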

Development best practices & pitfalls (read before adding a new script)

Before adding new scripts, please read this section.

First, please consider modifying an existing script instead (e.g. add your code as a separate function). Adding new scripts has a number of pitfalls that easily take hours (even days) to track down and fix:

  • When calling other scripts (as executables), environment variables (like PATH) cannot propagate back up to the caller. Often, the caller expects such variables to be updated.

  • When sourcing other scripts, global state (ROOT_DIR, main, set -e, etc.) may be overwritten silently, causing unexpected behavior.
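The second pitfall is easy to reproduce. In the sketch below, a hypothetical helper script reuses the name ROOT_DIR and enables set -e; after it is sourced, both changes have leaked into the caller without any warning. The paths are invented for illustration.

```shell
ROOT_DIR="/caller/root"            # the caller's own global

helper="$(mktemp)"
cat > "${helper}" <<'EOF'
ROOT_DIR="/helper/root"            # same name, different meaning
set -e                             # shell options are global state too
EOF

. "${helper}"                      # sourced: runs in the caller's shell

root_after="${ROOT_DIR}"
# "$-" lists the currently enabled single-letter shell options.
case "$-" in
  *e*) errexit_after="on" ;;
  *)   errexit_after="off" ;;
esac
set +e                             # undo the leaked option for this demo

echo "ROOT_DIR after sourcing: ${root_after}"
echo "errexit after sourcing:  ${errexit_after}"
rm -f "${helper}"
```

The caller now sees /helper/root and has errexit enabled, even though it never asked for either.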

The following practices can avoid such pitfalls while maintaining intuitive control flow:

  • Put all environment-modifying functions in the top-level shell script, so that their invocation behaves intuitively. (The sheer length of the script is a secondary concern and can be mitigated by keeping functions modular.)

  • Avoid adding new scripts if possible. If it's necessary that you do so, call them instead of sourcing them. Note that this implies new scripts should not modify the environment, or the caller will not see such changes!

  • Always add code inside a function, not at global scope. Use local for variables where it makes sense. However, be careful and know the shell rules: for example, local x=$(false) succeeds even under set -e.
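The local pitfall mentioned in the last item can be demonstrated as follows. The function names masked and unmasked are made up for this sketch, and the demo runs in a separate bash process so that set -e does not affect the calling shell.

```shell
demo="$(mktemp)"
cat > "${demo}" <<'EOF'
set -e
masked() {
  local x=$(false)     # exit status is that of `local` (0), hiding the failure
  echo "masked: reached"
}
unmasked() {
  local x              # declare first ...
  x=$(false)           # ... then assign: the failure is visible to set -e
  echo "unmasked: reached"
}
"$1"                   # run the function named by the first argument
EOF

masked_status=0
masked_out="$(bash "${demo}" masked)" || masked_status=$?
unmasked_status=0
unmasked_out="$(bash "${demo}" unmasked)" || unmasked_status=$?

echo "masked:   status=${masked_status} output='${masked_out}'"
echo "unmasked: status=${unmasked_status} output='${unmasked_out}'"
rm -f "${demo}"
```

The masked variant exits with status 0 and prints its message, while the unmasked variant aborts with status 1 before printing anything. Declaring the variable and assigning it on separate lines is the usual way to keep failures visible.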

Ultimately, it's best to only add new scripts if they might need to be executed directly by non-CI code, as in that case, they should probably not use CI entrypoints (which assume exclusive control over the machine).