Kai Fricke | d33b0e4bc3 | [tune] Reconcile placement groups every N seconds to avoid bottlenecks when running many short trials (#15011) | 3 years ago
Kai Fricke | 84b3c3376b | [tune] document scalability best practices (k8s, scalability thresholds) (#14566) | 3 years ago
Kai Fricke | 898243d538 | [tune] Limit maximum number of pending trials. Add convergence test. (#14835) | 3 years ago
Kai Fricke | 757866ec01 | [tune] enable placement groups per default (#13906) | 3 years ago
javi-redondo | b8b2d6410d | [docs] new Ray Cluster documentation (#13839) | 3 years ago
Kai Fricke | d29fcfb45c | [tune] catch SIGINT signal and trigger experiment checkpoint (#13767) | 3 years ago
Kai Fricke | dc42abb2f5 | [tune] placement group support (#13370) | 3 years ago
Richard Liaw | 86387504ee | [tune] fix small docs typo (#13355) | 3 years ago
Kai Fricke | 518427627b | [tune] buffer trainable results (#13236) | 3 years ago
Edwin Goh | a5ddc27bab | Fix typo in Tune Docs (Checkpointing) (#13348) | 3 years ago
Kai Fricke | 5f04ade6ef | [tune] add more stoppers and stopper documentation (#12750) | 3 years ago
Richard Liaw | 9ce7ad17fd | [tune] remove some bottlenecks in trialrunner (#12476) | 3 years ago
Richard Liaw | e59fe65d3d | [tune] Fix logging for dockersyncer (#12196) | 3 years ago
Keqiu Hu | 0c1bdaef59 | [tune] TensorFlow Distributed Trainable (#11876) | 4 years ago
Kai Fricke | 603accf1c2 | [tune] logger refactor part 3: Add ExperimentLogger class (#11749) | 4 years ago
Frank Gu | 73fa94731f | [tune] Add HDFS as Cloud Sync Client (#11524) | 4 years ago
Richard Liaw | a4b418d30c | [docs] update cloud docs (#11262) | 4 years ago
Richard Liaw | 56f858ed1a | [tune][docs/util] gputil check, docs (#11260) | 4 years ago
Kai Fricke | b450cb030a | [tune] reuse actors for function API (#11230) | 4 years ago
Kai Fricke | bdf647c4ec | [tune] docker syncer (#11035) | 4 years ago
Kai Fricke | c77cfaa5ad | [tune] use dated experiment dir per default (#11104) | 4 years ago
Kai Fricke | e7315b0856 | [tune] Callbacks for tune runs (#11001) | 4 years ago
Richard Liaw | a563344bc2 | [docs] remove ref to google groups -> github discussions (#11019) | 4 years ago
Kai Fricke | d9c4dea7cf | [tune] strict metric checking (#10972) | 4 years ago
Richard Liaw | b0ca70f628 | [tune+core] tune lifecycle and starting ray guide (#10813) | 4 years ago
Ameer Haj Ali | 6edacb22b8 | Fix abstraction violations in command_runner interface (#10715) | 4 years ago
Max Fitton | 017737b82b | [Documentation] `local_mode` doc updates and actor / worker explanation from Slack (#10748) | 4 years ago
Kai Fricke | 7eaf063f29 | [tune] wrapper function to pass arbitrary objects through the object store to trainables (#10679) | 4 years ago
Richard Liaw | 551c597312 | [tune] API revamp fix (#10518) | 4 years ago
Kai Fricke | 2fac66650d | [tune] extend search space api docs (#10576) | 4 years ago