This repository was archived by the owner on Nov 14, 2023. It is now read-only.
False error log complains that it failed to read the results of trials #270
Open
Description
I copied the code from https://docs.ray.io/en/master/tune/examples/tune-sklearn.html and added a ray.init call so that the Tune tasks run on a remote Ray cluster:
ray.init(
    address="ray://xxx.xxx.xxx.xxx:10001",
    runtime_env={
        "pip": ["tune-sklearn==0.4.6", "scikit-learn==1.0.2"],
    },
)
However, I got the following warning in the log:
2023-07-10 18:37:16,273 WARNING experiment_analysis.py:917 -- Failed to read the results for 6 trials:
- /home/ray/ray_results/_Trainable_2023-07-10_03-36-56/_Trainable_b19bc_00000_0_alpha=0.0001,epsilon=0.0100_2023-07-10_03-36-57
- /home/ray/ray_results/_Trainable_2023-07-10_03-36-56/_Trainable_b19bc_00001_1_alpha=0.1000,epsilon=0.0100_2023-07-10_03-36-57
- /home/ray/ray_results/_Trainable_2023-07-10_03-36-56/_Trainable_b19bc_00002_2_alpha=1,epsilon=0.0100_2023-07-10_03-36-57
- /home/ray/ray_results/_Trainable_2023-07-10_03-36-56/_Trainable_b19bc_00003_3_alpha=0.0001,epsilon=0.1000_2023-07-10_03-36-57
- /home/ray/ray_results/_Trainable_2023-07-10_03-36-56/_Trainable_b19bc_00004_4_alpha=0.1000,epsilon=0.1000_2023-07-10_03-36-57
- /home/ray/ray_results/_Trainable_2023-07-10_03-36-56/_Trainable_b19bc_00005_5_alpha=1,epsilon=0.1000_2023-07-10_03-36-57
After some triage, I found that the files under /home/ray/ray_results/_Trainable_2023-07-10_03-36-56 do exist on the remote cluster's head node. However, the code from https://docs.ray.io/en/master/tune/examples/tune-sklearn.html tries to read those result files from my local dev machine, which is bound to fail because the results of the 6 trials do not exist locally.
So the question is: why does the code try to read results that live on the remote host from the local machine? Is this a bug, or am I doing something wrong?
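For anyone who wants to reproduce the triage, here is a minimal sketch (standard library only, not part of Ray or tune-sklearn) that pulls the trial directories out of the warning text and reports which of them are missing on the machine where it runs. The helper names `trial_dirs_from_warning` and `missing_on_this_machine` are hypothetical, and the regular expression assumes the "- /absolute/path" line layout shown in the log above.

```python
import os
import re

# Assumption: each trial directory appears on its own line as "- /absolute/path",
# matching the layout of the warning shown above.
TRIAL_LINE = re.compile(r"^\s*-\s+(/\S+)\s*$")


def trial_dirs_from_warning(log_text):
    """Return the trial directory paths listed in the warning text."""
    paths = []
    for line in log_text.splitlines():
        match = TRIAL_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


def missing_on_this_machine(paths):
    """Return the paths that are not directories on the current machine."""
    return [p for p in paths if not os.path.isdir(p)]
```

Running it on the local dev machine and again on the head node should show whether the directories are absent only on the local side, which is what I observed by hand.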