Hard to say whether this falls under "bug" or "would be nice to have", but when one runs the osprey plot config.yaml command from the command line, it just dumps the entire database, and ignores that you may have set a project_name in your config such as:
I'm feeding osprey a lot of complex featurizations that are not really amenable to pipelining, so that means each one has to have its own config.yaml (unless there's a better way?). I suppose I could send each featurization to its own database instead but then that defeats the purpose of having a project_name option.
Alternatively, the plot command could just take the database as the input file and then split things out on a per-project basis. Not sure which is better.
If I have time I can look into making this happen, though I'm swamped with my projects right now and I don't know the osprey code very well. Just wanted to put this out there.
I've run into the same problem when trying out different clustering algorithms in Pipelines.
Is there a better solution than having each one on its own config file?
The dump command also does this, dumping the whole database and not just the pertinent project_name as specified in the config file.
I'm not sure how to separate the results from each run, since everything is mixed together in the json file.
Edit: Thinking about this, my contribution from a while ago, where the hyperparameters went into separate columns instead of a dedicated parameters column, might complicate things on this end.
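As a stopgap, the mixed json dump could be split after the fact. This is only a rough sketch, and it assumes things I have not verified: that the dump loads as a list of records and that each record carries a project_name field. The function name split_by_project and the sample records are made up for illustration.

```python
import json
from collections import defaultdict


def split_by_project(records, key="project_name"):
    """Group dumped trial records by their project name.

    Assumption: each record is a dict and has a `key` field.
    Records without it are collected under "<unknown>".
    """
    groups = defaultdict(list)
    for record in records:
        groups[record.get(key, "<unknown>")].append(record)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical records standing in for the contents of a json dump.
    sample = json.loads(
        '[{"project_name": "kmeans", "mean_test_score": 0.8},'
        ' {"project_name": "minibatch", "mean_test_score": 0.7},'
        ' {"project_name": "kmeans", "mean_test_score": 0.9}]'
    )
    for project, trials in split_by_project(sample).items():
        print(project, len(trials))
```

If the records turn out not to include the project name, this won't help and the filtering would have to happen inside osprey itself.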
You can also have osprey dump the results to a csv file (osprey dump -o csv > filename.csv), which is jankier, but you should be able to look through the hyperparameters by concatenating the csv files (add a \n between them) and inserting a column for the run or clusterer at the beginning of each line. You could also contribute a ClusterSelector to MSMBuilder in the spirit of the FeatureSelector, which was designed for the same purpose. See the FeatureSelector code here and an example using a pipeline here.
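The csv concatenation can be sketched in a few lines of Python with the standard library. This is a rough sketch and not part of osprey: merge_dumps, the run column name, and the sample csv text are all made up for illustration. It takes the union of the headers, so runs whose hyperparameters landed in different columns still line up, with blanks where a run has no value.

```python
import csv
import io


def merge_dumps(labelled_csvs, label_column="run"):
    """Merge several csv dumps into one, tagging each row with its run label.

    labelled_csvs maps a label (e.g. the clusterer name) to csv text.
    Returns the merged csv as a string, with label_column first.
    """
    rows = []
    fieldnames = [label_column]
    for label, text in labelled_csvs.items():
        reader = csv.DictReader(io.StringIO(text))
        # Collect the union of column names, preserving first-seen order.
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        for row in reader:
            row[label_column] = label
            rows.append(row)
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        restval="",             # blank for columns a run does not have
        extrasaction="ignore",  # skip stray fields from malformed rows
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical dumps; in practice, read each filename.csv from disk.
    dumps = {
        "kmeans": "n_clusters,mean_test_score\n100,0.8\n",
        "minibatch": "n_clusters,batch_size,mean_test_score\n50,200,0.7\n",
    }
    print(merge_dumps(dumps), end="")
```

The merged text can then be opened in a spreadsheet or loaded into whatever plotting tool you prefer and grouped by the run column.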