Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Lexicographic optimization in Blendsearch #972

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

skzhang1
Copy link
Collaborator

@skzhang1 skzhang1 commented Apr 1, 2023

Why are these changes needed?

We plan to support lexicographic optimization in BlendSearch. This pull request is primarily intended to track the implementation progress. We encourage everyone to provide feedback in this pull request regarding both implementation suggestions and application scenarios.

TODO:

  • Implement a easier ECI calculation method.
  • Force the global search algorithm to MO-HPO in lexicographic optimization.
  • Add tests and docs.

Related issue number

N/A

Checks

@skzhang1 skzhang1 requested review from sonichi and qingyun-wu April 1, 2023 23:19
@skzhang1 skzhang1 self-assigned this Apr 1, 2023
@skzhang1 skzhang1 changed the title Support Lexicographic optimization in Blendsearch [Draft] Support Lexicographic optimization in Blendsearch Apr 7, 2023
@skzhang1 skzhang1 marked this pull request as draft April 7, 2023 01:25
@skzhang1 skzhang1 changed the title [Draft] Support Lexicographic optimization in Blendsearch Support Lexicographic optimization in Blendsearch Apr 7, 2023
@sonichi
Copy link
Contributor

sonichi commented Jun 12, 2023

Is it better to create a new subclass than modifying the original BlendSearch? It's an open question.

@skzhang1
Copy link
Collaborator Author

Is it better to create a new subclass than modifying the original BlendSearch? It's an open question.

I think we do not need to do it for now. Because LexiFlow is also not independent of flow2 in the current version.

Copy link
Contributor

@sonichi sonichi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR involves many big changes. There are many changes I don't understand. Could you summarize the changes first? Please also justify why a particular change should be done in that place. We do not want to add unnecessary changes or break the original functionality.
Later, I'll also request adding comments for all the non-obvious changes. Let's get a high-level understanding first.

metric += self.lagrange
if self.lexico_objectives:
self._metric_constraints = None
logger.info("Do not support providing metric_constraints in lexicographic optimization for now.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Raise an error? Metric constraints should be provided via a target.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

Comment on lines +495 to +499
assert "lexico_algorithm" in lexico_objectives and lexico_objectives["lexico_algorithm"] in [
"CFO",
"BlendSearch",
], 'When performing lexicographic optimization, "lexico_algorithm" should be provided in lexicographic objectives (CFO or BlendSearch).'
SearchAlgorithm = CFO if lexico_objectives["lexico_algorithm"] == "CFO" else BlendSearch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is strange. Redundant with search_alg.

Copy link
Collaborator Author

@skzhang1 skzhang1 Jun 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is simpler for users to specify algorithm in lexico_objectives. Because search_alg in tune represents algorithm instance rather than string. We have also not tried to use tune by passing instance of CFO before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can support str type search_alg in tune to make it easier. But don't create a new argument which is confusing.
AutoML uses tune by passing instance of CFO.

@@ -739,3 +921,100 @@ def resolve_value(domain: Domain) -> ot.distributions.BaseDistribution:
values = {"/".join(path): resolve_value(domain) for path, domain in domain_vars}

return values


class LexiGlobalSearch(OptunaSearch):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why adding this class? None of the method looks necessary. The update of fbest etc. can happen in SearchThread.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

Comment on lines 781 to 785
if obj_1 > obj_2:
return True
else:
return False

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplify

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

Comment on lines 795 to 799
n = len(arr)
for i in range(n):
for j in range(0, n - i - 1):
if self._lexico_(arr[j], arr[j + 1]):
arr[j], arr[j + 1] = arr[j + 1], arr[j]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should be a more standard way for sorting in python

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

@@ -748,6 +910,10 @@ def suggest(self, trial_id: str) -> Optional[Dict]:
self._subspace[trial_id] = space
else: # use init config
if self._candidate_start_points is not None and self._points_to_evaluate:
if self.lexico_objectives:
raise NotImplementedError(
"It doesn't support providing points_to_evaluate in lexicographic optimization for now."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

revise English.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

self._result[config_signature][k_metric] = {
self._metric: np.inf * -1 if k_mode == "max" else np.inf
}
self._result[config_signature]["time_total_s"] = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

move out of the loop

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

@@ -312,6 +314,24 @@ def denormalize(self, config):
"""denormalize each dimension in config from [0,1]."""
return denormalize(config, self._space, self.best_config, self.incumbent, self._random)

def _get_lexico_bound(self, metric, mode):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same function appears in multiple places, which is not a good design.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

)
else:
self.speed = 0
elif self._search_alg._histories:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't access private fields

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a test for BlendSearch

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved. I added a simple test function for now

@@ -103,6 +103,7 @@ def evaluate_function(configuration):

lexico_objectives = {}
lexico_objectives["metrics"] = ["error_rate", "flops"]
lexico_objectives["algorithm"] = ["CFO"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a test for BlendSearch as well.

this functionality.

Tune automatically converts search spaces to Optuna's format:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why such a significant change on docstr?

points_to_evaluate: Optional[List[Dict]] = None,
sampler: Optional["BaseSampler"] = None,
seed: Optional[int] = None,
evaluated_rewards: Optional[List] = None,
):
assert ot is not None, "Optuna must be installed! Run `pip install optuna`."
super(OptunaSearch, self).__init__(metric=metric, mode=mode, max_concurrent=None, use_early_stopped_trials=None)
super(OptunaSearch, self).__init__(metric=metric, mode=mode)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing max_concurrent=None, use_early_stopped_trials=None?

@@ -500,7 +676,7 @@ def _setup_study(self, mode: str):
for point in self._points_to_evaluate:
self._ot_study.enqueue_trial(point)

def set_search_properties(self, metric: Optional[str], mode: Optional[str], config: Dict) -> bool:
def set_search_properties(self, metric: Optional[str], mode: Optional[str], config: Dict, **spec) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how spec is used in this func?

@@ -553,21 +729,8 @@ def suggest(self, trial_id: str) -> Optional[Dict]:
raise RuntimeError(
UNDEFINED_METRIC_MODE.format(cls=self.__class__.__name__, metric=self._metric, mode=self._mode)
)

if isinstance(self._space, list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why removing this case?

# Optuna doesn't support incremental results
# for multi-objective optimization
return
if trial_id in self._completed_trials:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why/when this happen?

metric = result[self.metric]
step = result[TRAINING_ITERATION]
ot_trial = self._ot_trials[trial_id]
ot_trial.report(metric, step)

def on_trial_complete(self, trial_id: str, result: Optional[Dict] = None, error: bool = False):
if trial_id in self._completed_trials:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment with the question in on_trial_result

@@ -641,34 +834,23 @@ def add_evaluated_point(
self._ot_study.add_trial(trial)

def save(self, checkpoint_path: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change is needed?

self._points_to_evaluate,
self._evaluated_rewards,
)
save_object = self.__dict__.copy()
with open(checkpoint_path, "wb") as outputFile:
pickle.dump(save_object, outputFile)

def restore(self, checkpoint_path: str):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change is needed?

@@ -26,7 +28,7 @@
from .flow2 import FLOW2
from ..space import add_cost_to_space, indexof, normalize, define_by_run_func
from ..result import TIME_TOTAL_S

from ..utils import get_lexico_bound
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_lexico_bound could be included in this file as a staticmethod of BlendSearch

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants