Implementation of (1 + lambda) #288

Open
siddn opened this issue Feb 3, 2025 · 5 comments

@siddn

siddn commented Feb 3, 2025

Does the current code base provide an implementation of (1 + lambda) or (1 + 1) CMA-ES? I know the recommended implementation is the standard (mu, lambda); however, I would like to evaluate our problem with the other two methods, as we have a higher tolerance for unideal local optima. In the current CMAOptions(), I see there are settings for parent number and elitism, but I'm unsure whether these will use the step-size adaptation described in this paper (Section 2.1, Algorithm 1, Procedure updateStepSize). Any guidance on this would be appreciated!

@nikohansen
Contributor

nikohansen commented Feb 5, 2025

Does the current code base provide an implementation of (1 + lambda) or (1 + 1) CMA-ES?

No, it does not. The closest functionality provided would be using the option {'CMA_elitist': True}. This does not change the step-size adaptation mechanism, but I am also not really convinced that it "should". It is, however, a code feature that has not been thoroughly benchmarked.

as we have a higher tolerance for unideal local optima

I don't quite understand what this means or why it implies that (1+1)-selection would be better suited. My first approach to finding a local optimum quickly would be to set a relatively small initial step-size, the third argument of cma.fmin2.

@siddn
Author

siddn commented Feb 5, 2025

Thanks for the information.

As to the second part, I can elaborate a little on our specific use case. Our problem involves testing several interactions between a user and a robotic device, and we want to find the best device parameters for a given user as defined by our objective function. Some key points:

  • Our problem is 3-dimensional, so relative to many other test problems our search space is rather small.
  • The "function evaluations" are extremely expensive, i.e. it can take as much as several minutes to evaluate each candidate solution. To this end, we are more interested in finding a "good enough" solution with minimal wall time spent optimizing than in finding a global solution. Additionally, we want the user to spend as much time in a "close enough" region as possible, without sampling too far away from our expected optimum. (In this case, I think your advice on starting with a small initial step-size seems to be the best place to start.)
  • We also think our users will adapt to the device over time, giving us a dynamic landscape for our objective function. This is primarily why we want to use CMA-ES over another optimizer.
  • I've implemented the (1+lambda) CMA-ES as best I can to test it on some offline problems, and it seems that it actually ends up being worse for dynamic problems, as the elitist solution (if we preserve its old function value and don't re-evaluate) tends to dominate and stagnate the search. Does this match your intuition given the design of the (1+lambda) implementation?
  • If the above is true, are there general recommendations for how the parameters of CMA-ES should be selected for problems with dynamic landscapes, where there is more interest in tracking a local minimum than hunting for a global minimum?

Thank you again for the replies 😄

@nikohansen
Contributor

The "function evaluations" are extremely expensive, i.e. it can take as much as several minutes to evaluate each candidate solution. To this end, we are more interested in finding a "good enough" solution with minimal wall time spent optimizing than in finding a global solution. Additionally, we want the user to spend as much time in a "close enough" region as possible, without sampling too far away from our expected optimum. (In this case, I think your advice on starting with a small initial step-size seems to be the best place to start.)

It's always a good idea to use cma.plot to display the tracked state variables. It will in particular give a good indication whether the initial step-size was chosen well.

We also think our users will adapt to the device over time, giving us some dynamic landscape for our objective function. This is primarily why we want to use CMA-ES over another optimizer.

This looks like one of the worst-case scenarios for having elitism without re-evaluations.

I've implemented the (1+lambda) CMA-ES as best I can to test it on some offline problems, and it seems that it actually ends up being worse for dynamic problems, as the elitist solution (if we preserve its old function value and don't re-evaluate) tends to dominate and stagnate the search. Does this match your intuition given the design of the (1+lambda) implementation?

I'd say so.

If the above is true, are there general recommendations for how the parameters of CMA-ES should be selected for problems with dynamic landscapes, where there is more interest in tracking a local minimum than hunting for a global minimum?

The main "issue" would be that the step-size becomes too small when either the landscape is not moving for some time, or when it is moving too fast (which can then appear as "noise"), in which case the step-size should instead increase. Using the implemented noise handling should at least address the second scenario.

If you have control over the dynamic changes, it also seems highly advisable to make changes only between CMA-ES iterations and not within one.

@siddn
Author

siddn commented Feb 21, 2025

Thank you for the responses; this has been very helpful. I did have a few more general questions, if you'd be willing. In general, I understand CMA-ES is quite good (and generally consistent) on higher-dimensional problems, and is more tolerant of potentially dynamic landscapes than a method like Bayesian optimization. However, do you have recommendations for a system with a low-dimensional search space (<10) and few function evaluations (~100-150) that would still retain CMA-ES's tolerance for dynamic landscapes? Thanks!

@nikohansen
Contributor

Generally, some tweaked versions of Nelder-Mead are very good in low dimension, in particular below five, and SLSQP (available in scipy.optimize) is very good with low budgets. How they hold up in a dynamic landscape is hard to say; Nelder-Mead probably better than SLSQP.
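Both methods are available through `scipy.optimize.minimize`. A sketch with a placeholder objective and an evaluation budget in the range mentioned above:

```python
import numpy as np
from scipy.optimize import minimize

def sphere(x):  # placeholder for the real objective
    return float(np.sum(np.asarray(x) ** 2))

x0 = np.full(3, 0.5)

# Nelder-Mead with a capped evaluation budget (~150 evaluations)
res_nm = minimize(sphere, x0, method='Nelder-Mead',
                  options={'maxfev': 150})

# SLSQP, typically very sample-efficient on smooth problems
res_slsqp = minimize(sphere, x0, method='SLSQP')
```

Note that SLSQP estimates gradients by finite differences, so each of its iterations consumes several function evaluations in addition to the line search.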
