-
Notifications
You must be signed in to change notification settings - Fork 6
Description
Currently the covariance terms have parameters and those parameters can be aggregated and tuned in optimization routines, but we are restricted to gradient free optimization routines. It'd be nice to be able to switch to something like L-BFGS.
How to best do this isn't clear but it could follow the interface for call operators. A CovarianceTerm which supports gradients would be required to include a method,
std::map<std::string, double> gradient(X &x, Y&y) const
which would return a map from parameter name to gradient in the vicinity of the current parameter value for any two supported features x and y.
Alternatively gradient could return a vector, either std:: or Eigen:: which is assumed to follow the same order as get_params_as_vector. Though subsequent concatenation of these vectors might get confusing.
Summation and other operations on CovarianceTerms would need to be defined to follow the chain rule. In order to decide if a gradient method is simply not defined, or should be assumed zero we'd have to use trait inspection with logic along the lines of "if the gradient is not defined but the () operator is then the gradient must not be provided."