diff --git a/Changelog.md b/Changelog.md index 1b7ba778c6..8561cd9c66 100644 --- a/Changelog.md +++ b/Changelog.md @@ -6,6 +6,34 @@ The file was started with Version `0.4`. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.6.0] unreleased + +This is a breaking change since the JuMP extension is dropped. + +### Added + +* `nonpositive_curvature_behavior` for `QuasiNewtonLimitedMemoryDirectionUpdate` that determines how (y, s) vector pairs are treated after transport; if their inner product gets too low, it may lead to non-positive-definite Hessians, which need to be avoided. This resolves issue #549. (#554) +* `GeneralizedCauchyDirectionSubsolver` for handling direction selection in the presence of box (`Hyperrectangle`) constraints in quasi-Newton methods. This allows for L-BFGS-B-style box constraint handling. (#554) +* New stopping criteria: `StopWhenRelativeAPosterioriCostChangeLessOrEqual` and `StopWhenProjectedNegativeGradientNormLess`. (#554) +* `HagerZhangLinesearch` stepsize, a state-of-the-art line search for smooth objectives with cubic interpolation and adaptive Wolfe condition checking. (#554) +* Stopping criteria can now be initialized using `initialize_stepsize!`, similar to solvers. (#554) + +### Fixed + +* Fixed `show` methods of various states and stopping criteria to properly handle both `repr` and multiline printing (#569) +* Unified all `show` methods and their human-readable analogues `status_summary` throughout the package (#569) +* Fixed some text descriptions of a few stopping criteria. +* unify naming of fields, `debugDictionary` of the debug state is now called `debug_dictionary` +* the `NesterovRule` now also stores an actual `AbstractRetractionMethod` instead of implicitly always using the default one. +* Line searches now consistently respect the `stop_when_stepsize_exceeds` keyword argument as a hard limit. 
(#554) +* `StopWhenChangeLess` falsely claimed to indicate convergence. This is now fixed. (#554) + +### Removed + +* The extension to JuMP. A replacement as a separate package is planned when the support for variables beyond vectors is more accessible in JuMP. +* the plotting functions to `Asymptote`. They can now be found in the separate package [`ManifoldAsymptote.jl`]() + this way, `Manopt.jl` has fewer dependencies; in particular, the color and colorschemes dependencies are dropped + ## [0.5.37] May 5, 2026 ### Changed @@ -28,6 +56,27 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed +* The default restart rule for `conjugate_gradient_descent` is now `RestartOnNonDescent` instead of `NeverRestart`, which makes the algorithm more robust to non-convexity and numerical issues. The old default can still be used by explicitly passing `restart_condition=NeverRestart()`. (#604) +* `HagerZhangCoefficientRule` now has a safeguard against the denominator being too close to zero (the `denom_threshold` field). By default it is set to 1.0e-10. You can set it to a lower positive value (or even zero) to weaken the safeguard, but it is recommended to keep it to avoid numerical issues. (#604) +* introduce for all `Rule`s a variant that is not encapsulated in a memory, where the old values have to be passed as keywords. This is now used by the `ConjugateGradientBealeRestartRule` when evaluating its inner rule. (#604) + +## [0.5.36] April 24, 2026 + +### Added + +* a function `stopped_at(state)` to access the number of iterations it took a solver to stop. (#599) + +### Fixed + +* a small bug where `get_count(sc::StopWhenAny, Val(:Iteration))` wrongly reported it stopped before the first iteration when it actually had not yet stopped. (#599) + +## [0.5.35] April 16, 2026 + +### Changed + +* `NonlinearLeastSquaresObjective` is now called `ManifoldNonlinearLeastSquaresObjective` (#569). 
+* This is a breaking release in order to move a few parts to a unified naming and since we +discontinue the `JuMP` extension. (#532) * Improved formatting of the references in the Readme.md (#586) * Bump compat for RecursiveArrayTools.jl to include version 4 * deactivate CompatHelper Action and solely use dependabot @@ -94,7 +143,7 @@ Moved the documentation glossaries to using the new [Glossaries.jl](https://gith * Removed `atol` from `DebugFeasibility` and instead use the one newly added `atol` from the `ConstrainedManifoldObjective`. (#546) * Move from CompatHelper to dependabot to keep track of dependency updates in Julia packages. (#547) * moved the `ManoptTestSuite` module to a sub module `Manopt.Test` within `Manopt.jl`, -so it can be easier reused by others as well (#550) + so it can be easier reused by others as well (#550) * moved to using a `Project.toml` for tests and an overall `[Workspace]`. This also allows finally to run single test files without installing all packages manually, but instead just switching to and instantiating the test environment. (#550) * for compatibility, state also `[source]` entries consistently in the sub `Project.toml` files. 
(#550) diff --git a/Project.toml b/Project.toml index b2c5a1b4a1..ccbbf04823 100644 --- a/Project.toml +++ b/Project.toml @@ -1,15 +1,11 @@ name = "Manopt" uuid = "0fc0a36d-df90-57f3-8f93-d78a9fc72bb5" version = "0.5.37" -authors = [{family-names = "Bergmann", given-names = "Ronny", alias = "kellertuer", city = "Trondheim", affiliation = "Norwegian University of Science and Technology", country = "NO", email = "manopt@ronnybergmann.net", orcid = "https://orcid.org/0000-0001-8342-7218", website = "https://ronnybergmann.net"}] [workspace] projects = ["test", "docs", "tutorials"] [deps] -ColorSchemes = "35d6a980-a343-548e-a6ea-1d62b119f2f4" -ColorTypes = "3da002f7-5984-5a60-b8a6-cbb66c0b333f" -Colors = "5ae59095-9a9b-59fe-a467-6f913c188581" DataStructures = "864edb3b-99cc-5e75-8d2d-829cb0a9cfe8" Dates = "ade2ca70-3891-5945-98fb-dc099432e06a" Glossaries = "8f48dd54-e453-4cdc-9500-53b96149560b" @@ -34,7 +30,6 @@ RecursiveArrayTools = "731186ca-8d62-57ce-b412-fbd966d074cd" RipQP = "1e40b3f8-35eb-4cd8-8edd-3e515bb9de08" [extensions] -ManoptJuMPExt = "JuMP" ManoptLRUCacheExt = "LRUCache" ManoptLineSearchesExt = "LineSearches" ManoptManifoldsExt = "Manifolds" @@ -42,9 +37,6 @@ ManoptRecursiveArrayToolsExt = "RecursiveArrayTools" ManoptRipQPQuadraticModelsExt = ["RipQP", "QuadraticModels"] [compat] -ColorSchemes = "3.5.0" -ColorTypes = "0.9.1, 0.10, 0.11, 0.12" -Colors = "0.11.2, 0.12, 0.13" DataStructures = "0.17, 0.18, 0.19" Dates = "1.10" Glossaries = "0.1.1" @@ -66,3 +58,14 @@ RipQP = "0.6.4, 0.7" SparseArrays = "1.10" Statistics = "1.10" julia = "1.10" + +[[authors]] +affiliation = "Norwegian University of Science and Technology" +alias = "kellertuer" +city = "Trondheim" +country = "NO" +email = "manopt@ronnybergmann.net" +family-names = "Bergmann" +given-names = "Ronny" +orcid = "https://orcid.org/0000-0001-8342-7218" +website = "https://ronnybergmann.net" diff --git a/_typos.toml b/_typos.toml index 25a7a96777..48761dd9be 100644 --- a/_typos.toml +++ b/_typos.toml @@ 
-3,7 +3,7 @@ methodes = "methodes" # french Serie = "Serie" # french sur = "sur" # french cmo = "cmo" # often used abbreviation for constrained manifold objective - +nd = "nd" # like in 2nd [files] extend-exclude = [ "tutorials/*.html", diff --git a/docs/Project.toml b/docs/Project.toml index 89e24413de..c1924c0d74 100644 --- a/docs/Project.toml +++ b/docs/Project.toml @@ -9,7 +9,6 @@ DocumenterInterLinks = "d12716ef-a0f6-4df4-a9f1-a5a34e75c656" FiniteDifferences = "26cc04aa-876d-5657-8c51-4c34ba976000" Images = "916415d5-f1e6-5110-898d-aaa5f9f070e0" JLD2 = "033835bb-8acc-5ee8-8aae-3f567f8a3819" -JuMP = "4076af6c-e467-56ae-b986-b466b2749572" LRUCache = "8ac3fa9e-de4c-5943-b1dc-09c6b5f20637" LineSearches = "d3d80556-e9d4-5f37-9878-2ab0fcc64255" LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" @@ -23,8 +22,8 @@ Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c" RecursiveArrayTools = "731186ca-8d62-57ce-b412-fbd966d074cd" RipQP = "1e40b3f8-35eb-4cd8-8edd-3e515bb9de08" -[sources.Manopt] -path = ".." 
+[sources] +Manopt = {path = ".."} [compat] BenchmarkTools = "1.3" @@ -37,7 +36,6 @@ DocumenterInterLinks = "0.3, 1" FiniteDifferences = "0.12" Images = "0.26" JLD2 = "0.4, 0.5, 0.6" -JuMP = "1" LRUCache = "1" LineSearches = "7" Literate = "2" diff --git a/docs/make.jl b/docs/make.jl index c14c32e6a8..8ca3d1be3e 100755 --- a/docs/make.jl +++ b/docs/make.jl @@ -45,6 +45,7 @@ tutorials_menu = "Implement a solver" => "tutorials/ImplementASolver.md", "Optimize on your own manifold" => "tutorials/ImplementOwnManifold.md", "Do constrained optimization" => "tutorials/ConstrainedOptimization.md", + "Do optimization with bounds" => "tutorials/BoxDomain.md", ] # Check whether all tutorials are rendered, issue a warning if not (and quarto if not set) all_tutorials_exist = true @@ -101,7 +102,7 @@ end # (c) load necessary packages for the docs using Documenter using DocumenterCitations, DocumenterInterLinks -using JuMP, LineSearches, LRUCache, Manopt, Manifolds, Plots, RecursiveArrayTools +using LineSearches, LRUCache, Manopt, Manifolds, Plots, RecursiveArrayTools using RipQP, QuadraticModels # (d) add contributing.md and changelog.md to the docs – and link to releases and issues @@ -162,7 +163,6 @@ makedocs(; ), modules = [ Manopt, - Base.get_extension(Manopt, :ManoptJuMPExt), Base.get_extension(Manopt, :ManoptLineSearchesExt), Base.get_extension(Manopt, :ManoptLRUCacheExt), Base.get_extension(Manopt, :ManoptManifoldsExt), @@ -189,6 +189,7 @@ makedocs(; "Douglas—Rachford" => "solvers/DouglasRachford.md", "Exact Penalty Method" => "solvers/exact_penalty_method.md", "Frank-Wolfe" => "solvers/FrankWolfe.md", + "Generalized Cauchy direction subsolver" => "solvers/generalized_cauchy_direction_subsolver.md", "Gradient Descent" => "solvers/gradient_descent.md", "Interior Point Newton" => "solvers/interior_point_Newton.md", "Levenberg–Marquardt" => "solvers/LevenbergMarquardt.md", @@ -218,7 +219,6 @@ makedocs(; ], "Helpers" => [ "Checks" => "helpers/checks.md", - "Exports" => 
"helpers/exports.md", "Test" => "helpers/test.md", ], "Contributing to Manopt.jl" => "contributing.md", diff --git a/docs/src/about.md b/docs/src/about.md index 44a4c46403..4ff1057d78 100644 --- a/docs/src/about.md +++ b/docs/src/about.md @@ -15,6 +15,7 @@ Thanks to the following contributors to `Manopt.jl`: * Mathias Ravn Munkvold contributed most of the implementation of the [Adaptive Regularization with Cubics](solvers/adaptive-regularization-with-cubics.md) solver as well as its [Lanczos](@ref arc-Lanczos) subsolver * [Sander Engen Oddsen](https://github.com/oddsen) contributed to the implementation of the [LTMADS](solvers/mesh_adaptive_direct_search.md) solver. * [Jonas Püschel](https://www.uni-augsburg.de/de/fakultaet/mntf/math/prof/numa/team/jonas-pueschel/) contributed [restart rules for the conjugate gradient solver](@ref cg-restart). +* [Patryk Przybysz](https://www.linkedin.com/in/patryk-przybysz-5644aa1a1/) contributed to the [Generalized Cauchy Direction](solvers/generalized_cauchy_direction_subsolver.md). * [Tom-Christian Riemer](https://www.tu-chemnitz.de/mathematik/wire/mitarbeiter.php) implemented the [trust regions](solvers/trust_regions.md) and [quasi Newton](solvers/quasi_Newton.md) solvers as well as the [truncated conjugate gradient descent](solvers/truncated_conjugate_gradient_descent.md) subsolver. * [Markus A. Stokkenes](https://www.linkedin.com/in/markus-a-stokkenes-b41bba17b/) contributed most of the implementation of the [Interior Point Newton Method](solvers/interior_point_Newton.md) as well as its default [Conjugate Residual](solvers/conjugate_residual.md) subsolver * [Laura Weigl](https://num.math.uni-bayreuth.de/en/team/laura-weigl/index.php) implemented the [Vector bundle Newton Method](solvers/vectorbundle_newton.md). 
diff --git a/docs/src/extensions.md b/docs/src/extensions.md index 0f95f70977..4a076cea73 100644 --- a/docs/src/extensions.md +++ b/docs/src/extensions.md @@ -45,10 +45,12 @@ x_opt = quasi_Newton( ) ``` -In general this defines the following new [stepsize](@ref Stepsize) +In general this defines the following new [stepsize](@ref Stepsize) with helper functions for setting and getting the maximum step size: ```@docs Manopt.LineSearchesStepsize +Manopt.linesearches_get_max_alpha +Manopt.linesearches_set_max_alpha ``` ## Manifolds.jl @@ -69,52 +71,3 @@ Euclidean space when needed as Manopt.Rn Manopt.Rn_default ``` - -## [JuMP.jl](@extref JuMP :std:doc:`index`) - -Manopt can be used from within [`JuMP.jl`](@extref JuMP :std:doc:`index`). -The manifold is provided in the `@variable` macro. Note that until now, -only variables (points on manifolds) are supported, that are arrays, especially structs do not yet work. -The algebraic expression of the objective function is specified in the `@objective` macro. -The `descent_state_type` attribute specifies the solver. - -```julia -using JuMP, Manopt, Manifolds -model = Model(Manopt.JuMP_Optimizer) -# Change the solver with this option, `GradientDescentState` is the default -set_attribute(model, "descent_state_type", GradientDescentState) -@variable(model, U[1:2, 1:2] in Stiefel(2, 2), start = 1.0) -@objective(model, Min, sum((A - U) .^ 2)) -optimize!(model) -solution_summary(model) -``` - -Several functions from the [Mathematical Optimization Interface (MOI)](@extref JuMP :std:label:`The-MOI-interface`) are -extended when both `Manopt.jl` and [`JuMP.jl`](@extref JuMP :std:doc:`index`) are loaded: - -```@docs -Manopt.JuMP_Optimizer -``` - -### Internal functions - -```@docs -JuMP.build_variable -MOI.add_constrained_variables -MOI.copy_to -MOI.empty! 
-MOI.dimension -MOI.supports_add_constrained_variables -MOI.get -MOI.is_valid -MOI.supports -MOI.supports_incremental_interface -MOI.set -``` - -### Internal wrappers and their functions - -```@autodocs -Modules = [Base.get_extension(Manopt, :ManoptJuMPExt)] -Order = [:type, :function] -``` diff --git a/docs/src/helpers/exports.md b/docs/src/helpers/exports.md deleted file mode 100644 index b0ed2223aa..0000000000 --- a/docs/src/helpers/exports.md +++ /dev/null @@ -1,14 +0,0 @@ -# [Exports](@id sec-exports) - -Exports aim to provide a consistent generation of images of your results. For example if you [record](@ref subsec-record-states) the trace your algorithm walks on the [Sphere](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/sphere.html), you can easily export this trace to a rendered image using [`asymptote_export_S2_signals`](@ref) and render the result with [Asymptote](https://sourceforge.net/projects/asymptote/). -Despite these, you can always [record](@ref subsec-record-states) values during your iterations, -and export these, for example to `csv`. - -## Asymptote - -The following functions provide exports both in graphics and/or raw data using [Asymptote](https://sourceforge.net/projects/asymptote/). - -```@autodocs -Modules = [Manopt] -Pages = ["Asymptote.jl"] -``` diff --git a/docs/src/index.md b/docs/src/index.md index b0cbf14fe4..73a201c7e8 100644 --- a/docs/src/index.md +++ b/docs/src/index.md @@ -93,7 +93,7 @@ The notation in the documentation aims to follow the same [notation](https://jul ### Visualization -To visualize and interpret results, `Manopt.jl` aims to provide both easy plot functions as well as [exports](helpers/exports.md). 
Furthermore a system to get [debug](plans/debug.md) during the iterations of an algorithms as well as [record](plans/record.md) capabilities, for example to record a specified tuple of values per iteration, most prominently [`RecordCost`](@ref) and +To visualize and interpret results, `Manopt.jl` provides a system to get [debug](plans/debug.md) output during the iterations of an algorithm as well as [record](plans/record.md) capabilities, for example to record a specified tuple of values per iteration, most prominently [`RecordCost`](@ref) and [`RecordIterate`](@ref). Take a look at the [🏔️ Get started with Manopt.jl](tutorials/getstarted.md) tutorial on how to easily activate this. ## Literature diff --git a/docs/src/plans/index.md b/docs/src/plans/index.md index 40e31b0186..8ba9029492 100644 --- a/docs/src/plans/index.md +++ b/docs/src/plans/index.md @@ -28,7 +28,7 @@ The following symbols are used. | `:Activity` | [`DebugWhenActive`](@ref) | activity of the debug action stored within | | `:Basepoint` | [`TangentSpace`](@extref ManifoldsBase `ManifoldsBase.TangentSpace`) | the point the tangent space is at | | `:Cost` | generic |the cost function (within an objective, as pass down) | -| `:Debug` | [`DebugSolverState`](@ref) | the stored `debugDictionary` | +| `:Debug` | [`DebugSolverState`](@ref) | the stored `debug_dictionary` | | `:Gradient` | generic | the gradient function (within an objective, as pass down) | | `:Iterate` | generic | the (current) iterate, similar to [`set_iterate!`](@ref), within a state | | `:Manifold` | generic |the manifold (within a problem, as pass down) | diff --git a/docs/src/plans/objective.md b/docs/src/plans/objective.md index e4a8e2283f..fd87e1a247 100644 --- a/docs/src/plans/objective.md +++ b/docs/src/plans/objective.md @@ -115,7 +115,7 @@ AbstractManifoldFirstOrderObjective ManifoldFirstOrderObjective ManifoldAlternatingGradientObjective ManifoldStochasticGradientObjective -NonlinearLeastSquaresObjective 
+ManifoldNonlinearLeastSquaresObjective ``` While the [`ManifoldFirstOrderObjective`](@ref) allows to provide different diff --git a/docs/src/references.bib b/docs/src/references.bib index e93d4b007f..c93389e940 100644 --- a/docs/src/references.bib +++ b/docs/src/references.bib @@ -87,6 +87,16 @@ @article{BacakBergmannSteidlWeinmann:2016 VOLUME = {38}, } +@misc{BaranBergmannPrzybysz:2026, + title = {A {Riemannian} quasi-{Newton} algorithm for optimization with {Euclidean} bounds}, + doi = {10.48550/arXiv.2605.10573}, + publisher = {arXiv}, + author = {Baran, Mateusz and Bergmann, Ronny and Przybysz, Patryk}, + month = may, + year = {2026}, + note = {arXiv:2605.10573 [math.OC]}, +} + @inproceedings{Beale:1972, ADDRESS = {London}, AUTHOR = {Beale, E. M. L.}, @@ -260,6 +270,33 @@ @book{Boumal:2023 ISBN = {978-1-00-916616-4} } +@article{ByrdNocedalSchnabel:1994, + title = {Representations of quasi-{Newton} matrices and their use in limited memory methods}, + volume = {63}, + issn = {1436-4646}, + doi = {10.1007/BF01582063}, + number = {1}, + journal = {Mathematical Programming}, + author = {Byrd, Richard H. and Nocedal, Jorge and Schnabel, Robert B.}, + month = jan, + year = {1994}, + pages = {129--156}, +} + +@article{ByrdLuNocedalZhu:1995, + title = {A {Limited} {Memory} {Algorithm} for {Bound} {Constrained} {Optimization}}, + volume = {16}, + issn = {1064-8275}, + doi = {10.1137/0916069}, + number = {5}, + journal = {SIAM Journal on Scientific Computing}, + author = {Byrd, Richard H. 
and Lu, Peihuang and Nocedal, Jorge and Zhu, Ciyou}, + month = sep, + year = {1995}, + note = {Publisher: Society for Industrial and Applied Mathematics}, + pages = {1190--1208}, +} + % --- C % % @@ -882,4 +919,17 @@ @article{ZhangSra:2018 TITLE = {Towards Riemannian accelerated gradient methods}, URL = {https://arxiv.org/abs/1806.02812}, YEAR = {2018}, -} \ No newline at end of file +} + +@article{ZhuByrdLuNocedal:1997, + title = {Algorithm 778: {L}-{BFGS}-{B}: {Fortran} subroutines for large-scale bound-constrained optimization}, + volume = {23}, + issn = {0098-3500}, + doi = {10.1145/279232.279236}, + number = {4}, + journal = {ACM Trans. Math. Softw.}, + author = {Zhu, Ciyou and Byrd, Richard H. and Lu, Peihuang and Nocedal, Jorge}, + month = dec, + year = {1997}, + pages = {550--560}, +} diff --git a/docs/src/solvers/generalized_cauchy_direction_subsolver.md b/docs/src/solvers/generalized_cauchy_direction_subsolver.md new file mode 100644 index 0000000000..73aaf780ad --- /dev/null +++ b/docs/src/solvers/generalized_cauchy_direction_subsolver.md @@ -0,0 +1,77 @@ +# Generalized Cauchy direction subsolver + +The generalized Cauchy direction (GCD) subsolver is a component in optimization algorithms that handle problems with bound constraints [BaranBergmannPrzybysz:2026](@cite). 
It solves the following problem + +```math +\begin{align*} +\operatorname*{arg\,min}_{Y ∈ T_p D \times \mathcal{M}}&\ m_p(Y), \qquad m_p(Y) = ⟨X_g, Y⟩_p + \frac{1}{2} ⟨\mathcal{H}_p[Y], Y⟩_p\\ +\text{such that}& \ \exp_p(Y) = \exp_p(\alpha X) \in D \times \mathcal{M} \text{ for some } \alpha \in [0, A] +\end{align*} +``` + +where $X=(X_{\mathrm{D}}, X_{\mathcal{M}})$ is a given direction, the exponential map handles projection of the tangent vector when reaching the boundary, $D$ is a box domain ([`Hyperrectangle`](@extref Manifolds.Hyperrectangle)), $\mathcal{M}$ is a Riemannian manifold, $X_g$ is the gradient of a scalar function $f$ at point $p=(p_{\mathrm{D}}, p_{\mathcal{M}})$, $A$ is the maximum allowed step size on $\mathcal{M}$ at point $p=(p_{\mathrm{D}}, p_{\mathcal{M}})$ in direction $X_{\mathcal{M}}$ (infinity is supported) and $\mathcal{H}_p$ is a linear operator that approximates the Hessian of $f$ at $p$. + +Additionally, the subsolver indicates whether the selected direction $Y$ reaches the boundary of $D$ at some point, in which case the subsequent step size selection in direction $Y$ needs to be limited to the interval $[0, s_{\max}]$, where the number $1 ≤ s_{\max} ≤ ∞$ is also returned by the subsolver. +Note that the value $s_{\max}=1$ is obtained when the minimum lies at the boundary of $D$, while larger values indicate that we are further away from the boundary along the selected direction $X$. + +The solver is currently primarily intended for internal use by optimization algorithms that require bound-constrained subproblem solutions. + +## Simple stepsize limiting + +In case there is no Hessian approximation available, a simple stepsize limiting procedure can be used to limit the stepsize in direction $X$ to the maximum allowed by the boundary of $D$ and the maximum allowed stepsize on $\mathcal{M}$. 
+This procedure is available using the following: + +```@docs +Manopt.MaxStepsizeInDirectionSubsolver +Manopt.find_max_stepsize_in_direction +``` + +## Internal types and methods + +### Symbols related to the GCD computation + +These symbols are directly used by solvers to compute the descent direction corresponding to the Generalized Cauchy direction. + +```@docs +Manopt.has_anisotropic_max_stepsize +Manopt.find_generalized_cauchy_direction! +Manopt.GeneralizedCauchyDirectionSubsolver +``` + +### Symbols related to the Hessian approximation + +These symbols are used to evaluate the Hessian approximation at specific tangent vectors during the generalized Cauchy direction computation. + +```@docs +Manopt.hessian_value +Manopt.hessian_value_diag +``` + +### Symbols related to bound handling + +These are internal symbols used to manage and manipulate bound constraints during the GCD computation. + +```@docs +Manopt.init_updater! +Manopt.UnitVector +Manopt.to_coordinate_index +Manopt.AbstractSegmentHessianUpdater +Manopt.GenericSegmentHessianUpdater +Manopt.get_bounds_index +Manopt.get_stepsize_bound +Manopt.get_at_bound_index +Manopt.set_stepsize_bound! +Manopt.set_zero_at_index! +``` + +### Symbols related to specific Hessian approximations + +```@docs +Manopt.LimitedMemorySegmentHessianUpdater +Manopt.hessian_value_from_inner_products +Manopt.update_current_scale! +``` + +```@bibliography +Canonical=false +``` diff --git a/docs/src/solvers/quasi_Newton.md b/docs/src/solvers/quasi_Newton.md index 02c74365a5..2d608a34db 100644 --- a/docs/src/solvers/quasi_Newton.md +++ b/docs/src/solvers/quasi_Newton.md @@ -88,6 +88,7 @@ QuasiNewtonLimitedMemoryDirectionUpdate QuasiNewtonCautiousDirectionUpdate Manopt.initialize_update! 
QuasiNewtonPreconditioner +QuasiNewtonLimitedMemoryBoxDirectionUpdate ``` ## Hessian update rules diff --git a/ext/ManoptJuMPExt.jl b/ext/ManoptJuMPExt.jl deleted file mode 100644 index 991c5cb1d7..0000000000 --- a/ext/ManoptJuMPExt.jl +++ /dev/null @@ -1,705 +0,0 @@ -module ManoptJuMPExt - -using Manopt -using LinearAlgebra -using JuMP: JuMP -using ManifoldsBase -using ManifoldDiff -const MOI = JuMP.MOI - -""" - ManoptOptimizer <: MOI.AbstractOptimizer - -Represent a solver from `Manopt.jl` within the [`MathOptInterface` (MOI)](@extref JuMP :std:label:`The-MOI-interface`) framework of [`JuMP.jl`](@extref JuMP :std:doc:`index`) - -# Fields -* `problem::`[`AbstractManoptProblem`](@ref) a problem in manopt, especially - containing the manifold and the objective function. It can be constructed as soon as - the manifold and the objective are present. -* `manifold::`[`AbstractManifold`](@extref `ManifoldsBase.AbstractManifold`) the manifold on which the optimization is performed. -* `objective::`[`AbstractManifoldObjective`](@ref) the objective function to be optimized. -* `state::`[`AbstractManoptSolverState`](@ref) the state specifying the solver to use. -* `variable_primal_start::Vector{Union{Nothing,Float64}}` starting value for the solver, - in a vectorized form that [`JuMP.jl`](@extref JuMP :std:doc:`index`) requires. -* `sense::`[`MOI.OptimizationSense`](@extref JuMP :jl:type:`MathOptInterface.OptimizationSense`) the sense of optimization, - currently only minimization and maximization are supported. -* `options::Dict{String,Any}`: parameters specifying a solver before the `state` - is initialized, so especially which [`AbstractManoptSolverState`](@ref) to use, - when setting up the `state. -All types in brackets can also be `Nothing`, indicating they were not yet initialized. 
-""" -mutable struct ManoptOptimizer <: MOI.AbstractOptimizer - problem::Union{Nothing, Manopt.AbstractManoptProblem} - manifold::Union{Nothing, ManifoldsBase.AbstractManifold} - objective::Union{Nothing, Manopt.AbstractManifoldObjective} - state::Union{Nothing, Manopt.AbstractManoptSolverState} - # Does this make sense to be elementwise Nothing? On a manifold a partial init is not possible - variable_primal_start::Vector{Union{Nothing, Float64}} - sense::MOI.OptimizationSense - options::Dict{String, Any} - function ManoptOptimizer() - return new( - nothing, - nothing, - nothing, - nothing, - Union{Nothing, Float64}[], - MOI.FEASIBILITY_SENSE, - Dict{String, Any}(DESCENT_STATE_TYPE => Manopt.GradientDescentState), - ) - end -end -""" - Manopt.JuMP_Optimizer() - -Represent a solver from `Manopt.jl` within the [`MathOptInterface` (MOI)](@extref JuMP :std:label:`The-MOI-interface`) framework. -See [`ManoptOptimizer`](@ref) for the fields and their meaning. -""" -function Manopt.JuMP_Optimizer(args...) - return ManoptOptimizer(args...) -end - -""" - ManifoldSet{M<:ManifoldsBase.AbstractManifold} <: MOI.AbstractVectorSet - -Model a manifold from [`ManifoldsBase.jl`](@extref) as a vectorial set in the -[`MathOptInterface` (MOI)](@extref JuMP :std:label:`The-MOI-interface`). -This is a slight misuse of notation, since the manifold itself might not be embedded, -but just be parametrized in a certain way. - -# Fields - -* `manifold::M`: The manifold in which the variables are constrained to lie. - This is a [`ManifoldsBase.AbstractManifold`](@extref) object. -""" -struct ManifoldSet{M <: ManifoldsBase.AbstractManifold} <: MOI.AbstractVectorSet - manifold::M -end - -""" - MOI.dimension(set::ManifoldSet) - -Return the representation size of points on the (vectorized in representation) manifold. 
-As the MOI variables are real, this means if the [`representation_size`](@extref `ManifoldsBase.representation_size-Tuple{AbstractManifold}`) -yields (in product) `n`, this refers to the vectorized point / tangent vector from (a subset of ``ℝ^n``). - -Note that this is not the dimension of the manifold itself, but the -vector length of the vectorized representation of the manifold. -""" -function MOI.dimension(set::ManifoldSet) - return length(_shape(set.manifold)) -end - -@doc """ - RiemannianFunction{MO<:Manopt.AbstractManifoldObjective} <: MOI.AbstractScalarFunction -A wrapper for a [`AbstractManifoldObjective`](@ref) that can be used -as a [`MOI.AbstractScalarFunction`](@extref JuMP :jl:type:`MathOptInterface.AbstractScalarFunction`). - - - -# Fields -* `func::MO`: The [`AbstractManifoldObjective`](@ref) function to be wrapped. -""" -struct RiemannianFunction{MO <: Manopt.AbstractManifoldObjective} <: - MOI.AbstractScalarFunction - func::MO -end - -@doc """ - JuMP.jump_function_type(::JuMP.AbstractModel, F::Type{<:RiemannianFunction}) - -The [`JuMP.jl`](@extref JuMP :std:doc:`index`) function type of a function of type [`RiemannianFunction`](@ref) for any [`AbstractModel`](@extref JuMP.AbstractModel) -is that function type itself -""" -function JuMP.jump_function_type(::JuMP.AbstractModel, F::Type{<:RiemannianFunction}) - return F -end - -@doc """ - JuMP.jump_function(::JuMP.AbstractModel, F::Type{<:RiemannianFunction}) - -The [`JuMP.jl`](@extref JuMP :std:doc:`index`) function of a [`RiemannianFunction`](@ref) for any [`AbstractModel`](@extref JuMP.AbstractModel) -is that function itself. 
-""" -JuMP.jump_function(::JuMP.AbstractModel, f::RiemannianFunction) = f - -# -# The string representation -# maybe not document this since it seems to be mainly for display reasons -JuMP.function_string(mime::MIME, f::RiemannianFunction) = string(f.func) - -""" - MOI.Utilities.map_indices(index_map::Function, func::RiemannianFunction) - -The original docstring states something about substituting some variable indices -by their index map variants. -On a [`RiemannianFunction`](@ref) there is nothing to substitute, -""" -MOI.Utilities.map_indices(::Function, func::RiemannianFunction) = func - -# We we don't support `MOI.modify` and `RiemannianFunction` is not mutable, no need to copy anything -Base.copy(func::RiemannianFunction) = func - -# This is called for instance when the user does `@objective(model, Min, func)`. -# JuMP only accepts subtypes of `MOI.AbstractFunction` as objective so we wrap `func`. -# It will then be allowed to go through all the MOI layers because it is of the right type -# We will then receive it in `MOI.set(::ManoptOptimizer, ::MOI.ObjectiveFunction, RiemannianFunction)` -# where we will unwrap it and recover `func`. -@doc """ - JuMP.set_objective_function(model::JuMP.Model, obj::Manopt.AbstractManifoldObjective) - -Set the objective function of a [`JuMP.Model`](@extref) `model` to an [`AbstractManifoldObjective`](@ref) `obj`. -This allows to use `@objective` with an objective from `Manopt.jl`. -""" -function JuMP.set_objective_function( - model::JuMP.Model, func::Manopt.AbstractManifoldObjective - ) - return JuMP.set_objective_function(model, RiemannianFunction(func)) -end - -""" - MOI.get(::ManoptOptimizer, ::MOI.SolverVersion) - -Return the version of the Manopt solver, it corresponds to the version of -Manopt.jl. 
-""" -MOI.get(::ManoptOptimizer, ::MOI.SolverVersion) = "Manopt.jl $(pkgversion(Manopt))" - -function MOI.is_empty(model::ManoptOptimizer) - return isnothing(model.manifold) && - isempty(model.variable_primal_start) && - isnothing(model.objective) && - model.sense == MOI.FEASIBILITY_SENSE -end - -""" - MOI.empty!(model::ManoptOptimizer) - -Clear all model data from `model` but keep the `options` set. -""" -function MOI.empty!(model::ManoptOptimizer) - model.manifold = nothing - model.problem = nothing - model.state = nothing - empty!(model.variable_primal_start) - model.sense = MOI.FEASIBILITY_SENSE - model.objective = nothing - return nothing -end - -""" - MOI.supports(::ManoptOptimizer, attr::MOI.RawOptimizerAttribute) - -Return a `Bool` indicating whether `attr.name` is a valid option name -for `Manopt`. -""" -function MOI.supports(::ManoptOptimizer, ::MOI.RawOptimizerAttribute) - # FIXME Ideally, this should only return `true` if it is a valid keyword argument for - # one of the `...DescentState()` constructors. Is there an easy way to check this ? - # Does it depend on the different solvers ? - return true -end - -""" - MOI.get(model::ManoptOptimizer, attr::MOI.RawOptimizerAttribute) - -Return last `value` set by [`set`](@extref `MathOptInterface.set`)`(model, attr, value)`. -""" -function MOI.get(model::ManoptOptimizer, attr::MOI.RawOptimizerAttribute) - return model.options[attr.name] -end - -""" - MOI.get(model::ManoptOptimizer, attr::MOI.RawOptimizerAttribute) - -Set the value for the keyword argument `attr.name` to give for the constructor -`model.options[DESCENT_STATE_TYPE]`. -""" -function MOI.set(model::ManoptOptimizer, attr::MOI.RawOptimizerAttribute, value) - model.options[attr.name] = value - return nothing -end - -""" - MOI.get(::ManoptOptimizer, ::MOI.SolverName) - -Return the name of the [`ManoptOptimizer`](@ref) with the value of -the `descent_state_type` option. 
-""" -function MOI.get(model::ManoptOptimizer, ::MOI.SolverName) - return "A Manopt.jl solver, namely $(model.options[DESCENT_STATE_TYPE])" -end - -""" - MOI.supports_incremental_interface(::ManoptOptimizer) - -Return `true` indicating that [`ManoptOptimizer`](@ref) implements -[`add_constrained_variables`](@extref `MathOptInterface.add_constrained_variables`) and [`set`](@extref `MathOptInterface.set`) for -[`ObjectiveFunction`](@extref `MathOptInterface.ObjectiveFunction`) so it can be used with [`direct_model`](@extref `JuMP.direct_model`) -and does not require a [`CachingOptimizer`](@extref `MathOptInterface.Utilities.CachingOptimizer`). -See See [`supports_incremental_interface`](@extref `MathOptInterface.supports_incremental_interface`). -""" -MOI.supports_incremental_interface(::ManoptOptimizer) = true - -""" - MOI.copy_to(dest::ManoptOptimizer, src::MOI.ModelLike) - -Because [`supports_incremental_interface`](@extref `MathOptInterface.supports_incremental_interface`)`(dest)` is `true`, this simply -uses [`default_copy_to`](@extref `MathOptInterface.Utilities.default_copy_to`) and copies the variables with -[`add_constrained_variables`](@extref `MathOptInterface.add_constrained_variables`) and the objective sense with [`set`](@extref `MathOptInterface.set`). -""" -function MOI.copy_to(dest::ManoptOptimizer, src::MOI.ModelLike) - return MOI.Utilities.default_copy_to(dest, src) -end - -""" - MOI.supports_add_constrained_variables(::ManoptOptimizer, ::Type{<:ManifoldSet}) - -Return `true` indicating that [`ManoptOptimizer`](@ref) support optimization on -variables constrained to belong in a vectorized manifold. 
-""" -function MOI.supports_add_constrained_variables(::ManoptOptimizer, ::Type{<:ManifoldSet}) - return true -end - -""" - MOI.add_constrained_variables(model::ManoptOptimizer, set::ManifoldSet) - -Add [`dimension`](@extref `MathOptInterface.dimension`)`(set)` variables constrained in `set` and return the list -of variable indices that can be used to reference them as well a constraint -index for the constraint enforcing the membership of the variables manifold as a set. -""" -function MOI.add_constrained_variables(model::ManoptOptimizer, set::ManifoldSet) - F = MOI.VectorOfVariables - if !isnothing(model.manifold) - throw( - MOI.AddConstraintNotAllowed{F, typeof(set)}( - "Only one manifold allowed, variables in `$(model.manifold)` have already been added.", - ), - ) - end - model.manifold = set.manifold - model.problem = nothing - model.state = nothing - n = MOI.dimension(set) - v = MOI.VariableIndex.(1:n) - for _ in 1:n - push!(model.variable_primal_start, nothing) - end - return v, MOI.ConstraintIndex{F, typeof(set)}(1) -end - -""" - MOI.is_valid(model::ManoptOptimizer, vi::MOI.VariableIndex) - -Return whether `vi` is a valid variable index. -""" -function MOI.is_valid(model::ManoptOptimizer, vi::MOI.VariableIndex) - return !isnothing(model.manifold) && - 1 <= vi.value <= MOI.dimension(ManifoldSet(model.manifold)) -end - -""" - MOI.get(model::ManoptOptimizer, ::MOI.NumberOfVariables) - -Return the number of variables added in the model, this corresponds -to the [`dimension`](@extref JuMP :jl:function:`MathOptInterface.dimension`) of the [`ManifoldSet`](@ref). -""" -function MOI.get(model::ManoptOptimizer, ::MOI.NumberOfVariables) - if isnothing(model.manifold) - return 0 - else - return MOI.dimension(ManifoldSet(model.manifold)) - end -end - -""" - MOI.supports(::ManoptOptimizer, attr::MOI.RawOptimizerAttribute) - -Return `true` indicating that [`ManoptOptimizer`](@ref) supports starting values -for the variables. 
-""" -function MOI.supports( - ::ManoptOptimizer, ::MOI.VariablePrimalStart, ::Type{MOI.VariableIndex} - ) - return true -end - -""" - function MOI.set( - model::ManoptOptimizer, - ::MOI.VariablePrimalStart, - vi::MOI.VariableIndex, - value::Union{Real,Nothing}, - ) - -Set the starting value of the variable of index `vi` to `value`. Note that if -`value` is `nothing` then it essentially unset any previous starting values set -and hence `MOI.optimize!` unless another starting value is set. -""" -function MOI.set( - model::ManoptOptimizer, - ::MOI.VariablePrimalStart, - vi::MOI.VariableIndex, - value::Union{Real, Nothing}, - ) - MOI.throw_if_not_valid(model, vi) - model.variable_primal_start[vi.value] = value - model.state = nothing - return nothing -end - -""" - MOI.supports(::ManoptOptimizer, ::Union{MOI.ObjectiveSense,MOI.ObjectiveFunction}) - -Return `true` indicating that `Optimizer` supports being set the objective -sense (that is, min, max or feasibility) and the objective function. -""" -function MOI.supports(::ManoptOptimizer, ::Union{MOI.ObjectiveSense, MOI.ObjectiveFunction}) - return true -end - -""" - MOI.set(model::ManoptOptimizer, ::MOI.ObjectiveSense, sense::MOI.OptimizationSense) - -Modify the objective sense to either [`MAX_SENSE`](@extref), [`MIN_SENSE`](@extref) or -[`FEASIBILITY_SENSE`](@extref). -""" -function MOI.set(model::ManoptOptimizer, ::MOI.ObjectiveSense, sense::MOI.OptimizationSense) - model.sense = sense - return nothing -end - -""" - MOI.get(model::ManoptOptimizer, ::MOI.ObjectiveSense) - -Return the objective sense, defaults to [`FEASIBILITY_SENSE`](@extref) if no sense has -already been set. -""" -MOI.get(model::ManoptOptimizer, ::MOI.ObjectiveSense) = model.sense - -""" - _EmbeddingObjective{E<:MOI.AbstractNLPEvaluator,T} - -Objective where `evaluator` is a MathOptInterface evaluator for the objective -in the embedding. 
The fields `vectorized_point`, `vectorized_tangent` -and `embedding_tangent` are used as preallocated buffer so that the conversion -to Euclidean objective is allocation-free. -""" -struct _EmbeddingObjective{E <: MOI.AbstractNLPEvaluator, T} - evaluator::E - # Used to store the vectorized point - vectorized_point::Vector{Float64} - # Used to store the vectorized tangent - vectorized_tangent::Vector{Float64} - # Used to store the tangent in the embedding space - embedding_tangent::T -end - -""" - _get_cost(M, objective::_EmbeddingObjective, p) - -Convert the point `p` to its vectorization and then evaluate the objective -using `objective.evaluator`. -""" -function _get_cost(M, objective::_EmbeddingObjective, p) - _vectorize!(objective.vectorized_point, p, _shape(M)) - return MOI.eval_objective(objective.evaluator, objective.vectorized_point) -end - -""" - _get_cost(M, objective::_EmbeddingObjective, p) - -Convert the point `p` to its vectorization and then evaluate the gradient -using `objective.evaluator` to get the vectorized gradient. Then reshape the -gradient and convert it to the Riemannian gradient. -""" -function _get_gradient!(M, gradient, objective::_EmbeddingObjective, p) - _vectorize!(objective.vectorized_point, p, _shape(M)) - MOI.eval_objective_gradient( - objective.evaluator, objective.vectorized_tangent, objective.vectorized_point - ) - _reshape_vector!(objective.embedding_tangent, objective.vectorized_tangent, _shape(M)) - return ManifoldDiff.riemannian_gradient!(M, gradient, p, objective.embedding_tangent) -end - -""" - MOI.set(model::ManoptOptimizer, ::MOI.ObjectiveFunction{F}, func::F) where {F} - -Set the objective function as `func` for `model`. 
-""" -function MOI.set( - model::ManoptOptimizer, ::MOI.ObjectiveFunction, func::MOI.AbstractScalarFunction - ) - backend = MOI.Nonlinear.SparseReverseMode() - vars = [MOI.VariableIndex(i) for i in eachindex(model.variable_primal_start)] - nlp_model = MOI.Nonlinear.Model() - nl = convert(MOI.ScalarNonlinearFunction, func) - MOI.Nonlinear.set_objective(nlp_model, nl) - evaluator = MOI.Nonlinear.Evaluator(nlp_model, backend, vars) - MOI.initialize(evaluator, [:Grad]) - objective = let # COV_EXCL_LINE - # To avoid creating a closure capturing the `embedding_obj` object, - # we use the `let` block trick detailed in: - # https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured - embedding_obj = _EmbeddingObjective( - evaluator, - zeros(length(_shape(model.manifold))), - zeros(length(_shape(model.manifold))), - _zero(_shape(model.manifold)), - ) - RiemannianFunction( - Manopt.ManifoldGradientObjective( - (M, x) -> _get_cost(M, embedding_obj, x), - (M, g, x) -> _get_gradient!(M, g, embedding_obj, x); - evaluation = Manopt.InplaceEvaluation(), - ), - ) - end - MOI.set(model, MOI.ObjectiveFunction{typeof(objective)}(), objective) - return nothing -end - -function MOI.set(model::ManoptOptimizer, ::MOI.ObjectiveFunction, func::RiemannianFunction) - model.objective = func.func - model.problem = nothing - model.state = nothing - return nothing -end - -# Name of the attribute for the type of the descent state to be used as follows: -# ```julia -# set_attribute(model, "descent_state_type", Manopt.TrustRegionsState) -# ``` -const DESCENT_STATE_TYPE = "descent_state_type" - -function MOI.optimize!(model::ManoptOptimizer) - start = Float64[ - if isnothing(model.variable_primal_start[i]) - error("No starting value specified for `$i`th variable.") - else - model.variable_primal_start[i] - end for i in eachindex(model.variable_primal_start) - ] - objective = model.objective - if model.sense == MOI.FEASIBILITY_SENSE - objective = 
Manopt.ManifoldGradientObjective( - (_, _) -> 0.0, ManifoldsBase.zero_vector - ) - elseif model.sense == MOI.MAX_SENSE - objective = -objective - end - dmgo = decorate_objective!(model.manifold, objective) - model.problem = DefaultManoptProblem(model.manifold, dmgo) - reshaped_start = JuMP.reshape_vector(start, _shape(model.manifold)) - descent_state_type = model.options[DESCENT_STATE_TYPE] - kws = Dict{Symbol, Any}( - Symbol(key) => value for (key, value) in model.options if key != DESCENT_STATE_TYPE - ) - s = descent_state_type(model.manifold; p = reshaped_start, kws...) - model.state = decorate_state!(s) - solve!(model.problem, model.state) - return nothing -end - -@doc """ - ManifoldPointArrayShape{N} <: JuMP.AbstractShape - -Represent some generic `AbstractArray` of a certain size representing an point -on a manifold - -# Fields - -* `size::NTuple{N,Int}`: The size of the array -""" -struct ManifoldPointArrayShape{N} <: JuMP.AbstractShape - size::NTuple{N, Int} -end - -""" - length(shape::ManifoldPointArrayShape) - -Return the length of the vectors in the vectorized representation. -""" -Base.length(shape::ManifoldPointArrayShape) = prod(shape.size) - -""" - _vectorize!(res::Vector{T}, array::Array{T,N}, shape::ManifoldPointArrayShape{N}) where {T,N} - -Inplace version of `res = JuMP.vectorize(array, shape)`. -""" -function _vectorize!( - res::Vector{T}, array::Array{T, N}, ::ManifoldPointArrayShape{N} - ) where {T, N} - return copyto!(res, array) -end - -""" - _reshape_vector!(res::Array{T,N}, vec::Vector{T}, ::ManifoldPointArrayShape{N}) where {T,N} - -Inplace version of `res = JuMP.reshape_vector(vec, shape)`. -""" -function _reshape_vector!( - res::Array{T, N}, vec::Vector{T}, ::ManifoldPointArrayShape{N} - ) where {T, N} - return copyto!(res, vec) -end - -""" - _zero(shape::ManifoldPointArrayShape) - -Return a zero element of the shape `shape`. 
-""" -_zero(shape::ManifoldPointArrayShape{N}) where {N} = zeros(shape.size) - -""" - JuMP.vectorize(p::Array{T,N}, shape::ManifoldPointArrayShape{N}) where {T,N} - -Given a point `p` as an ``N``-dimensional array representing a point on a certain -manifold, reshape it to a vector, which is necessary within [`JuMP`](@extref JuMP :std:doc:`index`). -For the inverse see [`JuMP.reshape_vector`](@ref JuMP.reshape_vector(::Vector, ::ManifoldPointArrayShape)). -""" -function JuMP.vectorize(array::Array{T, N}, ::ManifoldPointArrayShape{N}) where {T, N} - return vec(array) -end - -""" - JuMP.reshape_vector(vector::Vector, shape::ManifoldPointArrayShape) - -Given some vector representation `vector` used within [`JuMP`](@extref JuMP :std:doc:`index`) of a point on a manifold represents points -by arrays, use the information from the `shape` to reshape it back into such an array. -For the inverse see [`JuMP.vectorize`](@ref JuMP.vectorize(::Array, ::ManifoldPointArrayShape)). -""" -function JuMP.reshape_vector(vector::Vector, shape::ManifoldPointArrayShape) - return reshape(vector, shape.size) -end - -function JuMP.reshape_set(set::ManifoldSet, shape::ManifoldPointArrayShape) - return set.manifold -end - -""" - _shape(m::ManifoldsBase.AbstractManifold) - -Return the shape of points of the manifold `m`. -At the moment, we only support manifolds for which the shape is a `Array`. 
-""" -function _shape(m::ManifoldsBase.AbstractManifold) - return ManifoldPointArrayShape(ManifoldsBase.representation_size(m)) -end - -_in(mime::MIME"text/plain") = "in" -_in(mime::MIME"text/latex") = "\\in" - -function JuMP.in_set_string(mime, set::ManifoldsBase.AbstractManifold) - return _in(mime) * " " * string(set) -end - -""" - JuMP.build_variable(::Function, func, m::ManifoldsBase.AbstractManifold) - -Build a `JuMP.VariablesConstrainedOnCreation` object containing variables -and the [`ManifoldSet`](@ref) in which they should belong as well as the -`shape` that can be used to go from the vectorized MOI representation to the -shape of the manifold, that is, [`ManifoldPointArrayShape`](@ref). -""" -function JuMP.build_variable(::Function, func, m::ManifoldsBase.AbstractManifold) - shape = _shape(m) - return JuMP.VariablesConstrainedOnCreation( - JuMP.vectorize(func, shape), ManifoldSet(m), shape - ) -end - -""" - MOI.get(model::ManoptOptimizer, ::MOI.ResultCount) - -Return [`OPTIMIZE_NOT_CALLED`](@extref `MathOptInterface.OPTIMIZE_NOT_CALLED`) if [`optimize!`](@extref `JuMP.optimize!`) hasn't been called yet and -[`LOCALLY_SOLVED`](@extref `MathOptInterface.LOCALLY_SOLVED`) otherwise indicating that the solver has solved the -problem to local optimality the value of [`RawStatusString`](@extref `MathOptInterface.RawStatusString`) for more -details on why the solver stopped. -""" -function MOI.get(model::ManoptOptimizer, ::MOI.TerminationStatus) - if isnothing(model.state) - return MOI.OPTIMIZE_NOT_CALLED - else - return MOI.LOCALLY_SOLVED - end -end - -""" - MOI.get(model::ManoptOptimizer, ::MOI.ResultCount) - -Return `0` if [`optimize!`](@extref `JuMP.optimize!`) hasn't been called yet and -`1` otherwise indicating that one solution is available. 
-""" -function MOI.get(model::ManoptOptimizer, ::MOI.ResultCount) - if isnothing(model.state) - return 0 - else - return 1 - end -end - -""" - MOI.get(model::ManoptOptimizer, ::MOI.PrimalStatus) - -Return [`MOI.NO_SOLUTION`](@extref JuMP :jl:constant:`MathOptInterface.NO_SOLUTION`) if `optimize!` hasn't been called yet and -[`MOI.FEASIBLE_POINT`](@extref `MathOptInterface.FEASIBLE_POINT`) if it is otherwise indicating that a solution is available -to query with [`VariablePrimalStart`](@extref `MathOptInterface.VariablePrimalStart`). -""" -function MOI.get(model::ManoptOptimizer, ::MOI.PrimalStatus) - if isnothing(model.state) - return MOI.NO_SOLUTION - else - return MOI.FEASIBLE_POINT - end -end - -""" - MOI.get(::ManoptOptimizer, ::MOI.DualStatus) - -Returns [`MOI.NO_SOLUTION`](@extref `MathOptInterface.NO_SOLUTION`) indicating that there is no dual solution -available. -""" -MOI.get(::ManoptOptimizer, ::MOI.DualStatus) = MOI.NO_SOLUTION - -""" - MOI.get(model::ManoptOptimizer, ::MOI.RawStatusString) - -Return a `String` containing [`get_reason`](@ref) without the ending newline -character. -""" -function MOI.get(model::ManoptOptimizer, ::MOI.RawStatusString) - # `strip` removes the `\n` at the end and returns an `AbstractString` - # Since MOI wants a `String`, pass it through `string` - return string(strip(get_reason(model.state))) -end - -""" - MOI.get(model::ManoptOptimizer, attr::MOI.ObjectiveValue) - -Return the value of the objective function evaluated at the solution. -""" -function MOI.get(model::ManoptOptimizer, attr::MOI.ObjectiveValue) - MOI.check_result_index_bounds(model, attr) - solution = Manopt.get_solver_return(model.state) - value = get_cost(model.problem, solution) - if model.sense == MOI.MAX_SENSE - value = -value - end - return value -end - -""" - MOI.get(model::ManoptOptimizer, attr::MOI.VariablePrimal, vi::MOI.VariableIndex) - -Return the value of the solution for the variable of index `vi`. 
-""" -function MOI.get(model::ManoptOptimizer, attr::MOI.VariablePrimal, vi::MOI.VariableIndex) - MOI.check_result_index_bounds(model, attr) - MOI.throw_if_not_valid(model, vi) - solution = Manopt.get_solver_return(get_objective(model.problem), model.state) - return solution[vi.value] -end - -end # module diff --git a/ext/ManoptLineSearchesExt.jl b/ext/ManoptLineSearchesExt.jl index c58734e57c..5343e05289 100644 --- a/ext/ManoptLineSearchesExt.jl +++ b/ext/ManoptLineSearchesExt.jl @@ -5,6 +5,37 @@ import Manopt: LineSearchesStepsize using ManifoldsBase using LineSearches +Manopt.linesearches_get_max_alpha(ls::LineSearches.HagerZhang) = ls.alphamax +Manopt.linesearches_get_max_alpha(ls::LineSearches.MoreThuente) = ls.alphamax + +function Manopt.linesearches_set_max_alpha(ls::LineSearches.HagerZhang{T, Tm}, max_alpha::T) where {T, Tm} + return HagerZhang{T, Tm}( + delta = ls.delta, + sigma = ls.sigma, + alphamax = max_alpha, + rho = ls.rho, + epsilon = ls.epsilon, + gamma = ls.gamma, + linesearchmax = ls.linesearchmax, + psi3 = ls.psi3, + display = ls.display, + mayterminate = ls.mayterminate, + cache = ls.cache, + check_flatness = ls.check_flatness, + ) +end +function Manopt.linesearches_set_max_alpha(ls::LineSearches.MoreThuente{T}, max_alpha::T) where {T} + return MoreThuente{T}( + f_tol = ls.f_tol, + gtol = ls.gtol, + x_tol = ls.x_tol, + alphamin = ls.alphamin, + alphamax = max_alpha, + maxfev = ls.maxfev, + cache = ls.cache + ) +end + function (cs::Manopt.LineSearchesStepsize)( mp::AbstractManoptProblem, s::AbstractManoptSolverState, @@ -24,6 +55,19 @@ function (cs::Manopt.LineSearchesStepsize)( # guess initial alpha α0 = cs.initial_guess(mp, s, k, cs.last_stepsize, η; lf0 = fp, Dlf0 = dphi_0) + # handle stepsize limit + local ls # COV_EXCL_LINE + if :stop_when_stepsize_exceeds in keys(kwargs) + new_max_alpha = min( + kwargs[:stop_when_stepsize_exceeds], + Manopt.linesearches_get_max_alpha(cs.linesearch), + ) + ls = 
Manopt.linesearches_set_max_alpha(cs.linesearch, new_max_alpha) + α0 = min(α0, new_max_alpha) + else + ls = cs.linesearch + end + # perform actual line-search function ϕ(α) @@ -41,7 +85,7 @@ function (cs::Manopt.LineSearchesStepsize)( return Manopt.get_cost_and_differential(mp, p_tmp, Y_tmp; Y = X_tmp) end - α, fp = cs.linesearch(ϕ, dϕ, ϕdϕ, α0, fp, dphi_0) + α, fp = ls(ϕ, dϕ, ϕdϕ, α0, fp, dphi_0) cs.last_stepsize = α return α end diff --git a/ext/ManoptManifoldsExt/ManoptManifoldsExt.jl b/ext/ManoptManifoldsExt/ManoptManifoldsExt.jl index b1e027d713..a502802751 100644 --- a/ext/ManoptManifoldsExt/ManoptManifoldsExt.jl +++ b/ext/ManoptManifoldsExt/ManoptManifoldsExt.jl @@ -2,7 +2,7 @@ module ManoptManifoldsExt using ManifoldsBase: exp, log, ParallelTransport, vector_transport_to using Manopt -using Manopt: _math, _tex, ManifoldDefaultsFactory, _produce_type +using Manopt: _math, _tex, ManifoldDefaultsFactory, _produce_type, get_stepsize_bound import Manopt: max_stepsize, get_gradient, diff --git a/ext/ManoptManifoldsExt/manifold_functions.jl b/ext/ManoptManifoldsExt/manifold_functions.jl index d063cbdf5b..52e2ee0e9b 100644 --- a/ext/ManoptManifoldsExt/manifold_functions.jl +++ b/ext/ManoptManifoldsExt/manifold_functions.jl @@ -8,6 +8,33 @@ Manopt.default_point_distance(::Euclidean, p) = norm(p, Inf) Manopt.default_vector_norm(::Euclidean, p, X) = norm(p, Inf) +""" + get_bounds_index(::Hyperrectangle) + +Get the bound indices of [`Hyperrectangle`](@extref Manifolds.Hyperrectangle) `M`. They are the same as the indices of the +lower (or upper) bounds. +""" +Manopt.get_bounds_index(M::Hyperrectangle) = eachindex(M.lb) +""" + get_stepsize_bound(M::Hyperrectangle, p, d, i) + +Get the upper bound on moving in direction `d` from point `p` on [`Hyperrectangle`](@extref Manifolds.Hyperrectangle) `M`, +for the bound index `i`. There are three cases: + +1. If `d[i] > 0`, the formula reads `(M.ub[i] - p[i]) / d[i]`. +2. 
If `d[i] < 0`, the formula reads `(M.lb[i] - p[i]) / d[i]`. +3. If `d[i] == 0`, the result is `Inf`. +""" +function Manopt.get_stepsize_bound(M::Hyperrectangle, p, d, i) + if d[i] > 0 + return (M.ub[i] - p[i]) / d[i] + elseif d[i] < 0 + return (M.lb[i] - p[i]) / d[i] + else + return Inf + end +end + """ max_stepsize(M::TangentBundle, p) @@ -40,17 +67,16 @@ The default maximum stepsize for `Hyperrectangle` manifold with corners is maxim of distances from `p` to each boundary. """ function max_stepsize(M::Hyperrectangle, p) - ms = 0.0 - for i in eachindex(M.lb, p) - dist_ub = M.ub[i] - p[i] - if dist_ub > 0 - ms = max(ms, dist_ub) - end - dist_lb = p[i] - M.lb[i] - if dist_lb > 0 - ms = max(ms, dist_lb) - end - end + lb = M.lb + ub = M.ub + ms = zero(eltype(p)) + @inbounds @simd for i in eachindex(lb, ub, p) + dist_ub = ub[i] - p[i] + dist_lb = p[i] - lb[i] + cand_ub = ifelse(dist_ub > 0, dist_ub, zero(dist_ub)) + cand_lb = ifelse(dist_lb > 0, dist_lb, zero(dist_lb)) + ms = max(ms, max(cand_ub, cand_lb)) + end # COV_EXCL_LINE return ms end function max_stepsize(M::Hyperrectangle) @@ -60,6 +86,9 @@ function max_stepsize(M::Hyperrectangle) end return ms end +function max_stepsize(M::ProbabilitySimplex) + return 1.0 +end """ mid_point(M, p, q, x) @@ -168,3 +197,53 @@ function reflect!( X .*= -1 return retract!(M, q, p, X, retraction_method) end + + +""" + Manopt.set_zero_at_index!(M::Hyperrectangle, d, i) + +Set element of tangent vector `d` on [`Hyperrectangle`](@extref Manifolds.Hyperrectangle) +at index `i` to 0. +""" +function Manopt.set_zero_at_index!(M::Hyperrectangle, d, i) + d[i] = 0 + return d +end + +""" + Manopt.set_stepsize_bound!(M::Hyperrectangle, d_out, p, d, t_current::Real) + +For each element `i` in the tangent vector `d_out`, if the stepsize bound in direction `d` +for that element is less than `t_current`, set the element of `d_out` to the distance from +`p[i]` to the bound in the direction of `d[i]`. 
If the stepsize bound is non-positive, +set the element to 0. +""" +function Manopt.set_stepsize_bound!(M::Hyperrectangle, d_out, p, d, t_current::Real) + + for i in eachindex(d_out, d) + bound = get_stepsize_bound(M, p, d, i) + if bound > 0 + if bound < t_current && d_out[i] != 0 + d_out[i] = d[i] > 0 ? M.ub[i] - p[i] : M.lb[i] - p[i] + end + else + d_out[i] = 0 + end + end + return d_out +end + +""" + Manopt.has_anisotropic_max_stepsize(::Hyperrectangle) + +Returns `true`, as [`Hyperrectangle`](@extref `Manifolds.Hyperrectangle`) manifold requires generalized Cauchy point computation in solvers. +""" +Manopt.has_anisotropic_max_stepsize(::Hyperrectangle) = true + +""" + Manopt.get_at_bound_index(::Hyperrectangle, X, b) + +Extract the element of tangent vector `X` to a point on [`Hyperrectangle`](@extref Manifolds.Hyperrectangle) +at index `b`. +""" +Manopt.get_at_bound_index(::Hyperrectangle, X, b) = X[b] diff --git a/src/Manopt.jl b/src/Manopt.jl index a52fd47ea4..ff8349ae53 100644 --- a/src/Manopt.jl +++ b/src/Manopt.jl @@ -8,16 +8,16 @@ """ module Manopt +# When indenting something in print, use two spaces (or maybe \t later?) +_MANOPT_INDENT = " " + import Base: &, copy, getindex, identity, length, setindex!, show, | import LinearAlgebra: reflect! import ManifoldsBase: embed!, plot_slope, prepare_check_result, find_best_slope_window import ManifoldsBase: base_manifold, base_point, get_basis import ManifoldsBase: project, project! -import LinearAlgebra: cross -using ColorSchemes -using ColorTypes -using Colors -using DataStructures: CircularBuffer, capacity, length, push!, size, isfull +import LinearAlgebra: cross, LowerTriangular +using DataStructures: CircularBuffer, capacity, length, push!, size, isfull, heapify!, heappop! 
using Dates: Millisecond, Nanosecond, Period, canonicalize, value using Glossaries using LinearAlgebra: @@ -139,6 +139,7 @@ using ManifoldsBase: set_component!, shortest_geodesic, shortest_geodesic!, + submanifold_component, submanifold_components, vector_transport_to, vector_transport_to!, @@ -221,14 +222,11 @@ include("solvers/debug_solver.jl") include("solvers/record_solver.jl") include("helpers/checks.jl") -include("helpers/exports/Asymptote.jl") include("helpers/LineSearchesTypes.jl") include("helpers//test.jl") include("deprecated.jl") -function JuMP_Optimizer end - function __init__() # # Error Hints @@ -249,17 +247,6 @@ function __init__() ) printstyled(io, "`using QuadraticModels, RipQP`"; color = :cyan) end - if exc.f === Manopt.JuMP_Optimizer - print( - io, - """ - - The `Manopt.JuMP_Optimizer` is not yet properly initialized. - It requires the package `JuMP.jl`, so please load it e.g. via - """, - ) - printstyled(io, "`using JuMP`"; color = :cyan) - end end end return nothing @@ -286,7 +273,7 @@ export AbstractDecoratedManifoldObjective, EmbeddedManifoldObjective, ScaledManifoldObjective, ManifoldCountObjective, - NonlinearLeastSquaresObjective, + ManifoldNonlinearLeastSquaresObjective, ManifoldAlternatingGradientObjective, ManifoldCostGradientObjective, ManifoldCostObjective, @@ -341,6 +328,7 @@ export AbstractGradientSolverState, ProjectedGradientMethodState, ProximalBundleMethodState, ProximalGradientMethodState, + ProximalPointState, RecordSolverState, StepsizeState, StochasticGradientDescentState, @@ -423,7 +411,7 @@ export CondensedKKTVectorField, CondensedKKTVectorFieldJacobian export SymmetricLinearSystemObjective export ProximalGradientNonsmoothCost, ProximalGradientNonsmoothSubgradient -export QuasiNewtonState, QuasiNewtonLimitedMemoryDirectionUpdate +export QuasiNewtonState, QuasiNewtonLimitedMemoryDirectionUpdate, QuasiNewtonLimitedMemoryBoxDirectionUpdate export QuasiNewtonMatrixDirectionUpdate export QuasiNewtonPreconditioner export 
QuasiNewtonCautiousDirectionUpdate, @@ -546,6 +534,7 @@ export get_stepsize, get_initial_stepsize, get_last_stepsize export InteriorPointCentralityCondition export DomainBackTracking, DomainBackTrackingStepsize, NullStepBackTrackingStepsize export ProximalGradientMethodBacktracking +export HagerZhangLinesearch # # Stopping Criteria export StoppingCriterion, StoppingCriterionSet @@ -568,6 +557,7 @@ export StopAfter, StopWhenGradientChangeLess, StopWhenGradientMappingNormLess, StopWhenGradientNormLess, + StopWhenProjectedNegativeGradientNormLess, StopWhenFirstOrderProgress, StopWhenIterateNaN, StopWhenKKTResidualLess, @@ -579,6 +569,7 @@ export StopAfter, StopWhenPopulationDiverges, StopWhenPopulationStronglyConcentrated, StopWhenProjectedGradientStationary, + StopWhenRelativeAPosterioriCostChangeLessOrEqual, StopWhenRelativeResidualLess, StopWhenRepeated, StopWhenSmallerOrEqual, @@ -589,13 +580,9 @@ export StopAfter, export get_active_stopping_criteria, get_stopping_criteria, get_reason, get_stopping_criterion, stopped_at # -# Exports -export asymptote_export_S2_signals, asymptote_export_S2_data, asymptote_export_SPD -export render_asymptote -# # Debugs export DebugSolverState, DebugAction, DebugGroup, DebugEntry, DebugEntryChange, DebugEvery -export DebugChange, DebugGradientChange +export DebugCallback, DebugChange, DebugGradientChange export DebugIterate, DebugIteration, DebugDivider, DebugTime export DebugFeasibility export DebugCost, DebugStoppingCriterion diff --git a/src/helpers/LineSearchesTypes.jl b/src/helpers/LineSearchesTypes.jl index 8263a6e15b..17b3c9c442 100644 --- a/src/helpers/LineSearchesTypes.jl +++ b/src/helpers/LineSearchesTypes.jl @@ -7,14 +7,13 @@ Wrapper for line searches available in the `LineSearches.jl` library. LineSearchesStepsize(M::AbstractManifold, linesearch; kwargs... 
LineSearchesStepsize( - linesearch; - retraction_method=ExponentialRetraction(), - vector_transport_method=ParallelTransport(), + linesearch; retraction_method=ExponentialRetraction(), vector_transport_method=ParallelTransport(), ) Wrap `linesearch` (for example [`HagerZhang`](https://julianlsolvers.github.io/LineSearches.jl/latest/reference/linesearch.html#LineSearches.HagerZhang) or [`MoreThuente`](https://julianlsolvers.github.io/LineSearches.jl/latest/reference/linesearch.html#LineSearches.MoreThuente)). -The initial step selection from Linesearches.jl is not yet supported and the value 1.0 is used. +The initial step selection from Linesearches.jl is not yet supported and `initial_guess` is +always used (by default [`ConstantInitialGuess`](@ref)). # Keyword Arguments @@ -34,9 +33,7 @@ function LineSearchesStepsize( linesearch; initial_guess::AbstractInitialLinesearchGuess = ConstantInitialGuess(), retraction_method::AbstractRetractionMethod = default_retraction_method(M), - vector_transport_method::AbstractVectorTransportMethod = default_vector_transport_method( - M - ), + vector_transport_method::AbstractVectorTransportMethod = default_vector_transport_method(M), last_stepsize::Real = NaN, ) return LineSearchesStepsize( @@ -49,21 +46,52 @@ function LineSearchesStepsize( end function LineSearchesStepsize( linesearch; - initial_guess::AbstractInitialLinesearchGuess = ConstantInitialGuess(), - retraction_method::AbstractRetractionMethod = ExponentialRetraction(), - vector_transport_method::AbstractVectorTransportMethod = ParallelTransport(), + initial_guess::ILG = ConstantInitialGuess(), + retraction_method::RM = ExponentialRetraction(), + vector_transport_method::VTM = ParallelTransport(), last_stepsize::Real = NaN, - ) - return LineSearchesStepsize{ - typeof(linesearch), typeof(initial_guess), typeof(retraction_method), typeof(vector_transport_method), typeof(last_stepsize), - }( + ) where {ILG <: AbstractInitialLinesearchGuess, RM <: AbstractRetractionMethod, 
VTM <: AbstractVectorTransportMethod} + return LineSearchesStepsize{typeof(linesearch), ILG, RM, VTM, typeof(last_stepsize)}( linesearch, initial_guess, retraction_method, vector_transport_method, last_stepsize ) end +""" + linesearches_get_max_alpha(ls) + +Get the maximum step size for `LineSearches.jl` line search `ls`. +""" +linesearches_get_max_alpha(ls) + +function linesearches_get_max_alpha end + +""" + linesearches_set_max_alpha(ls, max_alpha::Real) + +Set the maximum step size for `LineSearches.jl` line search `ls` to `max_alpha`. +Return a new line search object with the updated maximum step size. +""" +linesearches_set_max_alpha(ls, max_alpha::Real) + +function linesearches_set_max_alpha end + function Base.show(io::IO, cs::LineSearchesStepsize) return print( io, "LineSearchesStepsize($(cs.linesearch); initial_guess=$(cs.initial_guess), retraction_method=$(cs.retraction_method), vector_transport_method=$(cs.vector_transport_method), last_stepsize=$(cs.last_stepsize))", ) end +function status_summary(cs::LineSearchesStepsize; context::Symbol = :default) + (context === :short) && return repr(cs) + (context === :inline) && return "A linesearch stepsize wrapper for LineSearches.jl (last step size $(cs.last_stepsize))" + return """ + A step size wrapper for LineSearches.jl + (last step size: $(cs.last_stepsize)) + + ## Parameters + * line search: $(_MANOPT_INDENT)$(cs.linesearch) + * initial guess: $(_MANOPT_INDENT)$(cs.initial_guess) + * retraction method: $(_MANOPT_INDENT)$(cs.retraction_method) + * vector transport method:$(_MANOPT_INDENT)$(cs.vector_transport_method) + """ +end diff --git a/src/helpers/exports/Asymptote.jl b/src/helpers/exports/Asymptote.jl deleted file mode 100644 index 2b83b982b8..0000000000 --- a/src/helpers/exports/Asymptote.jl +++ /dev/null @@ -1,411 +0,0 @@ -@doc """ - asymptote_export_S2_signals(filename; points, curves, tangent_vectors, colors, kwargs...) 
- -Export given `points`, `curves`, and `tangent_vectors` on the sphere ``𝕊^2`` -to Asymptote. - -# Input -* `filename` a file to store the Asymptote code in. - -# Keywaord arguments for the data - -* `colors=Dict{Symbol,Array{RGBA{Float64},1}}()`: dictionary of color arrays, - indexed by symbols `:points`, `:curves` and `:tvector`, where each entry has to provide - as least as many colors as the length of the corresponding sets. -* `curves=Array{Array{Float64,1},1}(undef, 0)`: an `Array` of `Arrays` of points - on the sphere, where each inner array is interpreted as a curve - and is accompanied by an entry within `colors`. -* `points=Array{Array{Float64,1},1}(undef, 0)`: an `Array` of `Arrays` of points - on the sphere where each inner array is interpreted as a set of points and is accompanied - by an entry within `colors`. -* `tangent_vectors=Array{Array{Tuple{Float64,Float64},1},1}(undef, 0)`: - an `Array` of `Arrays` of tuples, where the first is a points, the second a tangent vector - and each set of vectors is accompanied by an entry from within `colors`. - -# Keyword arguments for asymptote - -* `arrow_head_size=6.0`: - size of the arrowheads of the tangent vectors -* `arrow_head_sizes` overrides the previous value to specify a value per `tVector`` set. -* `camera_position=(1., 1., 0.)`: - position of the camera in the Asymptote scene -* `line_width=1.0`: - size of the lines used to draw the curves. -* `line_widths` overrides the previous value to specify a value per curve and `tVector`` set. -* `dot_size=1.0`: - size of the dots used to draw the points. -* `dot_sizes` overrides the previous value to specify a value per point set. -* `size=nothing`: - a tuple for the image size, otherwise a relative size `4cm` is used. 
-* `sphere_color=RGBA{Float64}(0.85, 0.85, 0.85, 0.6)`: - color of the sphere the data is drawn on -* `sphere_line_color=RGBA{Float64}(0.75, 0.75, 0.75, 0.6)`: - color of the lines on the sphere -* `sphere_line_width=0.5`: - line width of the lines on the sphere -* `target=(0.,0.,0.)`: - position the camera points at -""" -function asymptote_export_S2_signals( - filename::String; - points::Array{Array{T, 1}, 1} where {T} = Array{Array{Float64, 1}, 1}(undef, 0), - curves::Array{Array{T, 1}, 1} where {T} = Array{Array{Float64, 1}, 1}(undef, 0), - tangent_vectors::Array{Array{Tuple{T, T}, 1}, 1} where {T} = Array{ - Array{Tuple{Float64, Float64}, 1}, 1, - }( - undef, 0 - ), - colors::Dict{Symbol, Array{RGBA{Float64}, 1}} = Dict{Symbol, Array{RGBA{Float64}, 1}}(), - arrow_head_size::Float64 = 6.0, - arrow_head_sizes::Array{Float64, 1} = fill(arrow_head_size, length(tangent_vectors)), - camera_position::Tuple{Float64, Float64, Float64} = (1.0, 1.0, 0.0), - line_width::Float64 = 1.0, - line_widths::Array{Float64, 1} = fill( - line_width, length(curves) + length(tangent_vectors) - ), - dot_size::Float64 = 1.0, - dot_sizes::Array{Float64, 1} = fill(dot_size, length(points)), - size::Union{Nothing, Tuple{Int, Int}} = nothing, - sphere_color::RGBA{Float64} = RGBA{Float64}(0.85, 0.85, 0.85, 0.6), - sphere_line_color::RGBA{Float64} = RGBA{Float64}(0.75, 0.75, 0.75, 0.6), - sphere_line_width::Float64 = 0.5, - target::Tuple{Float64, Float64, Float64} = (0.0, 0.0, 0.0), - ) - io = open(filename, "w") - return try - # - # Header - # --- - write( - io, - string( - "import settings;\nimport three;\nimport solids;", - isnothing(size) ? 
"unitsize(4cm);" : "size$(size);", - "\n\n", - "currentprojection=perspective( ", - "camera = $(camera_position), ", - "target = $(target) );\n", - "currentlight=nolight;\n\n", - "revolution S=sphere(O,0.995);\n", - "pen SpherePen = rgb($(red(sphere_color)),", - "$(green(sphere_color)),$(blue(sphere_color)))", - "+opacity($(alpha(sphere_color)));\n", - "pen SphereLinePen = rgb($(red(sphere_line_color)),", - "$(green(sphere_line_color)),$(blue(sphere_line_color)))", - "+opacity($(alpha(sphere_line_color)))+linewidth($(sphere_line_width)pt);\n", - "draw(surface(S), surfacepen=SpherePen, meshpen=SphereLinePen);\n", - ), - ) - write(io, "\n/*\n Colors\n*/\n") - j = 0 - for (key, value) in colors # colors for all keys - penPrefix = "$(j)" - sets = 0 - if key == :points - penPrefix = "point" - sets = length(points) - elseif key == :curves - penPrefix = "curve" - sets = length(curves) - elseif key == :tvectors - penPrefix = "tVector" - sets = length(tangent_vectors) - end - if length(value) < sets - throw( - ErrorException( - "Not enough colors ($(length(value))) provided for $(sets) sets in $(key).", - ), - ) - end - i = 0 - # export all colors - for c in value - i = i + 1 - if i > sets - # avoid access errors in `line_width` or `dot_sizes` if more colors then sets are given - break - end - write( - io, - string( - "pen $(penPrefix)Style$(i) = ", - "rgb($(red(c)),$(green(c)),$(blue(c)))", - (key == :curves) ? "+linewidth($(line_widths[i])pt)" : "", - if (key == :tvectors) - "+linewidth($(line_widths[length(curves) + i])pt)" - else - "" - end, - (key == :points) ? 
"+linewidth($(dot_sizes[i])pt)" : "", - "+opacity($(alpha(c)));\n", - ), - ) - end - end - if length(points) > 0 - write(io, "\n/*\n Exported Points\n*/\n") - end - i = 0 - for pSet in points - i = i + 1 - for point in pSet - write( - io, - string( - "dot( (", - string([string(v, ",") for v in point]...)[1:(end - 1)], - "), pointStyle$(i));\n", - ), - ) - end - end - i = 0 - if length(curves) > 0 - write(io, "\n/*\n Exported Curves\n*/\n") - end - for curve in curves - i = i + 1 - write(io, "path3 p$(i) = ") - j = 0 - for point in curve - j = j + 1 - pString = "(" * string(["$v," for v in point]...)[1:(end - 1)] * ")" - write(io, j > 1 ? " .. $(pString)" : pString) - end - write(io, string(";\n draw(p$(i), curveStyle$(i));\n")) - end - i = 0 - if length(tangent_vectors) > 0 - write(io, "\n/*\n Exported tangent vectors\n*/\n") - end - for tVecs in tangent_vectors - i = i + 1 - j = 0 - for vector in tVecs - j = j + 1 - base = vector[1] - endPoints = base + vector[2] - write( - io, - string( - "draw( (", - string([string(v, ",") for v in base]...)[1:(end - 1)], - ")--(", - string([string(v, ",") for v in endPoints]...)[1:(end - 1)], - "), tVectorStyle$(i),Arrow3($(arrow_head_sizes[i])));\n", - ), - ) - end - end - finally - close(io) - end -end -@doc """ - asymptote_export_S2_data(filename) - -Export given `data` as an array of points on the 2-sphere, which might be one-, two- -or three-dimensional data with points on the [Sphere](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/sphere.html) ``𝕊^2``. - -# Input - -* `filename` a file to store the Asymptote code in. 
- -# Optional arguments for the data - -* `data` a point representing the 1D,2D, or 3D array of points -* `elevation_color_scheme` A `ColorScheme` for elevation -* `scale_axes=(1/3,1/3,1/3)`: - move spheres closer to each other by a factor - per direction - -# Optional arguments for asymptote - -* `arrow_head_size=1.8`: - size of the arrowheads of the vectors (in mm) -* `camera_position` position of the camera scene (default: atop the center of the data in the xy-plane) -* `target` position the camera points at (default: center of xy-plane within data). -""" -function asymptote_export_S2_data( - filename::String; - data = fill([0.0, 0.0, 1.0], 0, 0), - arrow_head_size::Float64 = 1.8, - scale_axes = (1 / 3.0, 1 / 3.0, 1 / 3.0), - camera_position::Tuple{Float64, Float64, Float64} = scale_axes .* ( - (size(data, 1) - 1) / 2, (size(data, 2) - 1) / 2, max(size(data, 3), 0) + 10, - ), - target::Tuple{Float64, Float64, Float64} = scale_axes .* ( - (size(data, 1) - 1) / 2, (size(data, 2) - 1) / 2, 0.0, - ), - elevation_color_scheme = ColorSchemes.viridis, - ) - io = open(filename, "w") - return try - write( - io, - string( - "import settings;\nimport three;\n", - "size(7cm);\n", - "DefaultHead.size=new real(pen p=currentpen) {return $(arrow_head_size)mm;};\n", - "currentprojection=perspective( ", - "camera = $(camera_position), up=Y,", - "target = $(target) );\n\n", - ), - ) - dims = [size(data, i) for i in [1, 2, 3]] - for x in 1:dims[1] - for y in 1:dims[2] - for z in 1:dims[3] - v = Tuple(data[x, y, z]) #extract value - el = asin(min(1, max(-1, v[3]))) # since 3 is between -1 and 1 this yields a value between 0 and pi - # map elevation to color map - c = get(elevation_color_scheme, el + π / 2, (0.0, Float64(π))) - # write arrow in this color map - # transpose image to comply with image addresses (first index column downwards, second rows) - write( - io, - string( - "draw( $(scale_axes .* (x - 1, y - 1, z - 1))", - "--$(scale_axes .* (x - 1, y - 1, z - 1) .+ v),", - " 
rgb($(red(c)),$(green(c)),$(blue(c))), Arrow3);\n", - ), - ) - end - end - end - finally - close(io) - end -end -@doc """ - asymptote_export_SPD(filename) - -export given `data` as a point on a `Power(SymmetricPOsitiveDefinnite(3))}` manifold of -one-, two- or three-dimensional data with points on the manifold of symmetric positive -definite matrices. - -# Input -* `filename` a file to store the Asymptote code in. - -# Optional arguments for the data - -* `data` a point representing the 1D, 2D, or 3D array of SPD matrices -* `color_scheme` a `ColorScheme` for Geometric Anisotropy Index -* `scale_axes=(1/3,1/3,1/3)`: - move symmetric positive definite matrices - closer to each other by a factor per direction compared to the distance - estimated by the maximal eigenvalue of all involved SPD points - -# Optional arguments for asymptote - -* `camera_position` position of the camera scene (default: atop the center of the data in the xy-plane) -* `target` position the camera points at (default: center of xy-plane within data). - -Both values `camera_position` and `target` are scaled by `scaledAxes*EW`, where -`EW` is the maximal eigenvalue in the `data`. -""" -function asymptote_export_SPD( - filename::String; - data = fill(Matrix{Float64}(I, 3, 3), 0, 0), - scale_axes = (1 / 3.0, 1 / 3.0, 1 / 3.0) .* - (length(data) > 0 ? 
maximum(maximum(eigvals.(data))) : 1), - camera_position::Tuple{Float64, Float64, Float64} = ( - (size(data, 1) - 1) / 2, (size(data, 2) - 1) / 2, max(size(data, 3), 0.0) + 10.0, - ), - target::Tuple{Float64, Float64, Float64} = ( - (size(data, 1) - 1) / 2, (size(data, 2) - 1) / 2, 0.0, - ), - color_scheme = ColorSchemes.viridis, - ) - io = open(filename, "w") - return try - write( - io, - string( - "import settings;\nimport three;\n", - "surface ellipsoid(triple v1,triple v2,triple v3,real l1,real l2, real l3, triple pos=O) {\n", - " transform3 T = identity(4);\n", - " T[0][0] = l1*v1.x;\n T[1][0] = l1*v1.y;\n T[2][0] = l1*v1.z;\n", - " T[0][1] = l2*v2.x;\n T[1][1] = l2*v2.y;\n T[2][1] = l2*v2.z;\n", - " T[0][2] = l3*v3.x;\n T[1][2] = l3*v3.y;\n T[2][2] = l3*v3.z;\n", - " T[0][3] = pos.x;\n T[1][3] = pos.y;\n T[2][3] = pos.z;\n", - " return T*unitsphere;\n}\n\n", - "size(200);\n\n", - "real gDx=$(scale_axes[1]);\n", - "real gDy=$(scale_axes[2]);\n", - "real gDz=$(scale_axes[3]);\n\n", - "currentprojection=perspective(up=Y, ", - "camera = (gDx*$(camera_position[1]),gDy*$(camera_position[2]),gDz*$(camera_position[3])), ", - "target = (gDx*$(target[1]),gDy*$(target[2]),gDz*$(target[3])) );\n", - "currentlight=Viewport;\n\n", - ), - ) - dims = [size(data, 1) size(data, 2) size(data, 3)] - for x in 1:dims[1] - for y in 1:dims[2] - for z in 1:dims[3] - A = data[x, y, z] #extract matrix - F = eigen(A) - if maximum(abs.(A)) > 0.0 # a nonzero matrix (exclude several pixel - # Following Moakher & Batchelor: Geometric Anisotropic Index: - λ = F.values - V = F.vectors - Lλ = log.(λ) - GAI = sqrt( - 2 / 3 * sum(Lλ .^ 2) - - 2 / 3 * sum(sum(tril(Lλ * Lλ', -1); dims = 1); dims = 2)[1], - ) - c = get(color_scheme, GAI / (1 + GAI), (0, 1)) - write( - io, - string( - " draw( ellipsoid( ($(V[1, 1]),$(V[2, 1]),$(V[3, 1])),", - " ($(V[1, 2]),$(V[2, 2]),$(V[3, 2])), ($(V[1, 3]),$(V[2, 3]),$(V[3, 3])),", - " $(λ[1]), $(λ[2]), $(λ[3]), ", - " (gDx*$(x - 1), gDy*$(y - 1), gDz*$(z - 
1))),", - " rgb($(red(c)),$(green(c)),$(blue(c))) );\n", - ), - ) - end - end - end - end - finally - close(io) - end -end - -""" - render_asymptote(filename; render=4, format="png", ...) -render an exported asymptote file specified in the `filename`, which can also -be given as a relative or full path - -# Input - -* `filename` filename of the exported `asy` and rendered image - -# Keyword arguments - -the default values are given in brackets - -* `render=4`: - render level of asymptote passed to its `-render` option. - This can be removed from the command by setting it to `nothing`. -* `format="png"`: - final rendered format passed to the `-f` option -* `export_file`: (the filename with format as ending) specify the export filename -""" -function render_asymptote( - filename; - render::Union{Int, Nothing} = 4, - format = "png", - export_folder = string(filename[1:([findlast(".", filename)...][1])], format), - ) - if isnothing(render) - renderCmd = `asy -f $(format) -globalwrite -o "$(relpath(export_folder))" $(filename)` - else - renderCmd = `asy -render $(render) -f $(format) -globalwrite -o "$(relpath(export_folder))" $(filename)` - end - return run(renderCmd) -end diff --git a/src/helpers/test.jl b/src/helpers/test.jl index 0efe64b2d9..678b6aaa4f 100644 --- a/src/helpers/test.jl +++ b/src/helpers/test.jl @@ -25,15 +25,28 @@ using ManifoldDiff # Dummy types struct DummyManifold <: AbstractManifold{ManifoldsBase.ℝ} end -struct DummyDecoratedObjective{E, O <: AbstractManifoldObjective} <: - Manopt.AbstractDecoratedManifoldObjective{E, O} +struct DummyDecoratedObjective{E, O <: AbstractManifoldObjective} <: Manopt.AbstractDecoratedManifoldObjective{E, O} objective::O end -function DummyDecoratedObjective( - o::O - ) where {E <: AbstractEvaluationType, O <: AbstractManifoldObjective{E}} +function DummyDecoratedObjective(o::O) where {E <: AbstractEvaluationType, O <: AbstractManifoldObjective{E}} return DummyDecoratedObjective{E, O}(o) end +function 
Manopt.status_summary( + ddo::DummyDecoratedObjective; kwargs... + ) + return "A dummy decorator for " * Manopt.status_summary(ddo.objective; kwargs...) +end +function Base.show(io::IO, ddo::DummyDecoratedObjective) + print(io, "DummyDecoratedObjective(") + print(io, ddo.objective) + return print(io, ")") +end +struct DummyEmptyDecoratedObjective{E, O <: AbstractManifoldObjective} <: Manopt.AbstractDecoratedManifoldObjective{E, O} + objective::O + function DummyEmptyDecoratedObjective(o::O) where {E <: AbstractEvaluationType, O <: AbstractManifoldObjective{E}} + return new{E, O}(o) + end +end struct DummyProblem{M <: AbstractManifold} <: AbstractManoptProblem{M} end struct DummyStoppingCriteriaSet <: StoppingCriterionSet end @@ -43,6 +56,8 @@ mutable struct DummyState <: AbstractManoptSolverState storage::Vector{Float64} end DummyState() = DummyState([]) +Manopt.status_summary(ds::DummyState; context = :default) = "A Manopt Test state with storage $(ds.storage)" +Base.show(io::IO, ds::DummyState) = print(io, "Manopt.Test.DummyState($(ds.storage))") Manopt.get_iterate(::DummyState) = NaN Manopt.set_parameter!(s::DummyState, ::Val, v) = s Manopt.set_parameter!(s::DummyState, ::Val{:StoppingCriterion}, v) = s diff --git a/src/plans/adaptive_regularization_with_cubics_plan.jl b/src/plans/adaptive_regularization_with_cubics_plan.jl index ca80f1f8b9..ae4d1099c3 100644 --- a/src/plans/adaptive_regularization_with_cubics_plan.jl +++ b/src/plans/adaptive_regularization_with_cubics_plan.jl @@ -104,3 +104,21 @@ end function get_gradient_function(arcmo::AdaptiveRegularizationWithCubicsModelObjective) return (TpM, X) -> get_gradient(TpM, arcmo, X) end +function Base.show(io::IO, arcmo::AdaptiveRegularizationWithCubicsModelObjective) + print(io, "AdaptiveRegularizationWithCubicsModelObjective(") + print(io, arcmo.objective); print(io, ", ") + print(io, arcmo.σ) + return print(io, ")") +end +function status_summary(arcmo::AdaptiveRegularizationWithCubicsModelObjective; 
context::Symbol = :default) + (context === :short) && return repr(arcmo) + (context === :inline) && return "The (tangent space) model for the adaptive regularization with cubics sub problem with parameter σ=$(arcmo.σ) for the objective $(status_summary(arcmo.objective; context = context))" + return """ + The cubic polynomial based model for the sub problem of the Adaptive Regularization with cubics solver + + ## Regularization parameter + σ = $(arcmo.σ) + + ## Objective + $(_in_str(status_summary(arcmo.objective)))""" +end diff --git a/src/plans/alternating_gradient_plan.jl b/src/plans/alternating_gradient_plan.jl index 7a9f2c3cdb..828635e312 100644 --- a/src/plans/alternating_gradient_plan.jl +++ b/src/plans/alternating_gradient_plan.jl @@ -149,3 +149,20 @@ function get_gradient!( mago.gradient!![k](M, X, p) return X end + +function Base.show(io::IO, mago::ManifoldAlternatingGradientObjective{E}) where {E} + print(io, "ManifoldAlternatingGradientObjective(") + print(io, mago.cost); print(io, ", "); print(io, mago.gradient!!); print(io, "; ") + print(io, _to_kw(E)) + return print(io, ")") +end +function status_summary(mago::ManifoldAlternatingGradientObjective; context::Symbol = :default) + (context === :short) && (return repr(mago)) + (context === :inline) && (return "An alternating gradient objective on a manifold.") + return """ + An alternating gradient objective providing the gradient as components with which to alternate + + ## Functions + * cost: $(_MANOPT_INDENT)$(mago.cost) + * gradient:$(_MANOPT_INDENT)$(mago.gradient!!)""" +end diff --git a/src/plans/box_plan.jl b/src/plans/box_plan.jl new file mode 100644 index 0000000000..8a03898135 --- /dev/null +++ b/src/plans/box_plan.jl @@ -0,0 +1,863 @@ +""" + has_anisotropic_max_stepsize(M::AbstractManifold) + +Return `true` if `M` has `max_stepsize` that depends on the direction. +For example, if `M` is a `Hyperrectangle`-like manifold with corners, or a product of it +with a standard manifold. 
Otherwise return `false`. +""" +has_anisotropic_max_stepsize(::AbstractManifold) = false +has_anisotropic_max_stepsize(M::ProductManifold) = any(has_anisotropic_max_stepsize, M.manifolds) + +@doc raw""" + QuasiNewtonLimitedMemoryBoxDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate + +An approximation of the Hessian of a scalar function of the form ``B_0 = θ I``, +``B_k = θ I - W_k M_k W_k^{\mathrm{T}}``, +where ``θ > 0`` is a scaling factor, initially a guess. +Matrix ``M_k = \left(\begin{smallmatrix}M₁₁ & M₂₁^{\mathrm{T}}\\ M₂₁ & M₂₂\end{smallmatrix}\right)`` +is stored using its blocks. +Blocks ``W_k`` are (implicitly) composed from `memory_y` and `memory_s` stored in `qn_du` +of type [`QuasiNewtonLimitedMemoryDirectionUpdate`](@ref). + +Initial scale ``θ`` is stored in the field `initial_scale` but if the memory isn't empty, +the current scale is set to the squared norm of ``y_k`` divided by the inner product of ``s_k`` and ``y_k`` (and by `initial_scale`), +where ``k`` is the most recent index for which the denominator is not equal to 0. + +`last_gcd_result` stores the result of the last generalized Cauchy direction search. + +See [ByrdNocedalSchnabel:1994](@cite) for details. 
+""" +mutable struct QuasiNewtonLimitedMemoryBoxDirectionUpdate{ + TDU <: QuasiNewtonLimitedMemoryDirectionUpdate, + F <: Real, + T_HM <: AbstractMatrix, + V <: AbstractVector, + } <: AbstractQuasiNewtonDirectionUpdate + # this approximates inverse Hessian + qn_du::TDU + + # fields for approximating the Hessian + current_scale::F + M_11::T_HM + M_21::T_HM + M_22::T_HM + # buffer for calculating W_k blocks + buffer_inner_Sk_X::V + buffer_inner_Sk_Y::V + buffer_inner_Yk_X::V + buffer_inner_Yk_Y::V + last_gcd_result::Symbol + last_gcd_stepsize::F +end + +function status_summary(d::QuasiNewtonLimitedMemoryBoxDirectionUpdate) + s = "limited memory direction update with support for box constraints; " + s *= "internal direction update status: $(status_summary(d.qn_du))" + return s +end + +function get_parameter(d::QuasiNewtonLimitedMemoryBoxDirectionUpdate, ::Val{:max_stepsize}) + if d.last_gcd_result === :found_limited + return d.last_gcd_stepsize + else + return Inf + end +end + +function QuasiNewtonLimitedMemoryBoxDirectionUpdate( + qn_du::QuasiNewtonLimitedMemoryDirectionUpdate{<:AbstractQuasiNewtonUpdateRule, T, F} + ) where {T, F <: Real} + memory_size = capacity(qn_du.memory_s) + M_11 = zeros(F, memory_size, memory_size) + M_21 = zeros(F, memory_size, memory_size) + M_22 = zeros(F, memory_size, memory_size) + buffer_inner_Sk_X = zeros(F, memory_size) + buffer_inner_Sk_Y = zeros(F, memory_size) + buffer_inner_Yk_X = zeros(F, memory_size) + buffer_inner_Yk_Y = zeros(F, memory_size) + return QuasiNewtonLimitedMemoryBoxDirectionUpdate{ + typeof(qn_du), F, typeof(M_11), typeof(buffer_inner_Sk_X), + }( + qn_du, + qn_du.initial_scale, + M_11, + M_21, + M_22, + buffer_inner_Sk_X, + buffer_inner_Sk_Y, + buffer_inner_Yk_X, + buffer_inner_Yk_Y, + :not_searched, + NaN, + ) +end + +function initialize_update!(ha::QuasiNewtonLimitedMemoryBoxDirectionUpdate) + initialize_update!(ha.qn_du) + ha.last_gcd_result = :not_searched + return ha +end + +function 
(d::QuasiNewtonLimitedMemoryBoxDirectionUpdate)( + mp::AbstractManoptProblem, st + ) + r = zero_vector(get_manifold(mp), get_iterate(st)) + return d(r, mp, st) +end +function (d::QuasiNewtonLimitedMemoryBoxDirectionUpdate)( + r, mp::AbstractManoptProblem, st + ) + d.qn_du(r, mp, st) + M = get_manifold(mp) + p = get_iterate(st) + X = get_gradient(st) + gcd = GeneralizedCauchyDirectionSubsolver(M, p, d) + d.last_gcd_result, d.last_gcd_stepsize = find_generalized_cauchy_direction!(M, gcd, r, p, r, X) + return r +end + +get_update_vector_transport(u::QuasiNewtonLimitedMemoryBoxDirectionUpdate) = get_update_vector_transport(u.qn_du) + +function get_at_bound_index(M::ProductManifold, X, b::Tuple{Int, Any}) + return get_at_bound_index(M.manifolds[b[1]], submanifold_component(M, X, b[1]), b[2]) +end + +@doc raw""" + hessian_value_diag(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, M::AbstractManifold, p, X) + +Compute ``⟨X, B X⟩``, where ``B`` is the (1, 1)-Hessian represented by `gh`. +""" +function hessian_value_diag(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, M::AbstractManifold, p, X) + m = length(gh.qn_du.memory_s) + num_nonzero_rho = count(!iszero, gh.qn_du.ρ) + + normX_sqr = norm(M, p, X)^2 + + if m == 0 || num_nonzero_rho == 0 + return gh.qn_du.initial_scale \ normX_sqr + end + + ii = 1 + for i in 1:m + iszero(gh.qn_du.ρ[i]) && continue + gh.buffer_inner_Yk_X[ii] = inner(M, p, gh.qn_du.memory_y[i], X) + gh.buffer_inner_Sk_X[ii] = gh.current_scale * inner(M, p, gh.qn_du.memory_s[i], X) + + ii += 1 + end + buffer_inner_Yk = view(gh.buffer_inner_Yk_X, 1:num_nonzero_rho) + buffer_inner_Sk = view(gh.buffer_inner_Sk_X, 1:num_nonzero_rho) + + return hessian_value_from_inner_products(gh, normX_sqr, buffer_inner_Yk, buffer_inner_Sk, buffer_inner_Yk, buffer_inner_Sk) +end + +@doc raw""" + hessian_value_diag(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, M::AbstractManifold, p, X::UnitVector) + +Compute ``⟨X, B X⟩``, where ``B`` is the (1, 1)-Hessian represented by 
`gh`, and `X` is the +[`UnitVector`](@ref). +""" +function hessian_value_diag(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, M::AbstractManifold, p, X::UnitVector) + b = X.index + m = length(gh.qn_du.memory_s) + num_nonzero_rho = count(!iszero, gh.qn_du.ρ) + + if m == 0 || num_nonzero_rho == 0 + return inv(gh.qn_du.initial_scale) + end + + ii = 1 + for i in 1:m + iszero(gh.qn_du.ρ[i]) && continue + gh.buffer_inner_Yk_X[ii] = get_at_bound_index(M, gh.qn_du.memory_y[i], b) + gh.buffer_inner_Sk_X[ii] = gh.current_scale * get_at_bound_index(M, gh.qn_du.memory_s[i], b) + + ii += 1 + end + buffer_inner_Yk = view(gh.buffer_inner_Yk_X, 1:num_nonzero_rho) + buffer_inner_Sk = view(gh.buffer_inner_Sk_X, 1:num_nonzero_rho) + + return hessian_value_from_inner_products(gh, one(eltype(gh.qn_du.ρ)), buffer_inner_Yk, buffer_inner_Sk, buffer_inner_Yk, buffer_inner_Sk) +end + +@doc raw""" + hessian_value(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, M::AbstractManifold, p, X::UnitVector, Y) + +Compute ``⟨X, B Y⟩``, where ``B`` is the (1, 1)-Hessian represented by `gh`, where `X` is the +[`UnitVector`](@ref). 
+""" +function hessian_value(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, M::AbstractManifold, p, X::UnitVector, Y) + b = X.index + + m = length(gh.qn_du.memory_s) + num_nonzero_rho = count(!iszero, gh.qn_du.ρ) + + Yb = get_at_bound_index(M, Y, b) + if m == 0 || num_nonzero_rho == 0 + return gh.qn_du.initial_scale * Yb + end + + ii = 1 + for i in 1:m + iszero(gh.qn_du.ρ[i]) && continue + gh.buffer_inner_Yk_X[ii] = get_at_bound_index(M, gh.qn_du.memory_y[i], b) + gh.buffer_inner_Sk_X[ii] = gh.current_scale * get_at_bound_index(M, gh.qn_du.memory_s[i], b) + + gh.buffer_inner_Yk_Y[ii] = inner(M, p, gh.qn_du.memory_y[i], Y) + gh.buffer_inner_Sk_Y[ii] = gh.current_scale * inner(M, p, gh.qn_du.memory_s[i], Y) + ii += 1 + end + buffer_inner_Yk_X = view(gh.buffer_inner_Yk_X, 1:num_nonzero_rho) + buffer_inner_Yk_Y = view(gh.buffer_inner_Yk_Y, 1:num_nonzero_rho) + buffer_inner_Sk_X = view(gh.buffer_inner_Sk_X, 1:num_nonzero_rho) + buffer_inner_Sk_Y = view(gh.buffer_inner_Sk_Y, 1:num_nonzero_rho) + + return hessian_value_from_inner_products(gh, Yb, buffer_inner_Yk_X, buffer_inner_Sk_X, buffer_inner_Yk_Y, buffer_inner_Sk_Y) +end + +@doc raw""" + update_current_scale!(M::AbstractManifold, p, gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate) + +Refresh the scaling factor and blockwise Hessian approximation stored in `gh` using the +nonzero curvature pairs currently in memory. + +- Identifies the most recent index with nonzero ``ρ_i`` to scale the initial Hessian guess + by ``ρ_i‖y_i‖^2 / θ``. +- Builds ``L_k`` and ``S_k^\top S_k`` from the stored ``(s_i, y_i)`` pairs and updates the + block matrices ``M₁₁``, ``M₂₁``, and ``M₂₂`` via the blockwise inverse formula. +- If all ``ρ_i`` vanish, resets `current_scale` to the inverse of `initial_scale` and + clears the block matrices. + +Returns the mutated `gh`. 
+""" +function update_current_scale!(M::AbstractManifold, p, gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate) + m = length(gh.qn_du.memory_s) + last_safe_index = -1 + for i in eachindex(gh.qn_du.ρ) + if abs(gh.qn_du.ρ[i]) > 0 + last_safe_index = i + end + end + + if (last_safe_index == -1) + # All stored pairs yield zero inner products + gh.current_scale = inv(gh.qn_du.initial_scale) + gh.M_11 = fill(0.0, 0, 0) + gh.M_21 = fill(0.0, 0, 0) + gh.M_22 = fill(0.0, 0, 0) + return gh + end + + invA = Diagonal([-ri for ri in gh.qn_du.ρ if !iszero(ri)]) + num_nonzero_rho = count(!iszero, gh.qn_du.ρ) + + Lk = LowerTriangular(zeros(num_nonzero_rho, num_nonzero_rho)) + + # total scaling factor for the initial Hessian + # written this way to avoid floating point overflow (when ynorm is finite but ynorm^2 is Inf) + # see CUTEst EXPQUAD problem for an example + ynorm = norm(M, p, gh.qn_du.memory_y[last_safe_index]) + gh.current_scale = ((gh.qn_du.ρ[last_safe_index] * ynorm) * ynorm) / gh.qn_du.initial_scale + + tsksk = Symmetric(zeros(num_nonzero_rho, num_nonzero_rho)) + ii = 1 + # fill Lk and the upper triangle of Sk' * Sk + for i in 1:m + iszero(gh.qn_du.ρ[i]) && continue + jj = 1 + for j in 1:m + iszero(gh.qn_du.ρ[j]) && continue + if jj < ii + Lk[ii, jj] = inner(M, p, gh.qn_du.memory_s[i], gh.qn_du.memory_y[j]) + end + if ii <= jj + tsksk.data[ii, jj] = inner(M, p, gh.qn_du.memory_s[i], gh.qn_du.memory_s[j]) + end + jj += 1 + end + ii += 1 + end + tsksk.data .*= gh.current_scale + + # matrix inversion using the blockwise formula for speed + # Schur complement of -Dk is the only non-diagonal matrix we actually need to invert in this step + W1 = Lk * invA + W2 = W1 * Lk' + gh.M_22 = inv(Symmetric(tsksk - W2)) + W3 = gh.M_22 * W1 + W4 = W1' * W3 + + gh.M_11 = invA + W4 + gh.M_21 = -W3 + + return gh +end + +@doc raw""" + hessian_value_from_inner_products(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, iss::Real, cy1, cs1, cy2, cs2) + +Evaluate the quadratic form defined by the current blockwise Hessian 
approximation stored in +`gh`, given precomputed coordinate vectors. + +Arguments: +- `iss`: inner product of original vectors. +- `cy1`, `cy2`: coordinates of ``y``-like vectors in the ``Y_k`` basis. +- `cs1`, `cs2`: coordinates of ``s``-like vectors in the scaled ``S_k`` basis. + +The result is ``θ·iss - cy₁ᵀ M₁₁ cy₂ - 2·cs₁ᵀ M₂₁ cy₂ - cs₁ᵀ M₂₂ cs₂`` using the blocks +``M₁₁``, ``M₂₁``, ``M₂₂`` stored in `gh` and the current scale ``θ``. Returns the scalar value. +""" +function hessian_value_from_inner_products(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, iss::Real, cy1, cs1, cy2, cs2) + result = gh.current_scale * iss + if length(cy1) == 0 + return result + end + result -= dot(cy1, gh.M_11, cy2) + result -= 2 * dot(cs1, gh.M_21, cy2) + result -= dot(cs1, gh.M_22, cs2) + + return result +end + + +@doc raw""" + update_hessian!(gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, mp::AbstractManoptProblem, st::AbstractManoptSolverState, p_old, k::Int) + +Update the Hessian approximation `gh` by moving it from `p_old` to the current iterate of `st` and updating the stored `s` and +`y` vectors. +""" +function update_hessian!( + gh::QuasiNewtonLimitedMemoryBoxDirectionUpdate, + mp::AbstractManoptProblem, + st::AbstractManoptSolverState, + p_old, + k::Int, + ) + (capacity(gh.qn_du.memory_s) == 0) && return gh + update_hessian!(gh.qn_du, mp, st, p_old, k) + update_current_scale!(get_manifold(mp), get_iterate(st), gh) + return gh +end + + +""" + abstract type AbstractSegmentHessianUpdater end + +Abstract type for methods that calculate f' and f'' in the GCD calculation in subsequent +line segments in [`GeneralizedCauchyDirectionSubsolver`](@ref). +""" +abstract type AbstractSegmentHessianUpdater end + +""" + init_updater!(::AbstractManifold, hessian_segment_updater::AbstractSegmentHessianUpdater, p, d, ha::AbstractQuasiNewtonDirectionUpdate) + +Method for initialization of `AbstractSegmentHessianUpdater` `hessian_segment_updater` just before the loop +that examines subsequent intervals for GCD. 
+""" +init_updater!(::AbstractManifold, hessian_segment_updater::AbstractSegmentHessianUpdater, p, d, ha::AbstractQuasiNewtonDirectionUpdate) + +""" + struct GenericSegmentHessianUpdater{TX} <: AbstractSegmentHessianUpdater + +Generic f' and f'' calculation that only relies on `hessian_value` but is relatively slow for +high-dimensional domains. +""" +struct GenericSegmentHessianUpdater{TX} <: AbstractSegmentHessianUpdater + d_z::TX + d_tmp::TX +end + +function get_default_hessian_segment_updater(M::AbstractManifold, p, ::AbstractQuasiNewtonDirectionUpdate) + return GenericSegmentHessianUpdater(zero_vector(M, p), zero_vector(M, p)) +end + +function init_updater!(M::AbstractManifold, hessian_segment_updater::GenericSegmentHessianUpdater, p, d, ha::AbstractQuasiNewtonDirectionUpdate) + zero_vector!(M, hessian_segment_updater.d_z, p) + copyto!(M, hessian_segment_updater.d_tmp, d) + return hessian_segment_updater +end + +@doc raw""" + (upd::GenericSegmentHessianUpdater)(M::AbstractManifold, p, t::Real, dt::Real, b, db, ha::AbstractQuasiNewtonDirectionUpdate) + +Calculate Hessian values ``⟨e_b, B d_z⟩`` and ``⟨e_b, B d_tmp⟩`` for the generalized Cauchy +point line search using the generic approach via `hessian_value` with [`UnitVector`](@ref). +``d_z`` starts at 0 and is updated in place by adding `dt * d_tmp` to it. +""" +function (upd::GenericSegmentHessianUpdater)(M::AbstractManifold, p, t::Real, dt::Real, b, db, ha) + upd.d_z .+= dt .* upd.d_tmp + hv_eb_dz = hessian_value(ha, M, p, UnitVector(b), upd.d_z) + hv_eb_d = hessian_value(ha, M, p, UnitVector(b), upd.d_tmp) + + set_zero_at_index!(M, upd.d_tmp, b) + + return hv_eb_dz, hv_eb_d +end + +""" + struct LimitedMemorySegmentHessianUpdater{TV <: AbstractVector} <: AbstractSegmentHessianUpdater + +Hessian value calculation for generalized Cauchy direction line segments that is optimized for +[`QuasiNewtonLimitedMemoryBoxDirectionUpdate`](@ref). It relies on a specific Hessian structure. 
+""" +struct LimitedMemorySegmentHessianUpdater{TV <: AbstractVector} <: AbstractSegmentHessianUpdater + p_s::TV + p_y::TV + c_s::TV + c_y::TV +end + +function get_default_hessian_segment_updater(::AbstractManifold, p, ha::QuasiNewtonLimitedMemoryBoxDirectionUpdate) + return LimitedMemorySegmentHessianUpdater(similar(ha.qn_du.ρ), similar(ha.qn_du.ρ), similar(ha.qn_du.ρ), similar(ha.qn_du.ρ)) +end + +function init_updater!(M::AbstractManifold, hessian_segment_updater::LimitedMemorySegmentHessianUpdater, p, d, ha::QuasiNewtonLimitedMemoryBoxDirectionUpdate) + fill!(hessian_segment_updater.c_s, 0) + fill!(hessian_segment_updater.c_y, 0) + ii = 1 + for i in eachindex(ha.qn_du.ρ) + if iszero(ha.qn_du.ρ[i]) + continue + end + + hessian_segment_updater.p_s[ii] = ha.current_scale * inner(M, p, ha.qn_du.memory_s[i], d) + hessian_segment_updater.p_y[ii] = inner(M, p, ha.qn_du.memory_y[i], d) + ii += 1 + end + return hessian_segment_updater +end + +@doc raw""" + (hessian_segment_updater::LimitedMemorySegmentHessianUpdater)( + M::AbstractManifold, p, + t::Real, dt::Real, b, db, ha::QuasiNewtonLimitedMemoryBoxDirectionUpdate + ) + +Calculate Hessian values ``⟨e_b, B d_z⟩`` and ``⟨e_b, B d⟩`` for the generalized Cauchy +point line search using the limited-memory block Hessian stored in `ha`. +``d_z`` starts at 0 and is updated in place by adding `dt * d` to it. + +## Arguments + +- `M`: manifold. +- `p`: current iterate. +- `t`: current step length from `p`. +- `dt`: step length increment from the last step. +- `b`: bound index of the current segment. +- `db`: search direction component at the bound index `b`. + +The updater reuses cached coordinate projections in `hessian_segment_updater` to cheaply +evaluate Hessian quadratic forms via `hessian_value_from_inner_products`. 
+""" +function (hessian_segment_updater::LimitedMemorySegmentHessianUpdater)( + M::AbstractManifold, p, + t::Real, dt::Real, b, db, ha::QuasiNewtonLimitedMemoryBoxDirectionUpdate + ) + + m = length(ha.qn_du.memory_s) + num_nonzero_rho = count(!iszero, ha.qn_du.ρ) + + ii = 1 + for i in 1:m + iszero(ha.qn_du.ρ[i]) && continue + # setting _X to w_b from the paper + ha.buffer_inner_Yk_X[ii] = get_at_bound_index(M, ha.qn_du.memory_y[i], b) + ha.buffer_inner_Sk_X[ii] = ha.current_scale * get_at_bound_index(M, ha.qn_du.memory_s[i], b) + + ii += 1 + end + + buffer_inner_Yk_eb = view(ha.buffer_inner_Yk_X, 1:num_nonzero_rho) + buffer_inner_Sk_eb = view(ha.buffer_inner_Sk_X, 1:num_nonzero_rho) + + buffer_inner_cy = view(hessian_segment_updater.c_y, 1:num_nonzero_rho) + buffer_inner_cs = view(hessian_segment_updater.c_s, 1:num_nonzero_rho) + buffer_inner_py = view(hessian_segment_updater.p_y, 1:num_nonzero_rho) + buffer_inner_ps = view(hessian_segment_updater.p_s, 1:num_nonzero_rho) + + buffer_inner_cy .+= dt .* buffer_inner_py + buffer_inner_cs .+= dt .* buffer_inner_ps + + eb_B_z = hessian_value_from_inner_products(ha, t * db, buffer_inner_Yk_eb, buffer_inner_Sk_eb, buffer_inner_cy, buffer_inner_cs) + + eb_B_d = hessian_value_from_inner_products(ha, db, buffer_inner_Yk_eb, buffer_inner_Sk_eb, buffer_inner_py, buffer_inner_ps) + + buffer_inner_py .-= db .* buffer_inner_Yk_eb + buffer_inner_ps .-= db .* buffer_inner_Sk_eb + + return eb_B_z, eb_B_d +end + +struct ProductIndex{T <: Tuple} + ranges::T +end + +Base.iterate(itr::ProductIndex) = _iterate(itr.ranges, 1, nothing) +Base.iterate(itr::ProductIndex, state) = _iterate(itr.ranges, state...) 
+ +function _iterate(ranges, i, st) + i > length(ranges) && return nothing + if st === nothing + it = iterate(ranges[i]) + it === nothing && return _iterate(ranges, i + 1, nothing) + (j, st2) = it + return ((i, j), (i, st2)) + else + it = iterate(ranges[i], st) + if it === nothing + return _iterate(ranges, i + 1, nothing) + else + (j, st2) = it + return ((i, j), (i, st2)) + end + end +end + + +""" + to_coordinate_index(M::AbstractManifold, b::UnitVector, B::AbstractBasis) + +Get the index of the coordinate equal to 1 of [`UnitVector`](@ref) `b` with respect to +`AbstractBasis` `B`. +""" +to_coordinate_index(::AbstractManifold, b::UnitVector{Int}, ::AbstractBasis) = b.index +""" + to_coordinate_index(M::ProductManifold, b::UnitVector, B::AbstractBasis) + +Get the index of the coordinate equal to 1 of [`UnitVector`](@ref) `b` with respect to +`AbstractBasis` `B`. +""" +function to_coordinate_index(M::ProductManifold, b::UnitVector{Tuple{Int, Int}}, B::AbstractBasis) + i, j = b.index + offset = sum(k -> number_of_coordinates(M.manifolds[k], B), 1:(i - 1); init = 0) + return offset + j +end + +Base.length(itr::ProductIndex) = sum(length, itr.ranges) + + +""" + get_bounds_index(M::AbstractManifold) + +Get the bound indices of manifold `M`. Standard manifolds don't have bounds, so +the empty range `Base.OneTo(0)` is returned. +""" +get_bounds_index(M::AbstractManifold) = Base.OneTo(0) +function get_bounds_index(M::ProductManifold) + ranges = map(get_bounds_index, M.manifolds) + iter = ProductIndex(ranges) + return iter +end + +""" + get_stepsize_bound(M::AbstractManifold, p, d, i) + +Get the upper bound on moving in direction `d` from point `p` on manifold `M`, for the +bound index `i`. 
+""" +get_stepsize_bound(M::AbstractManifold, p, d, i) +function get_stepsize_bound(M::ProductManifold, p, d, i::Tuple{Int, Any}) + i1, i2 = i + return get_stepsize_bound(M.manifolds[i1], submanifold_component(M, p, i1), submanifold_component(M, d, i1), i2) +end + +""" + set_zero_at_index!(M::ProductManifold, d, i::Tuple{Int,Any}) + +Set the element of the `i[1]`th component of `d` at bound index `i[2]` to zero. +""" +function set_zero_at_index!(M::ProductManifold, d, i::Tuple{Int, Any}) + i1, i2 = i + set_zero_at_index!(M.manifolds[i1], submanifold_component(M, d, i1), i2) + return d +end + +""" + set_stepsize_bound!(M::AbstractManifold, d_out, p, d, t_current::Real) + +For each component at index `i` in the tangent vector `d_out`, if the stepsize bound in +direction `d` for that component is less than `t_current`, set the element of `d_out` to +the distance from `p[i]` to the bound in the direction of `d[i]`. If the stepsize bound is +non-positive, set the element to 0. + +By default it does not modify `d_out` because most manifolds don't have direction-specific +stepsize bounds, and general anisotropic bounds are handled differently. +""" +function set_stepsize_bound!(::AbstractManifold, d_out, p, d, t_current::Real) + return d_out +end + +function set_stepsize_bound!(M::ProductManifold, d_out, p, d, t_current::Real) + map( + (N, d_out_c, p_c, d_c) -> set_stepsize_bound!(N, d_out_c, p_c, d_c, t_current), + M.manifolds, + submanifold_components(M, d_out), + submanifold_components(M, p), + submanifold_components(M, d), + ) + return d_out +end + +@doc raw""" + GeneralizedCauchyDirectionSubsolver{TM <: AbstractManifold, TP, T_HA <: AbstractQuasiNewtonDirectionUpdate, TFU <: AbstractSegmentHessianUpdater} + +Helper container for generalized Cauchy direction search. 
Stores the manifold `M`, cached +original descent direction (`d_original`), the quasi-Newton direction update `ha`, and the +`hessian_segment_updater`, which computes certain values of the Hessian while advancing segments. +Instances are reused across segments during [`find_generalized_cauchy_direction!`](@ref) to +avoid allocations. +""" +struct GeneralizedCauchyDirectionSubsolver{ + TX, + T_HA <: AbstractQuasiNewtonDirectionUpdate, TFU <: AbstractSegmentHessianUpdater, TFT <: Tuple{<:Real, Any}, TBI, + TO <: Base.Order.Ordering, + } + d_original::TX + ha::T_HA + hessian_segment_updater::TFU + F_list::Vector{TFT} + bounds_indices::TBI + ordering::TO +end + +function GeneralizedCauchyDirectionSubsolver( + M::AbstractManifold, p, ha::AbstractQuasiNewtonDirectionUpdate; + hessian_segment_updater::AbstractSegmentHessianUpdater = get_default_hessian_segment_updater(M, p, ha) + ) + bounds_indices = get_bounds_index(M) + TInd = eltype(bounds_indices) + TF = number_eltype(p) + F_list = Tuple{TF, TInd}[] + sizehint!(F_list, length(bounds_indices) + 1) + ordering = Base.By(first) + return GeneralizedCauchyDirectionSubsolver( + zero_vector(M, p), ha, + hessian_segment_updater, F_list, bounds_indices, ordering + ) +end + +function collect_isotropic_limits!(::AbstractManifold, ::Vector{<:Tuple{TF, Any}}, p, d)::Tuple{Bool, TF} where {TF <: Real} + return false, convert(TF, Inf) +end + +function collect_isotropic_limits!(M::ProductManifold, F_list::Vector{<:Tuple{TF, Any}}, p, d)::Tuple{Bool, TF} where {TF <: Real} + has_finite_limit = false + smallest_positive_limit = Inf + map(M.manifolds, submanifold_components(M, p), submanifold_components(M, d)) do Mi, p_i, d_i + if !has_anisotropic_max_stepsize(Mi) + max_step = Manopt.max_stepsize(Mi, p_i) + if isfinite(max_step) + tms = max_step / norm(Mi, p_i, d_i) + push!(F_list, (tms, -1)) + has_finite_limit = true + if tms < smallest_positive_limit + smallest_positive_limit = tms + end + end + end + end + return has_finite_limit, 
smallest_positive_limit +end + +""" + find_generalized_cauchy_direction!( + M::AbstractManifold, + gcd::GeneralizedCauchyDirectionSubsolver, d_out, p, d, X + ) + +Find generalized Cauchy direction looking from point `p` on manifold `M` in direction `d` +and save it to `d_out`. Gradient of the objective at `p` is `X`. + +The function returns a pair (status, max_stepsize) where `status` is a symbol describing +the result of the search, and `max_stepsize` is the maximum stepsize that can be taken in +the direction `d_out`. + +The `status` can be one of the following: +* `:found_limited` if the point was found and we can perform a step of length at most 1 + in direction `d_out` afterwards, +* `:found_unlimited` if the point was found and we can perform a step of length at most + `max_stepsize(M, p)` in direction `d_out` afterwards, +* `:not_found` if the search cannot be performed in direction `d`. +""" +function find_generalized_cauchy_direction!( + M::AbstractManifold, + gcd::GeneralizedCauchyDirectionSubsolver{ + <:Any, <:AbstractQuasiNewtonDirectionUpdate, + <:AbstractSegmentHessianUpdater, <:Tuple{TF, Any}, + }, + d_out, p, d, X + ) where {TF <: Real} + copyto!(M, gcd.d_original, d) + copyto!(M, d_out, d) + + ordering = gcd.ordering + F_list = gcd.F_list + empty!(F_list) + + bounds_indices = gcd.bounds_indices + + # isotropic limits + has_finite_limit, smallest_positive_limit = collect_isotropic_limits!(M, F_list, p, d) + # anisotropic limits + for i in bounds_indices + sbi = get_stepsize_bound(M, p, d, i)::TF + + if sbi > 0 + push!(F_list, (sbi, i)) + if sbi < smallest_positive_limit + smallest_positive_limit = sbi + end + end + has_finite_limit |= isfinite(sbi) + end + + # In this case we can't move in the direction `d` at all, though it's usually not + # a problem relevant to the end user because it can be handled by step_solver! that + # uses the GCD subsolver. 
+ if isempty(F_list) + return (:not_found, NaN) + end + heapify!(F_list, ordering) + + f_prime = inner(M, p, X, d) + f_double_prime = hessian_value_diag(gcd.ha, M, p, d) + + if iszero(f_prime) || iszero(f_double_prime) + return (:not_found, NaN) + end + + dt_min = -f_prime / f_double_prime + t_old = 0.0 + + t_current, b = heappop!(F_list, ordering) + dt = t_current - t_old + + init_updater!(M, gcd.hessian_segment_updater, p, d, gcd.ha) + # b can be -1 if it corresponds to the max stepsize limit on the manifold part + while dt_min > dt && b != -1 + db = get_at_bound_index(M, d, b)::TF + gb = get_at_bound_index(M, X, b)::TF + + hv_eb_dz, hv_eb_d = gcd.hessian_segment_updater(M, p, t_current, dt, b, db, gcd.ha)::Tuple{TF, TF} + + f_prime += dt * f_double_prime - db * (gb + hv_eb_dz) + f_double_prime += (2 * -db * hv_eb_d) + db^2 * hessian_value_diag(gcd.ha, M, p, UnitVector(b)) + + t_old = t_current + + # If f_prime is 0, we've found the local minimizer (GCD) + if iszero(f_prime) || iszero(f_double_prime) + # The GCD then lies at t_current (now stored in t_old), so set dt_min to 0 (stay at that point) + dt_min = 0.0 + break + end + + dt_min = -f_prime / f_double_prime + isempty(F_list) && break + + t_current, b = heappop!(F_list, ordering) + dt = t_current - t_old + end + + dt_min = max(dt_min, 0.0) + t_old = t_old + dt_min + d_out .*= t_old + # by construction, there is no bound achievable before stepsize 1.0 in direction d_out + # the first bound after that is reached at smallest_positive_limit / t_old + max_feasible_stepsize = max(1.0, smallest_positive_limit / t_old) + + set_stepsize_bound!(M, d_out, p, gcd.d_original, t_old) + if has_finite_limit + return (:found_limited, max_feasible_stepsize) + else + return (:found_unlimited, Inf) + end +end + +""" + struct MaxStepsizeInDirectionSubsolver end + +Helper container for finding the maximum stepsize in a direction. 
Stores the +container for the list of bounds `F_list` and the bound indices. + +## Constructor + + MaxStepsizeInDirectionSubsolver(M::AbstractManifold, p) + +Initialize the `MaxStepsizeInDirectionSubsolver` for manifold `M` and point `p`. The `F_list` +is initialized to be empty and will be populated during the search for the maximum stepsize +in a direction. The floating point type of the bounds stored in `F_list` is determined by the +number type of `p`. + +The `MaxStepsizeInDirectionSubsolver` can be reused for multiple different points and +directions on the same manifold, but it is not thread-safe. +""" +struct MaxStepsizeInDirectionSubsolver{TFT <: Tuple{<:Real, Any}, TBI} + F_list::Vector{TFT} + bounds_indices::TBI +end +function MaxStepsizeInDirectionSubsolver(M::AbstractManifold, p) + bounds_indices = get_bounds_index(M) + TInd = eltype(bounds_indices) + TF = number_eltype(p) + F_list = Tuple{TF, TInd}[] + sizehint!(F_list, length(bounds_indices) + 1) + return MaxStepsizeInDirectionSubsolver{Tuple{TF, TInd}, typeof(bounds_indices)}(F_list, bounds_indices) +end + +""" + find_max_stepsize_in_direction(M::AbstractManifold, sdf::MaxStepsizeInDirectionSubsolver, p, d) + +Find the maximum stepsize that can be performed from point `p` in direction `d`. + +The function returns a pair (status, max_stepsize) where `status` is a symbol describing +the result of the search, and `max_stepsize` is the maximum stepsize that can be taken in +the direction `d`. + +The `status` can be one of the following: +* `:found_limited` if a finite stepsize bound exists; `max_stepsize` is then the smallest + positive bound in direction `d`, +* `:found_unlimited` if no finite stepsize bound exists; `max_stepsize` is then `Inf`, +* `:not_found` if no positive stepsize bound was collected, so no step can be taken in direction `d`. 
+""" +function find_max_stepsize_in_direction( + M::AbstractManifold, + sdf::MaxStepsizeInDirectionSubsolver{<:Tuple{TF, Any}}, + p, d + ) where {TF <: Real} + + F_list = sdf.F_list + empty!(F_list) + bounds_indices = sdf.bounds_indices + + # isotropic limits + has_finite_limit, smallest_positive_limit = collect_isotropic_limits!(M, F_list, p, d) + # anisotropic limits + for i in bounds_indices + sbi = get_stepsize_bound(M, p, d, i)::TF + + if sbi > 0 + push!(F_list, (sbi, i)) + if sbi < smallest_positive_limit + smallest_positive_limit = sbi + end + end + has_finite_limit |= isfinite(sbi) + end + + if isempty(F_list) + return (:not_found, NaN) + end + if has_finite_limit + return (:found_limited, smallest_positive_limit) + else + return (:found_unlimited, Inf) + end + +end + +function show(io::IO, qns::QuasiNewtonLimitedMemoryBoxDirectionUpdate) + print(io, "QuasiNewtonLimitedMemoryBoxDirectionUpdate with internal state:\n") + return print(io, qns.qn_du) +end diff --git a/src/plans/bundle_plan.jl b/src/plans/bundle_plan.jl index 96db9f76d6..3f3c1e230b 100644 --- a/src/plans/bundle_plan.jl +++ b/src/plans/bundle_plan.jl @@ -140,20 +140,20 @@ function get_reason(sc::StopWhenLagrangeMultiplierLess) return "" end -function status_summary(sc::StopWhenLagrangeMultiplierLess) +function status_summary(sc::StopWhenLagrangeMultiplierLess; context::Symbol = :default) s = (sc.at_iteration >= 0) ? "reached" : "not reached" msg = "Lagrange multipliers" isnothing(sc.names) && (msg *= " with tolerances $(sc.tolerances)") if !isnothing(sc.names) msg *= join(["$si < $bi" for (si, bi) in zip(sc.names, sc.tolerances)], ", ") end - return "$(msg) :\t$(s)" + return (_is_inline(context) ? "" : "A stopping criterion to stop when the Lagrange multipliers are less than $(sc.tolerances).\n$(_MANOPT_INDENT)") * "$(msg):$(_MANOPT_INDENT)$(s)" end function show(io::IO, sc::StopWhenLagrangeMultiplierLess) n = isnothing(sc.names) ? 
"" : ", $(names)" return print( io, - "StopWhenLagrangeMultiplierLess($(sc.tolerances); mode=:$(sc.mode)$n)\n $(status_summary(sc))", + "StopWhenLagrangeMultiplierLess($(sc.tolerances); mode=:$(sc.mode)$n)", ) end @@ -181,6 +181,12 @@ mutable struct DebugWarnIfLagrangeMultiplierIncreases <: DebugAction return new(warn, Float64(Inf), tol) end end -function show(io::IO, di::DebugWarnIfLagrangeMultiplierIncreases) - return print(io, "DebugWarnIfLagrangeMultiplierIncreases(; tol=\"$(di.tol)\")") +function show(io::IO, d::DebugWarnIfLagrangeMultiplierIncreases) + m = (d.status === :No ? "" : ":$(d.status)") + return print(io, "DebugWarnIfLagrangeMultiplierIncreases($(m); tol=\"$(d.tol)\")") +end +function status_summary(d::DebugWarnIfLagrangeMultiplierIncreases; context::Symbol = :default) + (context === :short) && return repr(d) + m = (d.status === :Once) ? "once" : (d.status === :No ? "(inactive)" : "") + return "a DebugAction warning if the lagange multiplier increases in an iteration $m." end diff --git a/src/plans/cache.jl b/src/plans/cache.jl index b6a4d43d00..d90d5734e2 100644 --- a/src/plans/cache.jl +++ b/src/plans/cache.jl @@ -17,7 +17,7 @@ It otherwise does call the original differential. This simple cache does not take into account, that some first order objectives have a common function for cost & grad. It only caches the function that is actually called. -# Constructor +# Constructors SimpleManifoldCachedObjective(M::AbstractManifold, obj::AbstractManifoldFirstOrderObjective; kwargs...) @@ -28,6 +28,13 @@ common function for cost & grad. It only caches the function that is actually ca see also `initialize=` * `c=[`get_cost`](@ref)`(M, obj, p)` or `0.0`: a value to store the cost function in `initialize` * `initialized=true`: whether to initialize the cached `X` and `c` or not. + +where both for `p` and `X` copies are generated before they are stored. 
+ + SimpleManifoldCachedObjective(obj::AbstractManifoldFirstOrderObjective, p, X, c; initialized = false) + +Similar to the above, but initialises all fields directly and without copies; `initialized` indicates whether +the three values correspond to an evaluation from `obj`. """ mutable struct SimpleManifoldCachedObjective{ E <: AbstractEvaluationType, O <: AbstractManifoldObjective{E}, P, T, C, @@ -41,16 +48,22 @@ mutable struct SimpleManifoldCachedObjective{ end function SimpleManifoldCachedObjective( - M::AbstractManifold, - obj::O; - initialized = true, - p = rand(M), + M::AbstractManifold, obj::O; + initialized = true, p = rand(M), X = initialized ? get_gradient(M, obj, p) : zero_vector(M, p), c = initialized ? get_cost(M, obj, p) : 0.0, ) where {E <: AbstractEvaluationType, O <: AbstractManifoldObjective{E}} q = copy(M, p) - return SimpleManifoldCachedObjective{E, O, typeof(q), typeof(X), typeof(c)}( - obj, q, X, initialized, c, initialized + return SimpleManifoldCachedObjective( + obj, q, X, c; initialized = initialized + ) +end + +function SimpleManifoldCachedObjective( + obj::O, p, X, c; initialized::Bool = false + ) where {E <: AbstractEvaluationType, O <: AbstractManifoldObjective{E}} + return SimpleManifoldCachedObjective{E, O, typeof(p), typeof(X), typeof(c)}( + obj, p, X, initialized, c, initialized ) end @@ -161,6 +174,15 @@ function get_gradient_function( return (M, X, p) -> get_gradient!(M, X, sco, p) end +function Base.show(io::IO, smco::SimpleManifoldCachedObjective) + print(io, "SimpleManifoldCachedObjective(") + print(io, smco.objective); print(io, ", ") + print(io, smco.p); print(io, ", ") + print(io, smco.X); print(io, ", ") + print(io, smco.c) + return print(io, "; initialized = $(smco.X_valid && smco.c_valid))") +end + # # ManifoldCachedObjective constructor which errors by default # since LRUCache.jl extension is required @@ -756,11 +778,7 @@ function get_grad_inequality_constraint!( return X end function get_grad_inequality_constraint!( - 
M::AbstractManifold, - X, - co::ManifoldCachedObjective, - p, - i, + M::AbstractManifold, X, co::ManifoldCachedObjective, p, i, range::Union{AbstractPowerRepresentation, Nothing} = NestedPowerRepresentation(), ) key = copy(M, p) @@ -807,8 +825,7 @@ end function get_hessian(M::AbstractManifold, co::ManifoldCachedObjective, p, X) !(haskey(co.cache, :Hessian)) && return get_hessian(M, co.objective, p, X) return copy( - M, - p, + M, p, get!(co.cache[:Hessian], (copy(M, p), copy(M, p, X))) do get_hessian(M, co.objective, p, X) end, @@ -817,9 +834,7 @@ end function get_hessian!(M::AbstractManifold, Y, co::ManifoldCachedObjective, p, X) !(haskey(co.cache, :Hessian)) && return get_hessian!(M, Y, co.objective, p, X) copyto!( - M, - Y, - p, # perform an in-place cache evaluation, see also `get_gradient!` + M, Y, p, # perform an in-place cache evaluation, see also `get_gradient!` get!(co.cache[:Hessian], (copy(M, p), copy(M, p, X))) do get_hessian!(M, Y, co.objective, p, X) copy(M, p, Y) #store a copy of Y @@ -843,8 +858,7 @@ end function get_preconditioner(M::AbstractManifold, co::ManifoldCachedObjective, p, X) !(haskey(co.cache, :Preconditioner)) && return get_preconditioner(M, co.objective, p, X) return copy( - M, - p, + M, p, get!(co.cache[:Preconditioner], (copy(M, p), copy(M, p, X))) do get_preconditioner(M, co.objective, p, X) end, @@ -854,9 +868,7 @@ function get_preconditioner!(M::AbstractManifold, Y, co::ManifoldCachedObjective !(haskey(co.cache, :Preconditioner)) && return get_preconditioner!(M, Y, co.objective, p, X) copyto!( - M, - Y, - p, # perform an in-place cache evaluation, see also `get_gradient!` + M, Y, p, # perform an in-place cache evaluation, see also `get_gradient!` get!(co.cache[:Preconditioner], (copy(M, p), copy(M, p, X))) do get_preconditioner!(M, Y, co.objective, p, X) copy(M, p, Y) @@ -880,8 +892,7 @@ function get_proximal_map!(M::AbstractManifold, q, co::ManifoldCachedObjective, !(haskey(co.cache, :ProximalMap)) && return get_proximal_map!(M, 
q, co.objective, λ, p, i) copyto!( - M, - q, + M, q, get!(co.cache[:ProximalMap], (copy(M, p), λ, i)) do get_proximal_map!(M, q, co.objective, λ, p, i) #compute in-place of q copy(M, q) #store copy of q @@ -894,8 +905,7 @@ end function get_gradient(M::AbstractManifold, co::ManifoldCachedObjective, p, i) !(haskey(co.cache, :StochasticGradient)) && return get_gradient(M, co.objective, p, i) return copy( - M, - p, + M, p, get!(co.cache[:StochasticGradient], (copy(M, p), i)) do get_gradient(M, co.objective, p, i) end, @@ -905,9 +915,7 @@ function get_gradient!(M::AbstractManifold, X, co::ManifoldCachedObjective, p, i !(haskey(co.cache, :StochasticGradient)) && return get_gradient!(M, X, co.objective, p, i) copyto!( - M, - X, - p, + M, X, p, get!(co.cache[:StochasticGradient], (copy(M, p), i)) do # This evaluates in place of X get_gradient!(M, X, co.objective, p, i) @@ -920,8 +928,7 @@ end function get_gradients(M::AbstractManifold, co::ManifoldCachedObjective, p) !(haskey(co.cache, :StochasticGradients)) && return get_gradients(M, co.objective, p) return copy.( - Ref(M), - Ref(p), + Ref(M), Ref(p), get!(co.cache[:StochasticGradients], copy(M, p)) do get_gradients(M, co.objective, p) end, @@ -946,8 +953,7 @@ end function get_subgradient(M::AbstractManifold, co::ManifoldCachedObjective, p) !(haskey(co.cache, :SubGradient)) && return get_subgradient(M, co.objective, p) return copy( - M, - p, + M, p, get!(co.cache[:SubGradient], copy(M, p)) do get_subgradient(M, co.objective, p) end, @@ -956,9 +962,7 @@ end function get_subgradient!(M::AbstractManifold, X, co::ManifoldCachedObjective, p) !(haskey(co.cache, :SubGradient)) && return get_subgradient!(M, X, co.objective, p) copyto!( - M, - X, - p, # perform an in-place cache evaluation, see also `get_gradient!` + M, X, p, # perform an in-place cache evaluation, see also `get_gradient!` get!(co.cache[:SubGradient], copy(M, p)) do get_subgradient!(M, X, co.objective, p) copy(M, p, X) @@ -973,8 +977,7 @@ function 
get_subtrahend_gradient(M::AbstractManifold, co::ManifoldCachedObjectiv !(haskey(co.cache, :SubtrahendGradient)) && return get_subtrahend_gradient(M, co.objective, p) return copy( - M, - p, + M, p, get!(co.cache[:SubtrahendGradient], copy(M, p)) do get_subtrahend_gradient(M, co.objective, p) end, @@ -984,8 +987,7 @@ function get_subtrahend_gradient!(M::AbstractManifold, X, co::ManifoldCachedObje !(haskey(co.cache, :SubtrahendGradient)) && return get_subtrahend_gradient!(M, X, co.objective, p) copyto!( - M, - X, + M, X, p, # perform an in-place cache evaluation, see also `get_gradient!` get!(co.cache[:SubtrahendGradient], copy(M, p)) do get_subtrahend_gradient!(M, X, co.objective, p) @@ -1049,13 +1051,10 @@ function objective_cache_factory(M, o, cache::Tuple{Symbol, <:AbstractArray}) (cache[1] === :LRU) && return ManifoldCachedObjective(M, o, cache[2]) return o end -function show(io::IO, smco::SimpleManifoldCachedObjective{E}) where {E} - return print(io, "SimpleManifoldCachedObjective{$E,$(smco.objective)}") -end function show( io::IO, t::Tuple{<:SimpleManifoldCachedObjective, S} ) where {S <: AbstractManoptSolverState} - return print(io, "$(t[2])\n\n$(status_summary(t[1]))") + return print(io, "$(status_summary(t[2]))\n\n$(status_summary(t[1]))") end function show(io::IO, mco::ManifoldCachedObjective) return print(io, "$(status_summary(mco))") @@ -1065,21 +1064,28 @@ function show( ) where {S <: AbstractManoptSolverState} return print(io, "$(t[2])\n\n$(status_summary(t[1]))") end - -function status_summary(smco::SimpleManifoldCachedObjective) +function status_summary(smco::SimpleManifoldCachedObjective; context::Symbol = :default) + (context === :short) && (return repr(smco)) + (context === :inline) && (return "A simple cache objective caching one p, X, and c for $(status_summary(smco.objective; context = context))") s = """ ## Cache - A `SimpleManifoldCachedObjective` to cache one point and one tangent vector for the iterate and gradient, respectively + A 
`SimpleManifoldCachedObjective` to cache one point, one tangent vector, and real number + for the iterate, the gradient, and the cost function, respectively. + + At the current iterate + * the tangent vector is cached:$(_MANOPT_INDENT)$(smco.X_valid ? "Yes" : "No") + * the cost is cached:$(_MANOPT_INDENT)$(smco.c_valid ? "Yes" : "No") """ - s2 = status_summary(smco.objective) - length(s2) > 0 && (s2 = "\n$(s2)") - return "$(s)$(s2)" + s2 = status_summary(smco.objective; context = context) + length(s2) > 0 && (s2 = "$(s2)\n\n") + return "$(s2)$(s)" end -function status_summary(mco::ManifoldCachedObjective) +function status_summary(mco::ManifoldCachedObjective; context::Symbol = :default) + _is_inline(context) && (return repr(mco)) s = "## Cache\n" - s2 = status_summary(mco.objective) - (length(s2) > 0) && (s2 = "\n$(s2)") - length(mco.cache) == 0 && return "$(s) No caches active\n$(s2)" + s2 = status_summary(mco.objective; context = context) + (length(s2) > 0) && (s2 = "$(s2)\n\n") + length(mco.cache) == 0 && return "$(s2)$(s) No caches active" longest_key_length = max(length.(["$k" for k in keys(mco.cache)])...) cache_strings = [ " * :" * @@ -1087,5 +1093,5 @@ function status_summary(mco::ManifoldCachedObjective) " : $(v.currentsize)/$(v.maxsize) entries of type $(valtype(v)) used" for (k, v) in zip(keys(mco.cache), values(mco.cache)) ] - return "$(s)$(join(cache_strings, "\n"))\n$s2" + return "$(s2)$(s)$(join(cache_strings, "\n"))\n" end diff --git a/src/plans/conjugate_gradient_plan.jl b/src/plans/conjugate_gradient_plan.jl index 2efb22fcd2..b18374eb36 100644 --- a/src/plans/conjugate_gradient_plan.jl +++ b/src/plans/conjugate_gradient_plan.jl @@ -47,7 +47,7 @@ and `δ` is initialized to a copy of this vector. ## Keyword arguments -The following fields from above Y` performing the matrix vector multiplication in the tangent space, + and a function `b(M,p)` returning the vector on the right hand side in the current tangent space. 
+ Both can also be defined in-place. Here they are $(E === InplaceEvaluation ? "in place" : "allocating"). + + # Fields + * A: $(slso.A!!) + * b: $(slso.b!!)""" +end @doc """ get_cost(TpM::TangentSpace, slso::SymmetricLinearSystemObjective, X) @@ -219,33 +238,24 @@ mutable struct ConjugateResidualState{T, R, TStop <: StoppingCriterion} <: α::R β::R stop::TStop + function ConjugateResidualState(; + X::T, r::T, d::T, Ar::T, Ad::T, α::R, β::R, rAr::R, stopping_criterion::SC, + ) where {T, R, SC <: StoppingCriterion} + crs = new{T, R, SC}() + crs.X = X; crs.r = r; crs.d = d; crs.Ar = Ar; crs.Ad = Ad + crs.α = α; crs.β = β; crs.rAr = rAr; crs.stop = stopping_criterion + return crs + end function ConjugateResidualState( TpM::TangentSpace, slso::SymmetricLinearSystemObjective; - X::T = rand(TpM), - r::T = (-get_gradient(TpM, slso, X)), - d::T = copy(TpM, r), - Ar::T = get_hessian(TpM, slso, X, r), - Ad::T = copy(TpM, Ar), - α::R = 0.0, - β::R = 0.0, - stopping_criterion::SC = StopAfterIteration(manifold_dimension(TpM)) | - StopWhenGradientNormLess(1.0e-8), + X::T = rand(TpM), r::T = (-get_gradient(TpM, slso, X)), d::T = copy(TpM, r), + Ar::T = get_hessian(TpM, slso, X, r), Ad::T = copy(TpM, Ar), α::Real = 0.0, β::Real = 0.0, + stopping_criterion::SC = StopAfterIteration(manifold_dimension(TpM)) | StopWhenGradientNormLess(1.0e-8), kwargs..., - ) where {T, R, SC <: StoppingCriterion} - M = base_manifold(TpM) - p = base_point(TpM) - crs = new{T, R, SC}() - crs.X = X - crs.r = r - crs.d = d - crs.Ar = Ar - crs.Ad = Ad - crs.α = α - crs.β = β - crs.rAr = zero(R) - crs.stop = stopping_criterion - return crs + ) where {T, SC <: StoppingCriterion} + R = promote_type(typeof(α), typeof(β)) + return ConjugateResidualState(; X = X, r = r, d = d, Ar = Ar, Ad = Ad, α = α, β = β, rAr = zero(R), stopping_criterion = stopping_criterion) end end @@ -261,10 +271,11 @@ function set_gradient!(crs::ConjugateResidualState, ::AbstractManifold, r) return crs end -function show(io::IO, 
crs::ConjugateResidualState) +function status_summary(crs::ConjugateResidualState; context::Symbol = :default) i = get_count(crs, :Iterations) Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(crs.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(crs)) – $(Iter) $(has_converged(crs) ? "(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Conjugate Residual Method $Iter @@ -273,11 +284,18 @@ function show(io::IO, crs::ConjugateResidualState) * β: $(crs.β) ## Stopping criterion - $(status_summary(crs.stop)) - + $(_in_str(status_summary(crs.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv """ - return print(io, s) + return s +end + +function Base.show(io::IO, crs::ConjugateResidualState) + print(io, "ConjugateResidualState(;") + print(io, " X = ", crs.X, ", d = ", crs.d, ", r = ", crs.r, ", α = ", crs.α, ", β = ", crs.β) + print(io, ", Ar = ", crs.Ar, ", Ad = ", crs.Ad, ", rAr = ", crs.rAr) + print(io, ", stopping_criterion = ", status_summary(crs.stop; context = :short)) + return print(io, ")") end # @@ -350,15 +368,12 @@ function get_reason(swrr::StopWhenRelativeResidualLess) end return "" end -function status_summary(swrr::StopWhenRelativeResidualLess) +function status_summary(swrr::StopWhenRelativeResidualLess; context::Symbol = :default) has_stopped = (swrr.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "‖r^(k)‖ / c < ε:\t$s" + return _is_inline(context) ? 
"‖r^(k)‖ / c < ε:$(_MANOPT_INDENT)$s" : "A stopping criterion to stop when the relative residual is less than the threshold of $(swrr.ε)\n$(_MANOPT_INDENT)$s" end indicates_convergence(::StopWhenRelativeResidualLess) = true function show(io::IO, swrr::StopWhenRelativeResidualLess) - return print( - io, - "StopWhenRelativeResidualLess($(swrr.c), $(swrr.ε))\n $(status_summary(swrr))", - ) + return print(io, "StopWhenRelativeResidualLess($(swrr.c), $(swrr.ε))") end diff --git a/src/plans/constrained_plan.jl b/src/plans/constrained_plan.jl index a835588d50..2c88b1ab57 100644 --- a/src/plans/constrained_plan.jl +++ b/src/plans/constrained_plan.jl @@ -160,15 +160,8 @@ function _number_of_constraints( end function ConstrainedManifoldObjective( - f, - grad_f, - g, - grad_g, - h, - grad_h; - hess_f = nothing, - hess_g = nothing, - hess_h = nothing, + f, grad_f, g, grad_g, h, grad_h; + hess_f = nothing, hess_g = nothing, hess_h = nothing, evaluation::AbstractEvaluationType = AllocatingEvaluation(), equality_type::AbstractVectorialType = _vector_function_type_hint(h), equality_gradient_type::AbstractVectorialType = _vector_function_type_hint(grad_h), @@ -178,9 +171,7 @@ function ConstrainedManifoldObjective( inequality_hessian_type::AbstractVectorialType = _vector_function_type_hint(hess_g), equality_constraints::Union{Integer, Nothing} = nothing, inequality_constraints::Union{Integer, Nothing} = nothing, - M::Union{AbstractManifold, Nothing} = nothing, - p = isnothing(M) ? nothing : rand(M), - atol = 0, + M::Union{AbstractManifold, Nothing} = nothing, p = isnothing(M) ? 
nothing : rand(M), atol = 0, ) if isnothing(hess_f) objective = ManifoldGradientObjective(f, grad_f; evaluation = evaluation) @@ -194,37 +185,22 @@ function ConstrainedManifoldObjective( if isnothing(equality_constraints) # try to guess num_eq = _number_of_constraints( - h, - grad_h; - function_type = equality_type, - jacobian_type = equality_gradient_type, - M = M, - p = p, + h, grad_h; + function_type = equality_type, jacobian_type = equality_gradient_type, + M = M, p = p, ) end # if it is still < 0, this can not be used - (num_eq < 0) && error( - "Please specify a positive number of `equality_constraints` (provided $(equality_constraints))", - ) + (num_eq < 0) && error("Please specify a positive number of `equality_constraints` (provided $(equality_constraints))") if isnothing(hess_h) eq = VectorGradientFunction( - h, - grad_h, - num_eq; - evaluation = evaluation, - function_type = equality_type, - jacobian_type = equality_gradient_type, + h, grad_h, num_eq; evaluation = evaluation, + function_type = equality_type, jacobian_type = equality_gradient_type, ) else eq = VectorHessianFunction( - h, - grad_h, - hess_h, - num_eq; - evaluation = evaluation, - function_type = equality_type, - jacobian_type = equality_gradient_type, - hessian_type = equality_hessian_type, + h, grad_h, hess_h, num_eq; evaluation = evaluation, + function_type = equality_type, jacobian_type = equality_gradient_type, hessian_type = equality_hessian_type, ) end end @@ -235,37 +211,22 @@ function ConstrainedManifoldObjective( if isnothing(inequality_constraints) # try to guess num_ineq = _number_of_constraints( - g, - grad_g; - function_type = inequality_type, - jacobian_type = inequality_gradient_type, - M = M, - p = p, + g, grad_g; + function_type = inequality_type, jacobian_type = inequality_gradient_type, + M = M, p = p, ) end # if it is still < 0, this can not be used - (num_ineq < 0) && error( - "Please specify a positive number of `inequality_constraints` (provided 
$(inequality_constraints))", - ) + (num_ineq < 0) && error("Please specify a positive number of `inequality_constraints` (provided $(inequality_constraints))") if isnothing(hess_g) ineq = VectorGradientFunction( - g, - grad_g, - num_ineq; - evaluation = evaluation, - function_type = inequality_type, - jacobian_type = inequality_gradient_type, + g, grad_g, num_ineq; evaluation = evaluation, + function_type = inequality_type, jacobian_type = inequality_gradient_type, ) else ineq = VectorHessianFunction( - g, - grad_g, - hess_g, - num_ineq; - evaluation = evaluation, - function_type = inequality_type, - jacobian_type = inequality_gradient_type, - hessian_type = inequality_hessian_type, + g, grad_g, hess_g, num_ineq; evaluation = evaluation, + function_type = inequality_type, jacobian_type = inequality_gradient_type, hessian_type = inequality_hessian_type, ) end end @@ -274,11 +235,8 @@ function ConstrainedManifoldObjective( ) end function ConstrainedManifoldObjective( - objective::MO; - equality_constraints::EMO = nothing, - inequality_constraints::IMO = nothing, - atol = 0, - kwargs..., + objective::MO; atol = 0, + equality_constraints::EMO = nothing, inequality_constraints::IMO = nothing, kwargs..., ) where {E <: AbstractEvaluationType, MO <: AbstractManifoldObjective{E}, IMO, EMO} if isnothing(equality_constraints) && isnothing(inequality_constraints) throw( @@ -304,6 +262,34 @@ function ConstrainedManifoldObjective( return ConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h; kwargs...) 
end +function status_summary(cmo::ConstrainedManifoldObjective; context::Symbol = :default) + _is_inline(context) && (return "A constrained objective based on $(status_summary(cmo.objective; context = context)) with $(length(cmo.equality_constraints)) equality and $(length(cmo.inequality_constraints)) inequality constraints.") + s = status_summary(cmo.objective; context = context) + return """ + A constrained objective with $(length(cmo.equality_constraints)) equality and $(length(cmo.inequality_constraints)) inequality constraints. + For verifications, the inequalities are checked with an absolute tolerance of `atol = $(cmo.atol)` + + ## Unconstrained Objective + $(_in_str(s)) + + ## Equality constraints + $(_in_str(status_summary(cmo.equality_constraints; context = context))) + + ## Inequality constraints + $(_in_str(status_summary(cmo.inequality_constraints; context = context)))""" +end +function show(io::IO, cmo::ConstrainedManifoldObjective) + print(io, "ConstrainedManifoldObjective("); print(io, cmo.objective) + print(io, "; atol = ") + print(io, cmo.atol) + if !isnothing(cmo.equality_constraints) + print(io, "; equality_constraints = "); print(io, cmo.equality_constraints) + end + if !isnothing(cmo.inequality_constraints) + print(io, "; inequality_constraints = "); print(io, cmo.inequality_constraints) + end + return print(io, ")") +end @doc """ ConstrainedManoptProblem{ TM <: AbstractManifold, @@ -356,12 +342,9 @@ Creates a constrained Manopt problem specifying an [`AbstractPowerRepresentation for both the `gradient_equality_range` and the `gradient_inequality_range`, respectively. 
""" struct ConstrainedManoptProblem{ - TM <: AbstractManifold, - O <: AbstractManifoldObjective, - HR <: Union{AbstractPowerRepresentation, Nothing}, - GR <: Union{AbstractPowerRepresentation, Nothing}, - HHR <: Union{AbstractPowerRepresentation, Nothing}, - GHR <: Union{AbstractPowerRepresentation, Nothing}, + TM <: AbstractManifold, O <: AbstractManifoldObjective, + HR <: Union{AbstractPowerRepresentation, Nothing}, GR <: Union{AbstractPowerRepresentation, Nothing}, + HHR <: Union{AbstractPowerRepresentation, Nothing}, GHR <: Union{AbstractPowerRepresentation, Nothing}, } <: AbstractManoptProblem{TM} manifold::TM grad_equality_range::HR @@ -372,33 +355,52 @@ struct ConstrainedManoptProblem{ end function ConstrainedManoptProblem( - M::TM, - objective::O; + M::TM, objective::O; range::AbstractPowerRepresentation = NestedPowerRepresentation(), - gradient_equality_range::HR = range, - gradient_inequality_range::GR = range, - hessian_equality_range::HHR = range, - hessian_inequality_range::GHR = range, + gradient_equality_range::HR = range, gradient_inequality_range::GR = range, + hessian_equality_range::HHR = range, hessian_inequality_range::GHR = range, ) where { - TM <: AbstractManifold, - O <: AbstractManifoldObjective, - GR <: Union{AbstractPowerRepresentation, Nothing}, - HR <: Union{AbstractPowerRepresentation, Nothing}, - GHR <: Union{AbstractPowerRepresentation, Nothing}, - HHR <: Union{AbstractPowerRepresentation, Nothing}, + TM <: AbstractManifold, O <: AbstractManifoldObjective, + GR <: Union{AbstractPowerRepresentation, Nothing}, HR <: Union{AbstractPowerRepresentation, Nothing}, + GHR <: Union{AbstractPowerRepresentation, Nothing}, HHR <: Union{AbstractPowerRepresentation, Nothing}, } return ConstrainedManoptProblem{TM, O, HR, GR, HHR, GHR}( - M, - gradient_equality_range, - gradient_inequality_range, - hessian_equality_range, - hessian_inequality_range, - objective, + M, gradient_equality_range, gradient_inequality_range, + hessian_equality_range, 
hessian_inequality_range, objective, ) end get_manifold(cmp::ConstrainedManoptProblem) = cmp.manifold get_objective(cmp::ConstrainedManoptProblem) = cmp.objective +function show(io::IO, cmp::ConstrainedManoptProblem) + print(io, "ConstrainedManoptProblem(", cmp.manifold, ", ", cmp.objective, ";") + print(io, " gradient_equality_range = ", cmp.grad_equality_range) + print(io, ", gradient_inequality_range = ", cmp.grad_inequality_range) + print(io, ", hessian_equality_range = ", cmp.hess_equality_range) + print(io, ", hessian_inequality_range = ", cmp.hess_inequality_range) + return print(io, ")") +end + +function status_summary(cmp::ConstrainedManoptProblem; context::Symbol = :default) + _is_inline(context) && return "A constrained optimization problem to minimize $(cmp.objective) on the manifold $(cmp.manifold)" + return """ + A constrained optimization problem for Manopt.jl + + ## Manifold + $(_in_str(repr(cmp.manifold); indent = 1, headers = 0)) + + ## Objective + $(_in_str(status_summary(cmp.objective, context = context); indent = 1)) + + ## Ranges + * gradient equality range: $(_MANOPT_INDENT)$(cmp.grad_equality_range) + * gradient inequality range: $(_MANOPT_INDENT)$(cmp.grad_inequality_range) + * hessian equality range: $(_MANOPT_INDENT)$(cmp.hess_equality_range) + * hessian inequality range: $(_MANOPT_INDENT)$(cmp.hess_inequality_range) + """ +end + + @doc """ LagrangianCost{CO,T} <: AbstractConstrainedFunctor{T} @@ -441,7 +443,7 @@ function (lc::LagrangianCost)(M, p) return c end function show(io::IO, lc::LagrangianCost) - return print(io, "LagrangianCost\n\twith μ=$(lc.μ), λ=$(lc.λ)") + return print(io, "LagrangianCost\n$(_MANOPT_INDENT)with μ=$(lc.μ), λ=$(lc.λ)") end @doc """ @@ -495,7 +497,7 @@ function (lg::LagrangianGradient)(M, X, p) return X end function show(io::IO, lg::LagrangianGradient) - return print(io, "LagrangianGradient\n\twith μ=$(lg.μ), λ=$(lg.λ)") + return print(io, "LagrangianGradient\n$(_MANOPT_INDENT)with μ=$(lg.μ), λ=$(lg.λ)") end 
@doc """ @@ -549,7 +551,7 @@ function (lH::LagrangianHessian)(M, Y, p, X) return Y end function show(io::IO, lh::LagrangianHessian) - return print(io, "LagrangianHessian\n\twith μ=$(lh.μ), λ=$(lh.λ)") + return print(io, "LagrangianHessian\n$(_MANOPT_INDENT)with μ=$(lh.μ), λ=$(lh.λ)") end @doc """ @@ -1055,10 +1057,3 @@ function get_feasibility_status( ) """ end - -function Base.show( - io::IO, ::ConstrainedManifoldObjective{E, V, Eq, IEq} - ) where {E <: AbstractEvaluationType, V, Eq, IEq} - # return print(io, "ConstrainedManifoldObjective{$E,$V,$Eq,$IEq}.") - return print(io, "ConstrainedManifoldObjective{$E}") -end diff --git a/src/plans/cost_plan.jl b/src/plans/cost_plan.jl index 1682b2e973..c561ee8049 100644 --- a/src/plans/cost_plan.jl +++ b/src/plans/cost_plan.jl @@ -58,3 +58,10 @@ get_cost_function(mco::AbstractManifoldCostObjective, recursive = false) = mco.c function get_cost_function(admo::AbstractDecoratedManifoldObjective, recursive = false) return get_cost_function(get_objective(admo, recursive)) end + +function show(io::IO, ::ManifoldCostObjective{E, TC}) where {E, TC} + return print(io, "ManifoldCostObjective(f)") +end +function status_summary(::ManifoldCostObjective{E, TC}; context::Symbol = :default) where {E, TC} + return "A cost function on a Riemannian manifold `f = (M,p) -> ℝ`." 
+end diff --git a/src/plans/count.jl b/src/plans/count.jl index e6b8c8ae9c..703440d612 100644 --- a/src/plans/count.jl +++ b/src/plans/count.jl @@ -48,8 +48,9 @@ struct ManifoldCountObjective{ counts::Dict{Symbol, I} objective::O end +ManifoldCountObjective(::AbstractManifold, o::AbstractManifoldObjective, c) = ManifoldCountObjective(o, c) function ManifoldCountObjective( - ::AbstractManifold, objective::O, counts::Dict{Symbol, I} + objective::O, counts::Dict{Symbol, I} ) where { E <: AbstractEvaluationType, I <: Union{<:Integer, AbstractVector{<:Integer}}, @@ -59,7 +60,7 @@ function ManifoldCountObjective( end # Store the undecorated type of the input is decorated function ManifoldCountObjective( - ::AbstractManifold, objective::O, counts::Dict{Symbol, I} + objective::O, counts::Dict{Symbol, I} ) where { E <: AbstractEvaluationType, I <: Union{<:Integer, AbstractVector{<:Integer}}, @@ -81,8 +82,7 @@ function ManifoldCountObjective( l = _get_counter_size(M, objective, symbol, p) push!(counts, Pair(symbol, l == 1 ? init : fill(init, l))) end - - return ManifoldCountObjective(M, objective, Dict(counts)) + return ManifoldCountObjective(objective, Dict(counts)) end function _get_counter_size( @@ -531,23 +531,27 @@ function objective_count_factory( return ManifoldCountObjective(M, o, counts) end -function status_summary(co::ManifoldCountObjective) +# change default – do not unwrap but call the one below +function status_summary(io::IO, co::ManifoldCountObjective; kwargs...) 
+ return print(io, status_summary(co; kwargs...)) +end +function status_summary(co::ManifoldCountObjective; context::Symbol = :default) + so = status_summary(co.objective; context = context) + if _is_inline(context) + return "$so (statistics: $(join([ ":$(c[1])=$(c[2])" for c in co.counts ], ", ")))" + end s = "## Statistics on function calls\n" - s2 = status_summary(co.objective) - (length(s2) > 0) && (s2 = "\n$(s2)") - length(co.counts) == 0 && return "$(s) No counters active\n$(s2)" + (length(so) > 0) && (so = "$(so)") + length(co.counts) == 0 && return "$(s) No counters active\n$(so)" longest_key_length = max(length.(["$c" for c in keys(co.counts)])...) count_strings = [ " * :$(rpad("$(c[1])", longest_key_length)) : $(c[2])" for c in co.counts ] - return "$(s)$(join(count_strings, "\n"))$s2" + return "$(so)\n\n$(s)$(join(count_strings, "\n"))" end - -function show(io::IO, co::ManifoldCountObjective) - return print(io, "$(status_summary(co))") +function status_summary(t::Tuple{<:ManifoldCountObjective, S}; context::Symbol = :default) where {S <: AbstractManoptSolverState} + return "$(status_summary(t[2], context = context))\n\n$(status_summary(t[1]; context = context))" end -function show( - io::IO, t::Tuple{<:ManifoldCountObjective, S} - ) where {S <: AbstractManoptSolverState} - return print(io, "$(t[2])\n\n$(t[1])") +function show(io::IO, co::ManifoldCountObjective) + return print(io, "ManifoldCountObjective($(repr(co.objective)), $(repr(co.counts)))") end diff --git a/src/plans/debug.jl b/src/plans/debug.jl index 65ad676292..b5104181bb 100644 --- a/src/plans/debug.jl +++ b/src/plans/debug.jl @@ -27,7 +27,7 @@ The original options can still be accessed using the [`get_state`](@ref) functio # Fields * `options`: the options that are extended by debug information -* `debugDictionary`: a `Dict{Symbol,DebugAction}` to keep track of Debug for different actions +* `debug_dictionary`: a `Dict{Symbol,DebugAction}` to keep track of Debug for different actions # 
Constructors DebugSolverState(o,dA) @@ -40,7 +40,7 @@ construct debug decorated options, where `dD` can be """ mutable struct DebugSolverState{S <: AbstractManoptSolverState} <: AbstractManoptSolverState state::S - debugDictionary::Dict{Symbol, <:DebugAction} + debug_dictionary::Dict{Symbol, <:DebugAction} function DebugSolverState{S}( st::S, dA::Dict{Symbol, <:DebugAction} ) where {S <: AbstractManoptSolverState} @@ -75,10 +75,10 @@ end """ set_parameter!(ams::DebugSolverState, ::Val{:Debug}, args...) -Set certain values specified by `args...` into the elements of the `debugDictionary` +Set certain values specified by `args...` into the elements of the `debug_dictionary` """ function set_parameter!(dss::DebugSolverState, ::Val{:Debug}, args...) - for d in values(dss.debugDictionary) + for d in values(dss.debug_dictionary) set_parameter!(d, args...) end return dss @@ -95,20 +95,21 @@ function get_parameter(dss::DebugSolverState, v::Val{T}, args...) where {T} return get_parameter(dss.state, v, args...) 
end -function status_summary(dst::DebugSolverState) - if length(dst.debugDictionary) > 0 +function status_summary(dst::DebugSolverState; context::Symbol = :default) + if length(dst.debug_dictionary) > 0 s = "" - for (k, v) in dst.debugDictionary - s = "$s\n :$k = $(status_summary(v))" + for (k, v) in dst.debug_dictionary + s = "$s\n :$k = $(status_summary(v; context = context))" end - return "$(dst.state)\n\n## Debug$s" - else # for length 1 the group is equivalent to the summary of the single state - return status_summary(dst.state) + return "$(status_summary(dst.state; context = context))\n\n## Debug$s" + else # if the dictionary has no entries, there is no actual debug in pretty print + return status_summary(dst.state; context = context) end end function show(io::IO, dst::DebugSolverState) - return print(io, status_summary(dst)) + return print(io, "DebugSolverState($(dst.state), $(dst.debug_dictionary))") end + dispatch_state_decorator(::DebugSolverState) = Val(true) # @@ -138,9 +139,12 @@ function (d::DebugGroup)(p::AbstractManoptProblem, st::AbstractManoptSolverState end return end -function status_summary(dg::DebugGroup) - str = join(["$(status_summary(di))" for di in dg.group], ", ") - return "[ $str ]" +function status_summary(dg::DebugGroup; context::Symbol = :default) + (context == :short) && return "[ " * join(["$(status_summary(di; context = context))" for di in dg.group], ", ") * " ]" + (context == :inline) && return "A DebugAction consisting of a group actions, " * join(["$(status_summary(di; context = context))" for di in dg.group], ", ", ", and ") + return """ + A DebugAction consisting of a group with the following elements + $(join(["* $(status_summary(di; context = context))" for di in dg.group], "\n"))""" end function show(io::IO, dg::DebugGroup) s = join(["$(di)" for di in dg.group], ", ") @@ -208,14 +212,16 @@ function show(io::IO, de::DebugEvery) "DebugEvery($(de.debug), $(de.every), $(de.always_update); 
activation_offset=$(de.activation_offset))", ) end -function status_summary(de::DebugEvery) +function status_summary(de::DebugEvery; context::Symbol = :default) s = "" - if de.debug isa DebugGroup - s = status_summary(de.debug)[3:(end - 2)] - else - s = "$(de.debug)" + if context == :short + s = status_summary(de.debug; context = context) + # If we have a group, remove outer brackets and spaces + (de.debug isa DebugGroup) && (s = s[3:(end - 2)]) + return "[$s, $(de.every)]" end - return "[$s, $(de.every)]" + (context == :inline) && return "The Debug $(status_summary(de.debug; context = context)) only printed every $(de.every) iteration" + return "A DebugAction wrapping the following DebugAction to only print it every $(de.every)th iteration.\n$(_in_str(status_summary(de.debug; context = context)))" end function set_parameter!(de::DebugEvery, e::Symbol, args...) set_parameter!(de, Val(e), args...) @@ -261,10 +267,12 @@ function (d::DebugCallback)( return nothing end function show(io::IO, dc::DebugCallback{CB}) where {CB} - return print(io, "DebugCallback containing a $(CB) callback $(dc.callback)") + return print(io, "DebugCallback($(dc.callback))") end -function status_summary(dc::DebugCallback) - return "$(dc.callback)" +function status_summary(dc::DebugCallback; context::Symbol = :default) + (context === :short) && return "$(dc.callback)" + # inline and default + return "A DebugAction with a callback that calls $(dc.callback)" end @doc """ @@ -291,15 +299,10 @@ mutable struct DebugChange{IR <: AbstractInverseRetractionMethod} <: DebugAction function DebugChange( M::AbstractManifold = DefaultManifold(); storage::Union{Nothing, StoreStateAction} = nothing, - io::IO = stdout, - prefix::String = "Last Change: ", - format::String = "$(prefix)%f", - inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method( - M - ), + io::IO = stdout, prefix::String = "Last Change: ", format::String = "$(prefix)%f", + 
inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method(M), ) irm = inverse_retraction_method - # Deprecated, remove in Manopt 0.5 if isnothing(storage) if M isa DefaultManifold storage = StoreStateAction(M; store_fields = [:Iterate]) @@ -316,9 +319,7 @@ function (d::DebugChange)(mp::AbstractManoptProblem, st::AbstractManoptSolverSta d.io, Printf.Format(d.format), distance( - M, - get_iterate(st), - get_storage(d.storage, PointStorageKey(:Iterate)), + M, get_iterate(st), get_storage(d.storage, PointStorageKey(:Iterate)), d.inverse_retraction_method, ), ) @@ -331,8 +332,11 @@ function show(io::IO, dc::DebugChange) "DebugChange(; format=\"$(escape_string(dc.format))\", inverse_retraction=$(dc.inverse_retraction_method))", ) end -status_summary(dc::DebugChange) = "(:Change, \"$(escape_string(dc.format))\")" - +function status_summary(dc::DebugChange; context::Symbol = :default) + (context === :short) && (return "(:Change, \"$(escape_string(dc.format))\")") + # Inline and Default + return "A DebugAction to print the change of the iterate from one iteration to the next with format “$(escape_string(dc.format))”" +end @doc """ DebugCost <: DebugAction @@ -366,7 +370,11 @@ end function show(io::IO, di::DebugCost) return print(io, "DebugCost(; format=\"$(escape_string(di.format))\", at_init=$(di.at_init))") end -status_summary(di::DebugCost) = "(:Cost, \"$(escape_string(di.format))\")" +function status_summary(di::DebugCost; context::Symbol = :default) + (context === :short) && return "(:Cost, \"$(escape_string(di.format))\")" + # inline & default + return "A DebugAction printing the current cost value" +end @doc """ DebugDivider <: DebugAction @@ -392,7 +400,11 @@ end function show(io::IO, di::DebugDivider) return print(io, "DebugDivider(; divider=\"$(escape_string(di.divider))\", at_init=$(di.at_init))") end -status_summary(di::DebugDivider) = "\"$(escape_string(di.divider))\"" +function status_summary(di::DebugDivider; 
context::Symbol = :default) + (context === :short) && (return "\"$(escape_string(di.divider))\"") + # inline and default + return "A DebugAction printing the String “$(escape_string(di.divider))” as a divider" +end @doc """ DebugEntry <: DebugAction @@ -425,6 +437,10 @@ end function show(io::IO, di::DebugEntry) return print(io, "DebugEntry(:$(di.field); format=\"$(escape_string(di.format))\", at_init=$(di.at_init))") end +function status_summary(di::DebugEntry; context::Symbol = :default) + (context === :short) && return "(:$(di.field), format=\"$(escape_string(di.format))\")" + return "A DebugAction to print the field :$(di.field) of the solver state with format \"$(escape_string(di.format))\"" +end """ DebugFeasibility <: DebugAction @@ -500,9 +516,11 @@ function show(io::IO, d::DebugFeasibility) sf = "[" * (join([e isa String ? "\"$e\"" : ":$e" for e in d.format], ", ")) * "]" return print(io, "DebugFeasibility($sf, at_init=$(d.at_init))") end -function status_summary(d::DebugFeasibility) +function status_summary(d::DebugFeasibility; context::Symbol = :default) sf = "[" * (join([e isa String ? 
"\"$e\"" : ":$e" for e in d.format], ", ")) * "]" - return "(:Feasibility, $sf)" + (context === :short) && (return "(:Feasibility, $sf)") + # inline and Default + return "A DebugAction printing Feasibility information of the current iterate, namely $sf" end @doc """ @@ -553,10 +571,14 @@ function (d::DebugIfEntry)(::AbstractManoptProblem, st::AbstractManoptSolverStat end return nothing end -function show(io::IO, di::DebugIfEntry) - return print(io, "DebugIfEntry(:$(di.field), $(di.check); type=:$(di.type), at_init=$(di.at_init))") +function show(io::IO, d::DebugIfEntry) + return print(io, "DebugIfEntry(:$(d.field), $(d.check); type=:$(d.type), at_init=$(d.at_init))") +end +function status_summary(d::DebugIfEntry; context::Symbol = :Default) + (context === :short) && (return repr(d)) + # Inline and default + return "A DebugAction printing the entry :$(d.field) of the solver state if $(d.check) of that field is true, in format “$(escape_string(d.msg))” as $(d.type)" end - @doc """ DebugEntryChange{T} <: DebugAction @@ -624,6 +646,10 @@ function show(io::IO, dec::DebugEntryChange) "DebugEntryChange(:$(dec.field), $(dec.distance); format=\"$(escape_string(dec.format))\")", ) end +function status_summary(d::DebugEntryChange; context::Symbol = :default) + (context === :short) && return repr(d) + return "A DebugAction that prints the change of the entry :$(d.field) of the solver state in format “$(escape_string(d.format))”" +end @doc """ DebugGradientChange() @@ -686,8 +712,10 @@ function show(io::IO, dgc::DebugGradientChange) "DebugGradientChange(; format=\"$(escape_string(dgc.format))\", vector_transport_method=$(dgc.vector_transport_method))", ) end -function status_summary(di::DebugGradientChange) - return "(:GradientChange, \"$(escape_string(di.format))\")" +function status_summary(di::DebugGradientChange; context::Symbol = :Default) + (context === :short) && (return "(:GradientChange, \"$(escape_string(di.format))\")") + # Inline and default + return "A DebugAction 
printing the change of the gradient with format “$(escape_string(di.format))”" end @doc """ @@ -727,7 +755,11 @@ end function show(io::IO, di::DebugIterate) return print(io, "DebugIterate(; format=\"$(escape_string(di.format))\", at_init=$(di.at_init))") end -status_summary(di::DebugIterate) = "(:Iterate, \"$(escape_string(di.format))\")" +function status_summary(di::DebugIterate; context::Symbol = :default) + (context === :short) && (return "(:Iterate, \"$(escape_string(di.format))\")") + # Inline and default + return "A DebugAction printing the current iterate in format “$(escape_string(di.format))”" +end @doc """ DebugIteration <: DebugAction @@ -756,8 +788,11 @@ end function show(io::IO, di::DebugIteration) return print(io, "DebugIteration(; format=\"$(escape_string(di.format))\")") end -status_summary(di::DebugIteration) = "(:Iteration, \"$(escape_string(di.format))\")" - +function status_summary(di::DebugIteration; context::Symbol = :default) + (context === :short) && return "(:Iteration, \"$(escape_string(di.format))\")" + # Inline and default + return "A DebugAction that prints the current iteration number in format “$(escape_string(di.format))”" +end @doc """ DebugMessages <: DebugAction @@ -806,12 +841,18 @@ function (d::DebugMessages)(::AbstractManoptProblem, st::AbstractManoptSolverSta return nothing end show(io::IO, d::DebugMessages) = print(io, "DebugMessages(:$(d.mode), :$(d.status))") -function status_summary(d::DebugMessages) - (d.mode == :Warning) && return "(:WarningMessages, :$(d.status))" - (d.mode == :Error) && return "(:ErrorMessages, :$(d.status))" - # default - # (d.mode == :Info) && return "(:InfoMessages, $(d.status)" - return "(:Messages, :$(d.status))" +function status_summary(d::DebugMessages; context::Symbol = :default) + if context === :short + s = ":Messages" + (d.mode == :Warning) && (s = ":WarningMessages") + (d.mode == :Error) && (s = ":ErrorMessages") + (d.mode == :Info) && (s = ":InfoMessages") + return d.status === :No ? 
s : "($s, :$(d.status))" + end + # Inline and default + m = "a $(d.mode == :Warning ? "warning " : (d.mode == :Error ? "error " : ""))message" + s = d.status === :No ? " (inactive)" : (d.status === :Once ? " once" : "") + return "A DebugAction printing messages collected during the last iteration as $(m)$(s)." end @doc """ @@ -845,8 +886,10 @@ function show(io::IO, c::DebugStoppingCriterion) s = length(c.prefix) > 0 ? "\"$(c.prefix)\"" : "" return print(io, "DebugStoppingCriterion($s)") end -function status_summary(c::DebugStoppingCriterion) - return length(c.prefix) == 0 ? ":Stop" : "(:Stop, \"$(c.prefix)\")" +function status_summary(c::DebugStoppingCriterion; context::Symbol = :default) + (context === :short) && (return length(c.prefix) == 0 ? ":Stop" : "(:Stop, \"$(c.prefix)\")") + # Inline and default + return "A DebugAction printing the reason why a solver has stopped." end @doc """ @@ -890,8 +933,18 @@ end function show(io::IO, dwa::DebugWhenActive) return print(io, "DebugWhenActive($(dwa.debug), $(dwa.active), $(dwa.always_update))") end -function status_summary(dwa::DebugWhenActive) - return repr(dwa) +function status_summary(dwa::DebugWhenActive; context::Symbol = :default) + (context === :short) && (return repr(dwa)) + (context === :inline) && return "A DebugAction only printing its internal criterion ($(status_summary(dwa.debug; context = context))) when active (currently: $(dwa.active))" + return """ + A DebugAction only printing its internal DebugAction when activated + + ## DebugAction + $(status_summary(dwa.debug; context = context))$(dwa.always_update ? "\nwhich is always updated for negative iteration numbers still." : "") + + ## Current activity + $(dwa.active ? "active" : "inactive") – use `set_parameter!(debug_action, :Activity, $(!dwa.active))` to toggle + """ end function set_parameter!(dwa::DebugWhenActive, v::Val, args...) set_parameter!(dwa.debug, v, args...) 
@@ -956,11 +1009,15 @@ function show(io::IO, di::DebugTime) io, "DebugTime(; format=\"$(escape_string(di.format))\", mode=:$(di.mode))" ) end -function status_summary(di::DebugTime) - if di.mode === :iterative - return "(:IterativeTime, \"$(escape_string(di.format))\")" +function status_summary(di::DebugTime; context::Symbol = :default) + if context == :short + if di.mode === :iterative + return "(:IterativeTime, \"$(escape_string(di.format))\")" + end + return "(:Time, \"$(escape_string(di.format))\")" end - return "(:Time, \"$(escape_string(di.format))\")" + # Default and inline + return "A DebugAction to print time per step $(di.mode === :iterative ? "iteratively" : "cumulatively")" end """ reset!(d::DebugTime) @@ -1036,8 +1093,14 @@ function (d::DebugWarnIfCostIncreases)( end return nothing end -function show(io::IO, di::DebugWarnIfCostIncreases) - return print(io, "DebugWarnIfCostIncreases(; tol=\"$(di.tol)\")") +function show(io::IO, d::DebugWarnIfCostIncreases) + m = (d.status === :No ? "" : ":$(d.status)") + return print(io, "DebugWarnIfCostIncreases($(m); tol=\"$(d.tol)\")") +end +function status_summary(d::DebugWarnIfCostIncreases; context::Symbol = :default) + (context === :short) && return repr(d) + m = (d.status === :Once) ? "once" : (d.status === :No ? "(inactive)" : "") + return "A DebugAction warning if the cost increases in an iteration $m." 
end @doc """ @@ -1076,8 +1139,15 @@ function (d::DebugWarnIfCostNotFinite)( end return nothing end -show(io::IO, ::DebugWarnIfCostNotFinite) = print(io, "DebugWarnIfCostNotFinite()") -status_summary(::DebugWarnIfCostNotFinite) = ":WarnCost" +show(io::IO, d::DebugWarnIfCostNotFinite) = print(io, "DebugWarnIfCostNotFinite(:$(d.status))") +function status_summary(d::DebugWarnIfCostNotFinite; context::Symbol = :default) + (context == :short) && (return ":WarnCost") + # Default and inline + s = "" + (d.status === :Once) && (s = " It will only warn once.") + (d.status === :No) && (s = " It either has warned already or was deactivated by setting its status to `:No`.") + return "A DebugAction to issue a warning when the cost is no longer finite.$s" +end @doc """ DebugWarnIfFieldNotFinite <: DebugAction @@ -1135,7 +1205,14 @@ end function show(io::IO, dw::DebugWarnIfFieldNotFinite) return print(io, "DebugWarnIfFieldNotFinite(:$(dw.field), :$(dw.status))") end - +function status_summary(dw::DebugWarnIfFieldNotFinite; context::Symbol = :default) + (context == :short) && (return repr(dw)) + # Default and inline + s = "" + (dw.status === :Once) && (s = " It will only warn once.") + (dw.status === :No) && (s = " It either has warned already or was deactivated by setting its status to `:No`.") + return "A DebugAction to warn if the field “:$(dw.field)” is or has entries that are not finite.$s" +end @doc """ DebugWarnIfGradientNormTooLarge{T} <: DebugAction @@ -1175,7 +1252,7 @@ function (d::DebugWarnIfGradientNormTooLarge)( p_inj = d.factor * max_stepsize(M, p) if Xn > p_inj @warn """At iteration #$k - the gradient norm ($Xn) is larger that $(d.factor) times the injectivity radius $(p_inj) at the current iterate. + the gradient norm ($Xn) is larger than $(d.factor) times the injectivity radius $(p_inj) at the current iterate. """ if d.status === :Once @warn "Further warnings will be suppressed, use DebugWarnIfGradientNormTooLarge($(d.factor), :Always) to get all warnings." 
@@ -1186,7 +1263,14 @@ function (d::DebugWarnIfGradientNormTooLarge)( return nothing end function show(io::IO, d::DebugWarnIfGradientNormTooLarge) - return print(io, "DebugWarnIfGradientNormTooLarge($(d.factor), :$(d.status))") + # only print status if active + m = (d.status === :No ? "" : ", :$(d.status)") + return print(io, "DebugWarnIfGradientNormTooLarge($(d.factor)$(m))") +end +function status_summary(d::DebugWarnIfGradientNormTooLarge; context::Symbol = :default) + (context === :short) && return repr(d) + m = (d.status === :Once) ? " once" : (d.status === :No ? " (inactive)" : "") + return "A DebugAction warning if the gradient norm gets larger than the maximal stepsize$m." end @doc """ @@ -1213,9 +1297,6 @@ mutable struct DebugWarnIfStepsizeCollapsed{T} <: DebugAction return new{T}(warn, tol) end end -function show(io::IO, di::DebugWarnIfStepsizeCollapsed) - return print(io, "DebugWarnIfStepsizeCollapsed($(di.stop_when_stepsize_less), :$(di.status))") -end function (d::DebugWarnIfStepsizeCollapsed)( amp::AbstractManoptProblem, st::AbstractManoptSolverState, k::Int ) @@ -1231,7 +1312,15 @@ function (d::DebugWarnIfStepsizeCollapsed)( end return nothing end - +function show(io::IO, d::DebugWarnIfStepsizeCollapsed) + m = (d.status === :No ? "" : ", :$(d.status)") + return print(io, "DebugWarnIfStepsizeCollapsed($(d.stop_when_stepsize_less)$(m))") +end +function status_summary(d::DebugWarnIfStepsizeCollapsed; context::Symbol = :default) + (context === :short) && return repr(d) + m = (d.status === :Once) ? " once" : (d.status === :No ? " (inactive)" : "") + return "A DebugAction warning if the step size collapses (below $(d.stop_when_stepsize_less))$m." +end # # Convenience constructors using Symbols # @@ -1396,13 +1485,14 @@ Note that the Shortcut symbols should all start with a capital letter. 
* `:Iterate` creates a [`DebugIterate`](@ref) * `:Iteration` creates a [`DebugIteration`](@ref) * `:IterativeTime` creates a [`DebugTime`](@ref)`(:Iterative)` +* `:ProxParameter` creates a [`DebugProximalParameter`](@ref)`()` * `:Stepsize` creates a [`DebugStepsize`](@ref) * `:Stop` creates a [`StoppingCriterion`](@ref)`()` +* `:Time` creates a [`DebugTime`](@ref) * `:WarnStepsize` creates a [`DebugWarnIfStepsizeCollapsed`](@ref) * `:WarnBundle` creates a [`DebugWarnIfLagrangeMultiplierIncreases`](@ref) * `:WarnCost` creates a [`DebugWarnIfCostNotFinite`](@ref) * `:WarnGradient` creates a [`DebugWarnIfFieldNotFinite`](@ref) for the `::Gradient`. -* `:Time` creates a [`DebugTime`](@ref) * `:WarningMessages` creates a [`DebugMessages`](@ref)`(:Warning)` * `:InfoMessages` creates a [`DebugMessages`](@ref)`(:Info)` * `:ErrorMessages` creates a [`DebugMessages`](@ref)`(:Error)` @@ -1419,6 +1509,7 @@ function DebugActionFactory(d::Symbol) (d == :Iterate) && return DebugIterate() (d == :Iteration) && return DebugIteration() (d == :Feasibility) && return DebugFeasibility() + (d == :ProxParameter) && return DebugProximalParameter() (d == :Stepsize) && return DebugStepsize() (d == :Stop) && return DebugStoppingCriterion() (d == :WarnStepsize) && return DebugWarnIfStepsizeCollapsed() @@ -1451,6 +1542,7 @@ Note that the Shortcut symbols `t[1]` should all start with a capital letter. 
* `:GradientNorm` creates a [`DebugGradientNorm`](@ref) * `:Iterate` creates a [`DebugIterate`](@ref) * `:Iteration` creates a [`DebugIteration`](@ref) +* `:ProxParameter` creates a [`DebugProximalParameter`](@ref) * `:Stepsize` creates a [`DebugStepsize`](@ref) * `:Stop` creates a [`DebugStoppingCriterion`](@ref) * `:Time` creates a [`DebugTime`](@ref) @@ -1468,11 +1560,12 @@ function DebugActionFactory(t::Tuple{Symbol, Any}) (t[1] == :Iteration) && return DebugIteration(; format = t[2]) (t[1] == :Iterate) && return DebugIterate(; format = t[2]) (t[1] == :IterativeTime) && return DebugTime(; mode = :Iterative, format = t[2]) + (t[1] == :ProxParameter) && return DebugProximalParameter(; format = t[2]) (t[1] == :Stepsize) && return DebugStepsize(; format = t[2]) (t[1] == :Stop) && return DebugStoppingCriterion(t[2]) (t[1] == :Time) && return DebugTime(; format = t[2]) ((t[1] == :Messages) || (t[1] == :InfoMessages)) && return DebugMessages(:Info, t[2]) (t[1] == :WarningMessages) && return DebugMessages(:Warning, t[2]) - (t[1] == :ErrorMessages) && return DebugMessages(:error, t[2]) + (t[1] == :ErrorMessages) && return DebugMessages(:Error, t[2]) return DebugEntry(t[1]; format = t[2]) end diff --git a/src/plans/difference_of_convex_plan.jl b/src/plans/difference_of_convex_plan.jl index 12c3fc42fd..8cf55f664c 100644 --- a/src/plans/difference_of_convex_plan.jl +++ b/src/plans/difference_of_convex_plan.jl @@ -15,14 +15,22 @@ Furthermore the subdifferential ``∂h`` of ``h`` is required. # Fields * `cost`: an implementation of ``f(p) = g(p)-h(p)`` as a function `f(M,p)`. +* `gradient!!` a gradient of the smooth component `g` * `∂h!!`: a deterministic version of ``∂h: $(_math(:Manifold))→ T$(_math(:Manifold)))``, in the sense that calling `∂h(M, p)` returns a subgradient of ``h`` at `p` and if there is more than one, it returns a deterministic choice. 
-Note that the subdifferential might be given in two possible signatures +Note that the gradient and the subdifferential might be given in two possible signatures -* `∂h(M,p)` which does an [`AllocatingEvaluation`](@ref) -* `∂h!(M, X, p)` which does an [`InplaceEvaluation`](@ref) in place of `X`. +* `(M,p) -> X` which does an [`AllocatingEvaluation`](@ref) +* `(M, X, p) -> X` which does an [`InplaceEvaluation`](@ref) in place of `X`. + +# Constructor + + ManifoldDifferenceOfConvexObjective(cost, ∂h; gradient = nothing, evaluation = AllocatingEvaluation()) + +Create the difference of convex objective given a `cost` function and the subdifferential `∂h` of the non-smooth part +The `gradient` of the smooth part and the `evaluation = ` type are keywords. """ struct ManifoldDifferenceOfConvexObjective{E, F, G, S} <: AbstractManifoldFirstOrderObjective{E, Tuple{F, G}} @@ -52,10 +60,7 @@ function get_gradient( return doco.gradient!!(M, X, p) end function get_gradient!( - M::AbstractManifold, - X, - doco::ManifoldDifferenceOfConvexObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, X, doco::ManifoldDifferenceOfConvexObjective{AllocatingEvaluation}, p, ) return copyto!(M, X, p, doco.gradient!!(M, p)) end @@ -104,10 +109,7 @@ function get_subtrahend_gradient( end function get_subtrahend_gradient!( - M::AbstractManifold, - X, - doco::ManifoldDifferenceOfConvexObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, X, doco::ManifoldDifferenceOfConvexObjective{AllocatingEvaluation}, p, ) return copyto!(M, X, p, doco.∂h!!(M, p)) end @@ -122,6 +124,30 @@ function get_subtrahend_gradient!( return get_subtrahend_gradient!(M, X, get_objective(admo, false), p) end +function Base.show(io::IO, doco::ManifoldDifferenceOfConvexObjective{E}) where {E} + print(io, "ManifoldDifferenceOfConvexObjective("); print(io, doco.cost); print(io, ", ") + print(io, doco.∂h!!); print(io, "; ") + print(io, _to_kw(E)) + if !isnothing(doco.gradient!!) 
+ print(io, ", gradient = ") + print(io, doco.gradient!!) + end + return print(io, ")") +end +function status_summary(doco::ManifoldDifferenceOfConvexObjective; context::Symbol = :default) + (context === :short) && (return repr(doco)) + gs = isnothing(doco.gradient!!) ? "" : "including a gradient of the smooth component" + (context === :inline) && (return "A difference of convex objective on a manifold $gs") + gsd = isnothing(doco.gradient!!) ? "" : "\n* gradient of `g`: $(_MANOPT_INDENT)$(doco.gradient!!)" + return """ + A difference of convex objective on a manifold. + + ## Functions + * cost `f = g - h`: $(_MANOPT_INDENT)$(doco.cost)$(gsd) + * ∂h: $(_MANOPT_INDENT)$(doco.∂h!!)""" +end + + @doc """ LinearizedDCCost @@ -263,10 +289,13 @@ as allocating or in-place. # Constructor - ManifoldDifferenceOfConvexProximalObjective(gradh; cost=nothing, gradient=nothing) + ManifoldDifferenceOfConvexProximalObjective( + grad_h; + cost = nothing, gradient = nothing, evaluation = AllocatingEvaluation() + ) an note that neither cost nor gradient are required for the algorithm, -just for eventual debug or stopping criteria. +just for eventual debug or recording functionality or for the stopping criterion. 
""" struct ManifoldDifferenceOfConvexProximalObjective{E <: AbstractEvaluationType, GH, F, G} <: AbstractManifoldFirstOrderObjective{E, Tuple{F, G}} @@ -275,42 +304,30 @@ struct ManifoldDifferenceOfConvexProximalObjective{E <: AbstractEvaluationType, grad_h!!::GH function ManifoldDifferenceOfConvexProximalObjective( grad_h::THG; - cost::TC = nothing, - gradient::TG = nothing, - evaluation::ET = AllocatingEvaluation(), + cost::TC = nothing, gradient::TG = nothing, evaluation::ET = AllocatingEvaluation(), ) where {ET <: AbstractEvaluationType, TC, TG, THG} return new{ET, THG, TC, TG}(cost, gradient, grad_h) end end function get_gradient( - M::AbstractManifold, - dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, p, ) return dcpo.gradient!!(M, p) end function get_gradient( - M::AbstractManifold, - dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, - p, + M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, p, ) X = zero_vector(M, p) return dcpo.gradient!!(M, X, p) end function get_gradient!( - M::AbstractManifold, - X, - dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, p, ) return copyto!(M, X, p, dcpo.gradient!!(M, p)) end function get_gradient!( - M::AbstractManifold, - X, - dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, - p, + M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, p, ) return dcpo.gradient!!(M, X, p) end @@ -333,39 +350,61 @@ get_subtrahend_gradient( ) function get_subtrahend_gradient( - M::AbstractManifold, - dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, p, ) 
return dcpo.grad_h!!(M, p) end function get_subtrahend_gradient( - M::AbstractManifold, - dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, - p, + M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, p, ) X = zero_vector(M, p) dcpo.grad_h!!(M, X, p) return X end function get_subtrahend_gradient!( - M::AbstractManifold, - X, - dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective{AllocatingEvaluation}, p, ) return copyto!(M, X, p, dcpo.grad_h!!(M, p)) end function get_subtrahend_gradient!( - M::AbstractManifold, - X, - dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, - p, + M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective{InplaceEvaluation}, p, ) dcpo.grad_h!!(M, X, p) return X end + +function Base.show(io::IO, dcpo::ManifoldDifferenceOfConvexProximalObjective{E}) where {E} + print(io, "ManifoldDifferenceOfConvexProximalObjective(") + print(io, dcpo.grad_h!!); print(io, "; ") + if !isnothing(dcpo.cost) + print(io, "cost = ") + print(io, dcpo.cost) + print(io, ", ") + end + print(io, _to_kw(E)) + if !isnothing(dcpo.gradient!!) + print(io, ", gradient = ") + print(io, dcpo.gradient!!) + end + return print(io, ")") +end +function status_summary(dcpo::ManifoldDifferenceOfConvexProximalObjective; context::Symbol = :default) + (context === :short) && (return repr(dcpo)) + cs = isnothing(dcpo.cost) ? "" : "an overall cost" + gs = isnothing(dcpo.gradient!!) ? "" : "an overall gradient" + cgs = length(cs) * length(gs) > 0 ? "$cs and $gs" : "$cs$gs" + s = length(cgs) == 0 ? "" : "including $cgs" + (context === :inline) && (return "A difference of convex proximal objective on a manifold $s") + csd = isnothing(dcpo.cost) ? "" : "\n* cost `f = g - h`:$(_MANOPT_INDENT)$(dcpo.cost)" + gsd = isnothing(dcpo.gradient!!) ? 
"" : "\n* gradient of `f` :$(_MANOPT_INDENT)$(dcpo.gradient!!)" + return """ + A difference of convex proximal objective on a manifold. + + ## Functions$(csd)$(gsd) + * gradient of `h` :$(_MANOPT_INDENT)$(dcpo.grad_h!!)""" +end + @doc """ ProximalDCCost @@ -436,8 +475,8 @@ Both interim values can be set using `set_parameter!(::ProximalDCGrad, ::Val{:p}, p)` and `set_parameter!(::ProximalDCGrad, ::Val{:λ}, λ)`, respectively. - # Constructor + ProximalDCGrad(grad_g, pk, λ; evaluation=AllocatingEvaluation()) Where you specify whether `grad_g` is [`AllocatingEvaluation`](@ref) or [`InplaceEvaluation`](@ref), diff --git a/src/plans/embedded_objective.jl b/src/plans/embedded_objective.jl index ac72af000f..ce519259d7 100644 --- a/src/plans/embedded_objective.jl +++ b/src/plans/embedded_objective.jl @@ -371,7 +371,18 @@ function get_grad_inequality_constraint!( Y .= [riemannian_gradient(M, p, X) for X in Z] return Y end +function show(io::IO, emo::EmbeddedManifoldObjective) + return print(io, "EmbeddedManifoldObjective($(emo.objective), $(emo.p), $(emo.X))") +end +function status_summary(emo::EmbeddedManifoldObjective{P, T}; context::Symbol = :default) where {P, T} + (context === :short) && return repr(emo) + (context === :inline) && return "An embedded objective of $(status_summary(emo.objective; context = context))" + p_str = !(ismissing(emo.p)) ? "* for a point of type $P" : "" + X_str = !(ismissing(emo.X)) ? "* for a tangent vector of type $T" : "" + pX_str = (length(p_str) + length(X_str) > 0) ? "\n\n## Temporary memory (in the embedding)\n$(p_str)$(length(p_str) > 0 ? 
"\n" : "")$(X_str)" : "" + return """ + An embedded objective -function show(io::IO, emo::EmbeddedManifoldObjective{P, T}) where {P, T} - return print(io, "EmbeddedManifoldObjective{$P,$T} of an $(emo.objective)") + ## Objective + $(_in_str(status_summary(emo.objective, context = context); indent = 1, headers = 1))$(pX_str)""" end diff --git a/src/plans/first_order_plan.jl b/src/plans/first_order_plan.jl index 578736bd38..9117bc1873 100644 --- a/src/plans/first_order_plan.jl +++ b/src/plans/first_order_plan.jl @@ -319,13 +319,8 @@ function get_differential( return real(inner(M, p, gradient, X)) end function get_differential( - M::AbstractManifold, - mfo::ManifoldFirstOrderObjective, - p, - X; - gradient = nothing, - evaluated::Bool = false, - kwargs..., + M::AbstractManifold, mfo::ManifoldFirstOrderObjective, p, X; + gradient = nothing, evaluated::Bool = false, kwargs..., ) # If we have a differential – evaluate that haskey(mfo.functions, :differential) && (return mfo.functions[:differential](M, p, X)) @@ -375,19 +370,14 @@ end # (a) alloc function get_gradient( - M::AbstractManifold, - mfo::ManifoldFirstOrderObjective{AllocatingEvaluation, <:NamedTuple}, - p, + M::AbstractManifold, mfo::ManifoldFirstOrderObjective{AllocatingEvaluation, <:NamedTuple}, p, ) haskey(mfo.functions, :gradient) && (return mfo.functions[:gradient](M, p)) haskey(mfo.functions, :costgradient) && (return mfo.functions[:costgradient](M, p)[2]) return error("$mfo does not seem to provide a gradient") end function get_gradient!( - M::AbstractManifold, - X, - mfo::ManifoldFirstOrderObjective{AllocatingEvaluation, <:NamedTuple}, - p, + M::AbstractManifold, X, mfo::ManifoldFirstOrderObjective{AllocatingEvaluation, <:NamedTuple}, p, ) haskey(mfo.functions, :gradient) && (return copyto!(M, X, p, mfo.functions[:gradient](M, p))) @@ -403,14 +393,10 @@ function get_gradient( return get_gradient!(M, X, mfo, p) end function get_gradient!( - M::AbstractManifold, - X, - 
mfo::ManifoldFirstOrderObjective{InplaceEvaluation, <:NamedTuple}, - p, + M::AbstractManifold, X, mfo::ManifoldFirstOrderObjective{InplaceEvaluation, <:NamedTuple}, p, ) haskey(mfo.functions, :gradient) && (return mfo.functions[:gradient](M, X, p)) - haskey(mfo.functions, :costgradient) && - (return mfo.functions[:costgradient](M, X, p)[2]) + haskey(mfo.functions, :costgradient) && (return mfo.functions[:costgradient](M, X, p)[2]) return error("$mfo does not seem to provide a gradient") end @@ -524,8 +510,16 @@ function get_cost_and_gradient!( return error("$mfo seems to either have no access to a cost or a gradient") end -function show(io::IO, ::ManifoldFirstOrderObjective{E, FG}) where {E, FG} - return print(io, "ManifoldFirstOrderObjective{$E, $FG}") +function status_summary(mfo::ManifoldFirstOrderObjective; context::Symbol = :default) + _is_inline(context) && (return repr(mfo)) + return "A first order objective with $(length(mfo.functions)) provided functions.\n\n" * join([ "* $k:$(_MANOPT_INDENT) $(v)" for (k, v) in zip(keys(mfo.functions), mfo.functions) ], "\n") +end +function Base.show(io::IO, mfo::ManifoldFirstOrderObjective{E}) where {E} + print(io, "ManifoldFirstOrderObjective(; ") + print(io, join([ "$k = $v" for (k, v) in zip(keys(mfo.functions), mfo.functions)], ", ")) + print(io, ", ") + print(io, _to_kw(E)) + return print(io, ")") end # @@ -623,6 +617,11 @@ this parameter-free instantiation to later. 
""" struct IdentityUpdateRule <: DirectionUpdateRule end Gradient() = ManifoldDefaultsFactory(Manopt.IdentityUpdateRule; requires_manifold = false) +Base.show(io::IO, agr::IdentityUpdateRule) = print(io, "IdentityUpdateRule()") +function status_summary(ir::IdentityUpdateRule; context::Symbol = :default) + (context === :short) && return repr(ir) + return "A gradient processor that evaluates the gradient" +end """ MomentumGradientRule <: DirectionUpdateRule @@ -641,7 +640,6 @@ $(_fields(:X; name = "X_old")) # Constructors - MomentumGradientRule(M::AbstractManifold; kwargs...) MomentumGradientRule(M::AbstractManifold, p; kwargs...) @@ -665,6 +663,11 @@ mutable struct MomentumGradientRule{ direction::D vector_transport_method::VTM X_old::T + function MomentumGradientRule(; + momentum::R, p_old::P, direction::D, vector_transport_method::VTM, X_old::T + ) where {P, T, D <: DirectionUpdateRule, R <: Real, VTM <: AbstractVectorTransportMethod} + return new{P, T, D, R, VTM}(momentum, p_old, direction, vector_transport_method, X_old) + end end function MomentumGradientRule(M::AbstractManifold, p; kwargs...) return MomentumGradientRule(M; p = copy(M, p), kwargs...) 
@@ -678,8 +681,8 @@ function MomentumGradientRule( momentum::F = 0.2, ) where {P, Q, F <: Real, VTM <: AbstractVectorTransportMethod} dir = _produce_type(direction, M) - return MomentumGradientRule{P, Q, typeof(dir), F, VTM}( - momentum, p, dir, vector_transport_method, X + return MomentumGradientRule(; + momentum = momentum, p_old = p, direction = dir, vector_transport_method = vector_transport_method, X_old = X, ) end function (mg::MomentumGradientRule)( @@ -695,6 +698,24 @@ function (mg::MomentumGradientRule)( copyto!(M, mg.p_old, p) return step, -mg.X_old end +function Base.show(io::IO, mgr::MomentumGradientRule) + print(io, "MomentumGradientRule(; momentum = ", mgr.momentum) + print(io, ", p_old = ", mgr.p_old, ", X_old = ", mgr.X_old) + print(io, ", direction = ", mgr.direction) + print(io, ", vector_transport_method = ", mgr.vector_transport_method) + return print(io, ")") +end +function status_summary(mgr::MomentumGradientRule; context::Symbol = :default) + (context === :short) && return repr(mgr) + (context === :inline) && return "A momentum gradient direction processor with m=$(mgr.momentum) using $(mgr.vector_transport_method)" + return """ + Momentum Gradient Rule + + ## Parameters + * direction: $(_MANOPT_INDENT)$(status_summary(mgr.direction; context = context)) + * momentum: $(_MANOPT_INDENT)$(mgr.momentum) + * vector transport method:$(_MANOPT_INDENT)$(mgr.vector_transport_method)""" +end """ MomentumGradient() @@ -759,12 +780,17 @@ Add average to a gradient problem, where $(_kwargs(:vector_transport_method)) """ mutable struct AverageGradientRule{ - P, T, D <: DirectionUpdateRule, VTM <: AbstractVectorTransportMethod, + P, T, D <: DirectionUpdateRule, VTM <: AbstractVectorTransportMethod, A <: AbstractVector{<:T}, } <: DirectionUpdateRule - gradients::AbstractVector{T} + gradients::A last_iterate::P direction::D vector_transport_method::VTM + function AverageGradientRule(; + gradients::A, last_iterate::P, direction::D, vector_transport_method::VTM + ) 
where {P, A <: AbstractVector, D <: DirectionUpdateRule, VTM <: AbstractVectorTransportMethod} + return new{P, eltype(gradients), D, VTM, A}(gradients, last_iterate, direction, vector_transport_method) + end end function AverageGradientRule(M::AbstractManifold, p; kwargs...) return AverageGradientRule(M; p = copy(M, p), kwargs...) @@ -778,8 +804,9 @@ function AverageGradientRule( vector_transport_method::VTM = default_vector_transport_method(M, typeof(p)), ) where {P, VTM} dir = _produce_type(direction, M) - return AverageGradientRule{P, eltype(gradients), typeof(dir), VTM}( - gradients, copy(M, p), dir, vector_transport_method + return AverageGradientRule(; + gradients = gradients, last_iterate = copy(M, p), direction = dir, + vector_transport_method = vector_transport_method, ) end function (a::AverageGradientRule)( @@ -797,7 +824,23 @@ function (a::AverageGradientRule)( copyto!(M, a.last_iterate, p) return 1.0, 1 / length(a.gradients) .* sum(a.gradients) end +function Base.show(io::IO, agr::AverageGradientRule) + print(io, "AverageGradientRule(; gradients = ", agr.gradients) + print(io, ", last_iterate = ", agr.last_iterate, ", direction = ", agr.direction) + print(io, ", vector_transport_method = ", agr.vector_transport_method) + return print(io, ")") +end +function status_summary(agr::AverageGradientRule; context::Symbol = :default) + (context === :short) && return repr(agr) + (context === :inline) && return "An average gradient direction processor with n=$(length(agr.gradients)) gradients to average over using $(agr.vector_transport_method)" + return """ + Average Gradient Rule + ## Parameters + * direction: $(_MANOPT_INDENT)$(status_summary(agr.direction; context = context)) + * number of gradients: $(_MANOPT_INDENT)$(length(agr.gradients)) + * vector transport method:$(_MANOPT_INDENT)$(agr.vector_transport_method)""" +end """ AverageGradient(; kwargs...) AverageGradient(M::AbstractManifold; kwargs...) 
@@ -854,29 +897,30 @@ $(_kwargs(:inverse_retraction_method)) [`Nesterov`](@ref) """ -mutable struct NesterovRule{P, R <: Real} <: DirectionUpdateRule +mutable struct NesterovRule{P, R <: Real, IRM <: AbstractInverseRetractionMethod, RM <: AbstractRetractionMethod, F} <: DirectionUpdateRule γ::R μ::R v::P - shrinkage::Function - inverse_retraction_method::AbstractInverseRetractionMethod + shrinkage::F + inverse_retraction_method::IRM + retraction_method::RM + function NesterovRule(; + γ::R, μ::R, v::P, shrinkage::F, inverse_retraction_method::IRM, retraction_method::RM + ) where {P, R <: Real, IRM <: AbstractInverseRetractionMethod, RM <: AbstractRetractionMethod, F} + return new{P, R, IRM, RM, F}(γ, μ, v, shrinkage, inverse_retraction_method, retraction_method) + end end function NesterovRule(M::AbstractManifold, p; kwargs...) return NesterovRule(M; p = copy(M, p), kwargs...) end function NesterovRule( - M::AbstractManifold; - p::P = rand(M), - γ::T = 0.001, - μ::T = 0.9, - shrinkage::Function = i -> 0.8, - inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method( - M, typeof(p) - ), + M::AbstractManifold; p::P = rand(M), γ::T = 0.001, μ::T = 0.9, shrinkage::Function = i -> 0.8, + inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method(M, typeof(p)), + retraction_method::AbstractRetractionMethod = default_retraction_method(M, typeof(p)), ) where {P, T} p_ = _ensure_mutating_variable(p) - return NesterovRule{typeof(p_), T}( - γ, μ, copy(M, p_), shrinkage, inverse_retraction_method + return NesterovRule( + γ = γ, μ = μ, v = copy(M, p_), shrinkage = shrinkage, inverse_retraction_method = inverse_retraction_method, retraction_method = retraction_method, ) end function (n::NesterovRule)(mp::AbstractManoptProblem, s::AbstractGradientSolverState, k) @@ -886,20 +930,37 @@ function (n::NesterovRule)(mp::AbstractManoptProblem, s::AbstractGradientSolverS α = (h * (n.γ - n.μ) + sqrt(h^2 * (n.γ - 
n.μ)^2 + 4 * h * n.γ)) / 2 γbar = (1 - α) * n.γ + α * n.μ y = retract( - M, - p, - ((α * n.γ) / (n.γ + α * n.μ)) * - inverse_retract(M, p, n.v, n.inverse_retraction_method), + M, p, + ((α * n.γ) / (n.γ + α * n.μ)) * inverse_retract(M, p, n.v, n.inverse_retraction_method), + n.retraction_method, ) gradf_yk = get_gradient(mp, y) - xn = retract(M, y, -h * gradf_yk) + xn = retract(M, y, -h * gradf_yk, n.retraction_method) d = (((1 - α) * n.γ) / γbar) * inverse_retract(M, y, n.v, n.inverse_retraction_method) - (α / γbar) * gradf_yk - n.v = retract(M, y, d, s.retraction_method) + retract!(M, n.v, y, d, n.retraction_method) n.γ = 1 / (1 + n.shrinkage(k)) * γbar return h, (-1 / h) * inverse_retract(M, p, xn, n.inverse_retraction_method) # outer update end +function Base.show(io::IO, nr::NesterovRule) + print(io, "NesterovRule(; γ = ", nr.γ, ", μ = ", nr.μ, ", v = ", nr.v, ", shrinkage = ", nr.shrinkage) + return print(io, ", inverse_retraction_method = ", nr.inverse_retraction_method, ", retraction_method = ", nr.retraction_method, ")") +end +function status_summary(nr::NesterovRule; context::Symbol = :default) + (context === :short) && return repr(nr) + (context === :inline) && return "A Nesterov gradient direction processor using $(nr.retraction_method) and $(nr.inverse_retraction_method)" + return """ + Nesterov Rule + + ## Parameters + γ: $(_MANOPT_INDENT)$(nr.γ) + μ: $(_MANOPT_INDENT)$(nr.μ) + shrinkage: $(_MANOPT_INDENT)$(nr.shrinkage) + inverse_retraction_method:$(_MANOPT_INDENT)$(nr.inverse_retraction_method) + retraction_method: $(_MANOPT_INDENT)$(nr.retraction_method) + """ +end @doc """ Nesterov(; kwargs...) 
@@ -978,6 +1039,11 @@ mutable struct PreconditionedDirectionRule{ } <: DirectionUpdateRule preconditioner::F direction::D + function PreconditionedDirectionRule(; + preconditioner::F, direction::D, evaluation::E + ) where {E <: AbstractEvaluationType, D <: DirectionUpdateRule, F} + return new{E, D, F}(preconditioner, direction) + end end function PreconditionedDirectionRule( M::AbstractManifold, @@ -986,7 +1052,7 @@ function PreconditionedDirectionRule( evaluation::E = AllocatingEvaluation(), ) where {E <: AbstractEvaluationType, F} dir = _produce_type(direction, M) - return PreconditionedDirectionRule{E, typeof(dir), F}(preconditioner, dir) + return PreconditionedDirectionRule(; preconditioner = preconditioner, direction = dir, evaluation = evaluation) end function (pg::PreconditionedDirectionRule{AllocatingEvaluation})( mp::AbstractManoptProblem, s::AbstractGradientSolverState, k @@ -1008,6 +1074,23 @@ function (pg::PreconditionedDirectionRule{InplaceEvaluation})( pg.preconditioner(M, dir, p, dir) return step, dir end +function Base.show(io::IO, pg::PreconditionedDirectionRule{E}) where {E <: AbstractEvaluationType} + print(io, "PreconditionedDirectionRule(; direction = ", pg.direction, ", preconditioner = ", pg.preconditioner, ", ", _to_kw(E)) + return print(io, ")") +end +function status_summary(pg::PreconditionedDirectionRule; context::Symbol = :default) + (context === :short) && return repr(pg) + (context === :inline) && return "A preconditioner gradient processor" + return """ + Preconditioned Direction Rule + + ## Parameters + preconditioner: $(_MANOPT_INDENT)$(pg.preconditioner) + + ## Direction Rule + $(_in_str(status_summary(pg.direction; context = context); indent = 1, headers = 1)) + """ +end """ PreconditionedDirection(preconditioner; kwargs...) 
@@ -1082,10 +1165,13 @@ function (d::DebugGradient)(::AbstractManoptProblem, s::AbstractManoptSolverStat Printf.format(d.io, Printf.Format(d.format), get_gradient(s)) return nothing end -function show(io::IO, dg::DebugGradient) +function Base.show(io::IO, dg::DebugGradient) return print(io, "DebugGradient(; format=\"$(dg.format)\", at_init=$(dg.at_init))") end -status_summary(dg::DebugGradient) = "(:Gradient, \"$(dg.format)\")" +function status_summary(dg::DebugGradient; context::Symbol = :default) + (context === :short) && (return "(:Gradient, \"$(dg.format)\")") + return "A DebugAction to print the gradient at the current iterate “$(dg.format)”" +end @doc """ DebugGradientNorm <: DebugAction @@ -1108,9 +1194,7 @@ mutable struct DebugGradientNorm <: DebugAction function DebugGradientNorm(; long::Bool = false, prefix = long ? "Norm of the Gradient: " : "|grad f(p)|:", - format = "$prefix%s", - io::IO = stdout, - at_init::Bool = true, + format = "$prefix%s", io::IO = stdout, at_init::Bool = true, ) return new(io, format, at_init) end @@ -1126,11 +1210,13 @@ function (d::DebugGradientNorm)( ) return nothing end -function show(io::IO, dgn::DebugGradientNorm) +function Base.show(io::IO, dgn::DebugGradientNorm) return print(io, "DebugGradientNorm(; format=\"$(dgn.format)\", at_init=$(dgn.at_init))") end -status_summary(dgn::DebugGradientNorm) = "(:GradientNorm, \"$(dgn.format)\")" - +function status_summary(dgn::DebugGradientNorm; context::Symbol = :default) + (context === :short) && return "(:GradientNorm, \"$(dgn.format)\")" + return "A debug action to display the gradient norm (format. 
\"$(dgn.format)\")" +end @doc """ DebugStepsize <: DebugAction @@ -1162,10 +1248,13 @@ function (d::DebugStepsize)( Printf.format(d.io, Printf.Format(d.format), get_last_stepsize(p, s, k)) return nothing end -function show(io::IO, ds::DebugStepsize) - return print(io, "DebugStepsize(; format=\"$(ds.format)\", at_init=$(ds.at_init))") +function Base.show(io::IO, ds::DebugStepsize) + return print(io, "DebugStepsize(; format=\"$(escape_string(ds.format))\", at_init=$(ds.at_init))") +end +function status_summary(ds::DebugStepsize; context::Symbol = :default) + (context === :short) && return "(:Stepsize, \"$(escape_string(ds.format))\")" + return "A DebugAction that prints the current step size to $(ds.io) in format “$(escape_string(ds.format))”" end -status_summary(ds::DebugStepsize) = "(:Stepsize, \"$(ds.format)\")" # # Records # @@ -1189,16 +1278,22 @@ function (r::RecordGradient{T})( ) where {T} return record_or_reset!(r, get_gradient(s), k) end -show(io::IO, ::RecordGradient{T}) where {T} = print(io, "RecordGradient{$T}()") - +show(io::IO, ::RecordGradient{T}) where {T} = print(io, "RecordGradient($T)") +function status_summary(rg::RecordGradient; context::Symbol = :default) + (context === :short) && return ":Gradient" + return "A RecordAction to record the current gradient" +end @doc """ - RecordGradientNorm <: RecordAction + RecordGradientNorm{R<:Real} <: RecordAction record the norm of the current gradient + +## Constructor + RecordGradientNorm(r::Type{<:Real}=Float64) """ -mutable struct RecordGradientNorm <: RecordAction - recorded_values::Array{Float64, 1} - RecordGradientNorm() = new(Array{Float64, 1}()) +mutable struct RecordGradientNorm{R <: Real} <: RecordAction + recorded_values::Array{R, 1} + RecordGradientNorm(r::Type{<:Real} = Float64) = new{r}(Array{r, 1}()) end function (r::RecordGradientNorm)( mp::AbstractManoptProblem, ast::AbstractManoptSolverState, k::Int @@ -1207,16 +1302,28 @@ function (r::RecordGradientNorm)( return record_or_reset!(r, 
norm(M, get_iterate(ast), get_gradient(ast)), k) end show(io::IO, ::RecordGradientNorm) = print(io, "RecordGradientNorm()") +function status_summary(rg::RecordGradientNorm; context::Symbol = :default) + (context === :short) && return ":GradientNorm" + return "A RecordAction to record the current gradient norm" +end @doc """ RecordStepsize <: RecordAction -record the step size +record the step size. + +## Constructor + RecordStepsize(r::Type{<:Real}=Float64) """ -mutable struct RecordStepsize <: RecordAction - recorded_values::Array{Float64, 1} - RecordStepsize() = new(Array{Float64, 1}()) +mutable struct RecordStepsize{R <: Real} <: RecordAction + recorded_values::Array{R, 1} + RecordStepsize(r::Type{<:Real} = Float64) = new{r}(Array{r, 1}()) end function (r::RecordStepsize)(p::AbstractManoptProblem, s::AbstractGradientSolverState, k) return record_or_reset!(r, get_last_stepsize(p, s, k), k) end +show(io::IO, ::RecordStepsize{R}) where {R} = print(io, "RecordStepsize($R)") +function status_summary(rg::RecordStepsize{R}; context::Symbol = :default) where {R} + (context === :short) && return ":Stepsize" + return "A RecordAction to record the current stepsize (of type $R)" +end diff --git a/src/plans/hessian_plan.jl b/src/plans/hessian_plan.jl index c1adefaf50..7dd96ba7f4 100644 --- a/src/plans/hessian_plan.jl +++ b/src/plans/hessian_plan.jl @@ -23,9 +23,9 @@ specify a problem for Hessian based algorithms. 
# Fields * `cost`: a function ``f:$(_math(:Manifold))nifold)))→ℝ`` to minimize -* `gradient`: the gradient ``$(_tex(:grad))f:$(_math(:Manifold))) → $(_math(:TangentBundle))`` of the cost function ``f`` -* `hessian`: the Hessian ``$(_tex(:Hess))f(x)[⋅]: $(_math(:TangentSpace; p = "x")) → $(_math(:TangentSpace; p = "x"))`` of the cost function ``f`` -* `preconditioner`: the symmetric, positive definite preconditioner +* `gradient!!`: the gradient ``$(_tex(:grad))f:$(_math(:Manifold))) → $(_math(:TangentBundle))`` of the cost function ``f`` +* `hessian!!`: the Hessian ``$(_tex(:Hess))f(x)[⋅]: $(_math(:TangentSpace; p = "x")) → $(_math(:TangentSpace; p = "x"))`` of the cost function ``f`` +* `preconditioner!!`: the symmetric, positive definite preconditioner as an approximation of the inverse of the Hessian of ``f``, a map with the same input variables as the `hessian` to numerically stabilize iterations when the Hessian is ill-conditioned @@ -56,13 +56,7 @@ struct ManifoldHessianObjective{T <: AbstractEvaluationType, C, G, H, Pre} <: precond = nothing; evaluation::AbstractEvaluationType = AllocatingEvaluation(), ) where {C, G, H} - if isnothing(precond) - if evaluation isa InplaceEvaluation - precond = (M, Y, p, X) -> (Y .= X) - else - precond = (M, p, X) -> X - end - end + # We store `Nothing` as a type for the preconditioner return new{typeof(evaluation), C, G, H, typeof(precond)}(cost, grad, hess, precond) end end @@ -173,7 +167,6 @@ end function get_preconditioner!(amp::AbstractManoptProblem, Y, p, X) return get_preconditioner!(get_manifold(amp), Y, get_objective(amp), p, X) end - @doc """ get_preconditioner(M::AbstractManifold, mho::ManifoldHessianObjective, p, X) @@ -185,12 +178,14 @@ tangent vector `X`. function get_preconditioner( M::AbstractManifold, mho::ManifoldHessianObjective{AllocatingEvaluation}, p, X ) + isnothing(mho.preconditioner!!) 
&& return (copy(M, p, X)) return mho.preconditioner!!(M, p, X) end function get_preconditioner( M::AbstractManifold, mho::ManifoldHessianObjective{InplaceEvaluation}, p, X ) Y = zero_vector(M, p) + isnothing(mho.preconditioner!!) && return copyto!(M, Y, p, X) mho.preconditioner!!(M, Y, p, X) return Y end @@ -203,7 +198,7 @@ end function get_preconditioner!( M::AbstractManifold, Y, mho::ManifoldHessianObjective{AllocatingEvaluation}, p, X ) - copyto!(M, Y, p, mho.preconditioner!!(M, p, X)) + copyto!(M, Y, p, isnothing(mho.preconditioner!!) ? X : mho.preconditioner!!(M, p, X)) return Y end function get_preconditioner!( @@ -214,14 +209,36 @@ end function get_preconditioner!( M::AbstractManifold, Y, mho::ManifoldHessianObjective{InplaceEvaluation}, p, X ) - mho.preconditioner!!(M, Y, p, X) - return Y + return isnothing(mho.preconditioner!!) ? copyto!(M, Y, p, X) : mho.preconditioner!!(M, Y, p, X) end update_hessian!(M, f, p, p_proposal, X) = f update_hessian_basis!(M, f, p) = f +function status_summary(mho::ManifoldHessianObjective{E}; context::Symbol = :default) where {E} + _is_inline(context) && return "A second order objective with cost, gradient$(isnothing(mho.preconditioner!!) ? ", and" : ",") Hessian$(isnothing(mho.preconditioner!!) ? "" : ", and a preconditioner")" + precon_str = isnothing(mho.preconditioner!!) ? "" : "\n* preconditioner: $(mho.preconditioner!!)" + return """ + A second order objective providing a cost, a gradient$(isnothing(mho.preconditioner!!) ? ", and" : ",") a Hessian$(isnothing(mho.preconditioner!!) ? "" : ", and a preconditioner") + + ## Functions + * cost: $(_MANOPT_INDENT)$(mho.cost) + * gradient:$(_MANOPT_INDENT)$(mho.gradient!!) + * Hessian: $(_MANOPT_INDENT)$(mho.hessian!!)$(precon_str)""" +end + +function Base.show(io::IO, mho::ManifoldHessianObjective{E}) where {E} + print(io, "ManifoldHessianObjective(") + print(io, "$(mho.cost), ") + print(io, "$(mho.gradient!!), ") + print(io, "$(mho.hessian!!)") + !isnothing(mho.preconditioner!!) 
&& print(io, ", $(mho.preconditioner!!)") + print(io, "; ") + print(io, _to_kw(E)) + return print(io, ")") +end + @doc """ AbstractApproxHessian <: Function @@ -275,8 +292,7 @@ $(_kwargs(:evaluation)) * `steplength=`2^{-14}``: step length ``c`` to approximate the gradient evaluations $(_kwargs([:retraction_method, :vector_transport_method])) """ -mutable struct ApproxHessianFiniteDifference{E, P, T, G, RTR, VTR, R <: Real} <: - AbstractApproxHessian +mutable struct ApproxHessianFiniteDifference{E, P, T, G, RTR, VTR, R <: Real} <: AbstractApproxHessian p_dir::P gradient!!::G grad_tmp::T @@ -364,8 +380,7 @@ $(_kwargs(:vector_transport_method)). * `nu` (`-1`) $(_kwargs([:evaluation, :vector_transport_method])) """ -mutable struct ApproxHessianSymmetricRankOne{E, P, G, T, B <: AbstractBasis{ℝ}, VTR, R <: Real} <: - AbstractApproxHessian +mutable struct ApproxHessianSymmetricRankOne{E, P, G, T, B <: AbstractBasis{ℝ}, VTR, R <: Real} <: AbstractApproxHessian p_tmp::P gradient!!::G grad_tmp::T diff --git a/src/plans/higher_order_primal_dual_plan.jl b/src/plans/higher_order_primal_dual_plan.jl index 0ba1bcedac..f3521be5f8 100644 --- a/src/plans/higher_order_primal_dual_plan.jl +++ b/src/plans/higher_order_primal_dual_plan.jl @@ -31,35 +31,15 @@ mutable struct PrimalDualManifoldSemismoothNewtonObjective{ Λ!!::L end function PrimalDualManifoldSemismoothNewtonObjective( - cost, - prox_F, - diff_prox_F, - prox_G_dual, - diff_prox_G_dual, - linearized_forward_operator, - adjoint_linearized_operator; - Λ = missing, - evaluation::AbstractEvaluationType = AllocatingEvaluation(), - ) + cost::C, prox_F::PF, diff_prox_F::DPF, prox_G_dual::PG, diff_prox_G_dual::DPG, + linearized_forward_operator::LFO, adjoint_linearized_operator::AL; + Λ::L = missing, evaluation::E = AllocatingEvaluation(), + ) where {C, PF, DPF, PG, DPG, LFO, AL, L, E <: AbstractEvaluationType} return PrimalDualManifoldSemismoothNewtonObjective{ - typeof(evaluation), - typeof(cost), - typeof(prox_F), - 
typeof(diff_prox_F), - typeof(prox_G_dual), - typeof(diff_prox_G_dual), - typeof(linearized_forward_operator), - typeof(adjoint_linearized_operator), - typeof(Λ), + E, C, PF, DPF, PG, DPG, LFO, AL, L, }( - cost, - prox_F, - diff_prox_F, - prox_G_dual, - diff_prox_G_dual, - linearized_forward_operator, - adjoint_linearized_operator, - Λ, + cost, prox_F, diff_prox_F, prox_G_dual, + diff_prox_G_dual, linearized_forward_operator, adjoint_linearized_operator, Λ, ) end @@ -107,12 +87,8 @@ $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(50)`")) $(_kwargs(:vector_transport_method)) """ mutable struct PrimalDualSemismoothNewtonState{ - P, - Q, - T, - RM <: AbstractRetractionMethod, - IRM <: AbstractInverseRetractionMethod, - VTM <: AbstractVectorTransportMethod, + P, Q, T, RM <: AbstractRetractionMethod, + IRM <: AbstractInverseRetractionMethod, VTM <: AbstractVectorTransportMethod, } <: AbstractPrimalDualSolverState m::P n::Q @@ -127,13 +103,9 @@ mutable struct PrimalDualSemismoothNewtonState{ retraction_method::RM inverse_retraction_method::IRM vector_transport_method::VTM - function PrimalDualSemismoothNewtonState( M::AbstractManifold; - m::P = rand(M), - n::Q = rand(N), - p::P = rand(M), - X::T = zero_vector(M, p), + m::P = rand(M), n::Q = rand(N), p::P = rand(M), X::T = zero_vector(M, p), primal_stepsize::Float64 = 1 / sqrt(8), dual_stepsize::Float64 = 1 / sqrt(8), regularization_parameter::Float64 = 1.0e-5, @@ -144,56 +116,18 @@ mutable struct PrimalDualSemismoothNewtonState{ inverse_retraction_method::IRM = default_inverse_retraction_method(M, typeof(p)), vector_transport_method::VTM = default_vector_transport_method(M, typeof(p)), ) where { - P, - Q, - T, - RM <: AbstractRetractionMethod, - IRM <: AbstractInverseRetractionMethod, - VTM <: AbstractVectorTransportMethod, + P, Q, T, RM <: AbstractRetractionMethod, + IRM <: AbstractInverseRetractionMethod, VTM <: AbstractVectorTransportMethod, } return new{P, Q, T, RM, IRM, VTM}( - m, - n, - p, 
- X, - primal_stepsize, - dual_stepsize, - regularization_parameter, - stopping_criterion, - update_primal_base, - update_dual_base, - retraction_method, - inverse_retraction_method, - vector_transport_method, + m, n, p, X, + primal_stepsize, dual_stepsize, regularization_parameter, + stopping_criterion, update_primal_base, update_dual_base, + retraction_method, inverse_retraction_method, vector_transport_method, ) end end -function show(io::IO, pdsns::PrimalDualSemismoothNewtonState) - i = get_count(pdsns, :Iterations) - Iter = (i > 0) ? "After $i iterations\n" : "" - Conv = indicates_convergence(pdsns.stop) ? "Yes" : "No" - s = """ - # Solver state for `Manopt.jl`s primal dual semismooth Newton - $Iter - ## Parameters - * primal_stepsize: $(pdsns.primal_stepsize) - * dual_stepsize: $(pdsns.dual_stepsize) - * regularization_parameter: $(pdsns.regularization_parameter) - * retraction_method: $(pdsns.retraction_method) - * inverse_retraction_method: $(pdsns.inverse_retraction_method) - * vector_transport_method: $(pdsns.vector_transport_method) - ## Stopping criterion - - $(status_summary(pdsns.stop)) - This indicates convergence: $Conv""" - return print(io, s) -end -get_iterate(pdsn::PrimalDualSemismoothNewtonState) = pdsn.p -function set_iterate!(pdsn::PrimalDualSemismoothNewtonState, p) - pdsn.p = p - return pdsn -end @doc """ y = get_differential_primal_prox(M::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective σ, x) get_differential_primal_prox!(p::TwoManifoldProblem, y, σ, x) @@ -224,19 +158,13 @@ end function get_differential_primal_prox( M::AbstractManifold, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, - σ, - p, - X, + pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, σ, p, X, ) return pdsno.diff_prox_f!!(M, σ, p, X) end function get_differential_primal_prox( M::AbstractManifold, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, - σ, - p, - X, + 
pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, σ, p, X, ) Y = allocate_result(M, get_differential_primal_prox, p, X) pdsno.diff_prox_f!!(M, Y, σ, p, X) @@ -248,23 +176,15 @@ function get_differential_primal_prox( return get_differential_primal_prox(M, get_objective(admo, false), σ, p, X) end function get_differential_primal_prox!( - M::AbstractManifold, - Y, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, - σ, - p, - X, + M::AbstractManifold, Y, + pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, σ, p, X, ) copyto!(M, Y, p, pdsno.diff_prox_f!!(M, σ, p, X)) return Y end function get_differential_primal_prox!( - M::AbstractManifold, - Y, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, - σ, - p, - X, + M::AbstractManifold, Y, + pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, σ, p, X, ) pdsno.diff_prox_f!!(M, Y, σ, p, X) return Y @@ -288,9 +208,7 @@ D$(_tex(:prox))_{τG_n^*}(X)[ξ] which can also be computed in place of `η`. 
""" get_differential_dual_prox( - ::AbstractManifold, - ::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, - Any..., + ::AbstractManifold, ::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, Any..., ) function get_differential_dual_prox(tmo::TwoManifoldProblem, n, τ, X, ξ) @@ -307,21 +225,13 @@ end function get_differential_dual_prox( N::AbstractManifold, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, - n, - τ, - X, - ξ, + pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, n, τ, X, ξ, ) return pdsno.diff_prox_g_dual!!(N, n, τ, X, ξ) end function get_differential_dual_prox( N::AbstractManifold, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, - n, - τ, - X, - ξ, + pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, n, τ, X, ξ, ) η = allocate_result(N, get_differential_dual_prox, X, ξ) pdsno.diff_prox_g_dual!!(N, η, n, τ, X, ξ) @@ -333,25 +243,15 @@ function get_differential_dual_prox( return get_differential_dual_prox(M, get_objective(admo, false), n, τ, X, ξ) end function get_differential_dual_prox!( - N::AbstractManifold, - η, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, - n, - τ, - X, - ξ, + N::AbstractManifold, η, + pdsno::PrimalDualManifoldSemismoothNewtonObjective{AllocatingEvaluation}, n, τ, X, ξ, ) copyto!(N, n, η, pdsno.diff_prox_g_dual!!(N, n, τ, X, ξ)) return η end function get_differential_dual_prox!( - N::AbstractManifold, - η, - pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, - n, - τ, - X, - ξ, + N::AbstractManifold, η, + pdsno::PrimalDualManifoldSemismoothNewtonObjective{InplaceEvaluation}, n, τ, X, ξ, ) pdsno.diff_prox_g_dual!!(N, η, n, τ, X, ξ) return η @@ -361,3 +261,63 @@ function get_differential_dual_prox!( ) return get_differential_dual_prox!(M, η, get_objective(admo, false), n, τ, X, ξ) end + +get_iterate(pdsn::PrimalDualSemismoothNewtonState) = pdsn.p + 
+function set_iterate!(pdsn::PrimalDualSemismoothNewtonState, p) + pdsn.p = p + return pdsn +end + +function status_summary(pdsns::PrimalDualSemismoothNewtonState; context::Symbol = :default) + i = get_count(pdsns, :Iterations) + Iter = (i > 0) ? "After $i iterations\n" : "" + Conv = indicates_convergence(pdsns.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(pdsns)) – $(Iter) $(has_converged(pdsns) ? "(converged)" : "")") + s = """ + # Solver state for `Manopt.jl`s primal dual semismooth Newton + $Iter + ## Parameters + * primal_stepsize: $(_MANOPT_INDENT)$(pdsns.primal_stepsize) + * dual_stepsize: $(_MANOPT_INDENT)$(pdsns.dual_stepsize) + * regularization_parameter: $(_MANOPT_INDENT)$(pdsns.regularization_parameter) + * retraction_method: $(_MANOPT_INDENT)$(pdsns.retraction_method) + * inverse_retraction_method:$(_MANOPT_INDENT)$(pdsns.inverse_retraction_method) + * vector_transport_method: $(_MANOPT_INDENT)$(pdsns.vector_transport_method) + + ## Stopping criterion + $(_in_str(status_summary(pdsns.stop; context = context); indent = 0, headers = 1)) + This indicates convergence: $Conv""" + return s +end +function Base.show(io::IO, pdmssno::PrimalDualManifoldSemismoothNewtonObjective{E}) where {E} + print(io, "PrimalDualManifoldSemismoothNewtonObjective(") + print(io, pdmssno.cost); print(io, ", ") + print(io, pdmssno.prox_f!!); print(io, ", ") + print(io, pdmssno.diff_prox_f!!); print(io, ", ") + print(io, pdmssno.prox_g_dual!!); print(io, ", ") + print(io, pdmssno.diff_prox_g_dual!!); print(io, ", ") + print(io, pdmssno.linearized_forward_operator!!); print(io, ", ") + print(io, pdmssno.adjoint_linearized_operator!!); print(io, "; ") + if !ismissing(pdmssno.Λ!!) 
+ print(io, "Λ = "); print(io, pdmssno.Λ!!); print(io, ", ") + end + print(io, _to_kw(E)) + return print(io, ")") +end +function status_summary(pdmssno::PrimalDualManifoldSemismoothNewtonObjective; context::Symbol = :default) + (context === :short) && return repr(pdmssno) + (context === :inline) && return "A primal dual semismooth Newton objective" + Λs = ismissing(pdmssno.Λ!!) ? "" : "\n* Λ: $(_MANOPT_INDENT)$(pdmssno.Λ!!)" + return """ + A primal dual semismooth Newton objective + + ## Functions + * cost: $(_MANOPT_INDENT)$(pdmssno.cost) + * prox_f: $(_MANOPT_INDENT)$(pdmssno.prox_f!!) + * D prox_f: $(_MANOPT_INDENT)$(pdmssno.diff_prox_f!!) + * prox_g*: $(_MANOPT_INDENT)$(pdmssno.prox_g_dual!!) + * D prox_g*: $(_MANOPT_INDENT)$(pdmssno.diff_prox_g_dual!!) + * lin. forward Op: $(_MANOPT_INDENT)$(pdmssno.linearized_forward_operator!!) + * adj. lin. fw. Op.:$(_MANOPT_INDENT)$(pdmssno.adjoint_linearized_operator!!)$(Λs)""" +end diff --git a/src/plans/interior_point_Newton_plan.jl b/src/plans/interior_point_Newton_plan.jl index 5a81fe60a4..630c2b02aa 100644 --- a/src/plans/interior_point_Newton_plan.jl +++ b/src/plans/interior_point_Newton_plan.jl @@ -21,12 +21,18 @@ if these are different from the iterate and search direction of the main solver. 
struct StepsizeState{P, T} <: AbstractManoptSolverState p::P X::T + StepsizeState(; p::P, X::T) where {P, T} = new{P, T}(p, X) end -StepsizeState(M::AbstractManifold; p = rand(M), X = zero_vector(M, p)) = StepsizeState(p, X) +StepsizeState(M::AbstractManifold; p = rand(M), X = zero_vector(M, p)) = StepsizeState(; p = p, X = X) get_iterate(s::StepsizeState) = s.p get_gradient(s::StepsizeState) = s.X set_iterate!(s::StepsizeState, M, p) = copyto!(M, s.p, p) set_gradient!(s::StepsizeState, M, p, X) = copyto!(M, s.X, p, X) +Base.show(io::IO, sss::StepsizeState) = print(io, "StepsizeState(; p = ", sss.p, ", X = ", sss.X, ")") +function status_summary(sss::StepsizeState{P, T}; context::Symbol = :default) where {P, T} + (context === :short) && return repr(sss) + return "A state for a stepsize problem." +end @doc """ InteriorPointNewtonState{P,T} <: AbstractHessianSolverState @@ -53,15 +59,24 @@ $(_fields([:retraction_method, :stepsize])) # Constructor InteriorPointNewtonState( - M::AbstractManifold, - cmo::ConstrainedManifoldObjective, - sub_problem::Pr, - sub_state::St; + M::AbstractManifold, cmo::ConstrainedManifoldObjective, sub_problem::Pr, sub_state::St; + kwargs... + ) + InteriorPointNewtonState( + M::AbstractManifold, cmo::ConstrainedManifoldObjective, sub_problem::Pr; + evaluation = AllocatingEvaluation(), kwargs... + ) + InteriorPointNewtonState( + sub_problem::Pr, sub_state::St; kwargs... ) Initialize the state, where both the [`AbstractManifold`](@extref `ManifoldsBase.AbstractManifold`) and the [`ConstrainedManifoldObjective`](@ref) are used to fill in reasonable defaults for the keywords. +For a closed form solution of the sub solver, you can provide the evaluation either as `St` in the first +constructor or as a keyword like in the second. 
+The third constructor is considered an internal constructor accepting the same keywords, +but the keywords that are otherwise filled by defaults based on `M` or `cmo` become mandatory. # Input @@ -74,14 +89,14 @@ $(_args([:sub_problem, :sub_state])) Let `m` and `n` denote the number of inequality and equality constraints, respectively $(_kwargs(:p; add_properties = [:as_Initial])) -* `μ=ones(m)` +* `μ=ones(m)`: Lagrange multipliers for the inequality constraints +* `λ=zeros(n)`: Lagrange multipliers for the equality constraints * `X=`[`zero_vector`](@extref `ManifoldsBase.zero_vector-Tuple{AbstractManifold, Any}`)`(M,p)` -* `Y=zero(μ)` -* `λ=zeros(n)` -* `Z=zero(λ)` -* `s=ones(m)` -* `W=zero(s)` -* `ρ=μ's/m` +* `Y=zero(μ)`: tangent vector (gradient) for the inequality constraints +* `Z=zero(λ)`: tangent vector (gradient) for the equality constraints +* `s=ones(m)`: slack variables for the inequality constraints +* `W=zero(s)`: tangent vector (gradient) for the slack variables +* `ρ=μ's/m`: storage for the orthogonality check * `σ=`[`calculate_σ`](@ref)`(M, cmo, p, μ, λ, s)` $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(200)`[` | `](@ref StopWhenAny)[`StopWhenChangeLess`](@ref)`(1e-8)")) $(_kwargs(:retraction_method)) @@ -96,17 +111,10 @@ $(_kwargs(:stepsize; default = " `[`ArmijoLinesearch`](@ref)`()")) and internally `_step_M` and `_step_p` for the manifold and point in the stepsize. 
""" mutable struct InteriorPointNewtonState{ - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - V, - R <: Real, - SC <: StoppingCriterion, - TRTM <: AbstractRetractionMethod, - TStepsize <: Stepsize, - TStepPr <: AbstractManoptProblem, - TStepSt <: AbstractManoptSolverState, + P, T, Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + V, R <: Real, + SC <: StoppingCriterion, TRTM <: AbstractRetractionMethod, TStepsize <: Stepsize, + TStepPr <: AbstractManoptProblem, TStepSt <: AbstractManoptSolverState, } <: AbstractHessianSolverState p::P X::T @@ -127,76 +135,66 @@ mutable struct InteriorPointNewtonState{ step_state::TStepSt is_feasible_error::Symbol function InteriorPointNewtonState( - M::AbstractManifold, - cmo::ConstrainedManifoldObjective, - sub_problem::Pr, - sub_state::St; - p::P = rand(M), - X::T = zero_vector(M, p), - μ::V = ones(length(get_inequality_constraint(M, cmo, p, :))), - Y::V = zero(μ), - λ::V = zeros(length(get_equality_constraint(M, cmo, p, :))), - Z::V = zero(λ), - s::V = ones(length(get_inequality_constraint(M, cmo, p, :))), - W::V = zero(s), - ρ::R = μ's / length(get_inequality_constraint(M, cmo, p, :)), - σ::R = calculate_σ(M, cmo, p, μ, λ, s), + sub_problem::Pr, sub_state::St; + p::P, X::T, μ::V, λ::V, s::V, + ρ::R, σ::R, is_feasible_error::Symbol = :error, + Y::V = zero(μ), Z::V = zero(λ), W::V = zero(s), stopping_criterion::SC = StopAfterIteration(200) | StopWhenChangeLess(1.0e-8), + retraction_method::RTM, + step_problem::StepPr, step_state::StepSt, stepsize::S, kwargs... 
+ ) where { + P, T, V, R, + Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + StepPr <: AbstractManoptProblem, StepSt <: AbstractManoptSolverState, + SC <: StoppingCriterion, RTM <: AbstractRetractionMethod, S <: Stepsize, + } + ips = new{P, T, Pr, St, V, R, SC, RTM, S, StepPr, StepSt}() + ips.p = p; ips.sub_problem = sub_problem; ips.sub_state = sub_state + ips.μ = μ; ips.λ = λ; ips.s = s; ips.ρ = ρ; ips.σ = σ + ips.X = X; ips.Y = Y; ips.Z = Z; ips.W = W + ips.stop = stopping_criterion + ips.retraction_method = retraction_method + ips.stepsize = stepsize + ips.step_problem = step_problem; ips.step_state = step_state + ips.is_feasible_error = is_feasible_error + return ips + end + function InteriorPointNewtonState( + M::AbstractManifold, cmo::ConstrainedManifoldObjective, sub_problem::Pr, sub_state::St; + p = rand(M), X = zero_vector(M, p), + μ = ones(length(get_inequality_constraint(M, cmo, p, :))), + λ = zeros(length(get_equality_constraint(M, cmo, p, :))), + s = ones(length(get_inequality_constraint(M, cmo, p, :))), + ρ = μ's / length(get_inequality_constraint(M, cmo, p, :)), + σ = calculate_σ(M, cmo, p, μ, λ, s), retraction_method::RTM = default_retraction_method(M), step_objective = ManifoldGradientObjective( - KKTVectorFieldNormSq(cmo), - KKTVectorFieldNormSqGradient(cmo); + KKTVectorFieldNormSq(cmo), KKTVectorFieldNormSqGradient(cmo); evaluation = InplaceEvaluation(), ), vector_space = Rn, - _step_M = M × vector_space(length(μ)) × vector_space(length(λ)) × - vector_space(length(s)), + _step_M = M × vector_space(length(μ)) × vector_space(length(λ)) × vector_space(length(s)), step_problem::StepPr = DefaultManoptProblem(_step_M, step_objective), _step_p = rand(_step_M), - step_state::StepSt = StepsizeState(_step_p, zero_vector(_step_M, _step_p)), - centrality_condition::F = (N, p) -> true, + step_state::StepSt = StepsizeState(; p = _step_p, X = zero_vector(_step_M, _step_p)), + centrality_condition = (N, p) -> true, stepsize::S 
= ArmijoLinesearchStepsize( get_manifold(step_problem); retraction_method = default_retraction_method(get_manifold(step_problem)), - initial_stepsize = 1.0, - additional_decrease_condition = centrality_condition, + initial_stepsize = 1.0, additional_decrease_condition = centrality_condition, ), - is_feasible_error::Symbol = :error, kwargs..., ) where { - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - V, - R, - F, - SC <: StoppingCriterion, - StepPr <: AbstractManoptProblem, - StepSt <: AbstractManoptSolverState, - RTM <: AbstractRetractionMethod, - S <: Stepsize, + Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + RTM <: AbstractRetractionMethod, S <: Stepsize, + StepPr <: AbstractManoptProblem, StepSt <: AbstractManoptSolverState, } - ips = new{P, T, Pr, St, V, R, SC, RTM, S, StepPr, StepSt}() - ips.p = p - ips.sub_problem = sub_problem - ips.sub_state = sub_state - ips.μ = μ - ips.λ = λ - ips.s = s - ips.ρ = ρ - ips.σ = σ - ips.X = X - ips.Y = Y - ips.Z = Z - ips.W = W - ips.stop = stopping_criterion - ips.retraction_method = retraction_method - ips.stepsize = stepsize - ips.step_problem = step_problem - ips.step_state = step_state - ips.is_feasible_error = is_feasible_error - return ips + return InteriorPointNewtonState( + sub_problem, sub_state; + p = p, X = X, μ = μ, λ = λ, s = s, ρ = ρ, σ = σ, retraction_method = retraction_method, + step_problem = step_problem, step_state = step_state, stepsize = stepsize, + kwargs... + ) end end function InteriorPointNewtonState( @@ -226,10 +224,11 @@ function get_message(ips::InteriorPointNewtonState) return get_message(ips.stepsize) end # pretty print state info -function show(io::IO, ips::InteriorPointNewtonState) +function status_summary(ips::InteriorPointNewtonState; context::Symbol = :default) i = get_count(ips, :Iterations) Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(ips.stop) ? 
"Yes" : "No" + _is_inline(context) && (return "$(repr(ips)) – $(Iter) $(has_converged(ips) ? "(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Interior Point Newton Method $Iter @@ -238,15 +237,22 @@ function show(io::IO, ips::InteriorPointNewtonState) * σ: $(ips.σ) * retraction method: $(ips.retraction_method) - ## Stopping criterion - $(status_summary(ips.stop)) ## Stepsize - $(ips.stepsize) - This indicates convergence: $Conv - """ - return print(io, s) -end + $(_in_str(status_summary(ips.stepsize; context = context); indent = 1, headers = 1)) + ## Stopping criterion + $(_in_str(status_summary(ips.stop; context = context); indent = 1, headers = 1)) This indicates convergence: $Conv""" + return s +end +function Base.show(io::IO, ipns::InteriorPointNewtonState) + print(io, "InteriorPointNewtonState(", ipns.sub_problem, ", ", ipns.sub_state, ";") + print(io, " is_feasible_error = ", ipns.is_feasible_error, ", retraction_method = ", ipns.retraction_method) + print(io, ", p = ", ipns.p, ", X = ", ipns.X, ", μ = ", ipns.μ, ", Y = ", ipns.Y) + print(io, ", λ = ", ipns.λ, ", Z = ", ipns.Z, ", s = ", ipns.s, ", W = ", ipns.W) + print(io, ", ρ = ", ipns.ρ, ", σ = ", ipns.σ, ", step_problem = ", ipns.step_problem) + print(io, ", step_state = ", ipns.step_state) + return print(io, ")") +end # # Constraint functors # @@ -309,8 +315,7 @@ b(p,λ) = $( CondensedKKTVectorField(cmo, μ, s, β) """ -mutable struct CondensedKKTVectorField{O <: ConstrainedManifoldObjective, T, R} <: - AbstractConstrainedSlackFunctor{T, R} +mutable struct CondensedKKTVectorField{O <: ConstrainedManifoldObjective, T, R} <: AbstractConstrainedSlackFunctor{T, R} cmo::O μ::T s::T @@ -347,11 +352,17 @@ function (cKKTvf::CondensedKKTVectorField)(N, Y, q) end return Y end - -function show(io::IO, CKKTvf::CondensedKKTVectorField) - return print( - io, "CondensedKKTVectorField\n\twith μ=$(CKKTvf.μ), s=$(CKKTvf.s), β=$(CKKTvf.β)" - ) +function status_summary(CKKTvf::CondensedKKTVectorField; 
context::Symbol = :default) + _is_inline(context) && (return repr(CKKTvf)) + return """ + The condensed KKT vector field for the constrained objective + $(_in_str(status_summary(CKKTvf.cmo; context = context); indent = 1)) + with μ=$(CKKTvf.μ), s=$(CKKTvf.s), β=$(CKKTvf.β)""" +end +function Base.show(io::IO, CKKTvf::CondensedKKTVectorField) + print(io, "CondensedKKTVectorField(") + print(io, CKKTvf.cmo) + return print(io, ", $(CKKTvf.μ), $(CKKTvf.s), $(CKKTvf.β))") end @doc """ @@ -410,8 +421,7 @@ $( CondensedKKTVectorFieldJacobian(cmo, μ, s, β) """ -mutable struct CondensedKKTVectorFieldJacobian{O <: ConstrainedManifoldObjective, T, R} <: - AbstractConstrainedSlackFunctor{T, R} +mutable struct CondensedKKTVectorFieldJacobian{O <: ConstrainedManifoldObjective, T, R} <: AbstractConstrainedSlackFunctor{T, R} cmo::O μ::T s::T @@ -453,11 +463,17 @@ function (cKKTvfJ::CondensedKKTVectorFieldJacobian)(N, Y, q, X) end return Y end -function show(io::IO, CKKTvfJ::CondensedKKTVectorFieldJacobian) - return print( - io, - "CondensedKKTVectorFieldJacobian\n\twith μ=$(CKKTvfJ.μ), s=$(CKKTvfJ.s), β=$(CKKTvfJ.β)", - ) +function status_summary(CKKTvfJ::CondensedKKTVectorFieldJacobian; context::Symbol = :default) + _is_inline(context) && (return repr(CKKTvfJ)) + return """ + The Jacobian of the condensed KKT vector field for the constrained objective + $(_in_str(status_summary(CKKTvfJ.cmo; context = context); indent = 1)) + with μ=$(CKKTvfJ.μ), s=$(CKKTvfJ.s), β=$(CKKTvfJ.β)""" +end +function Base.show(io::IO, CKKTvfJ::CondensedKKTVectorFieldJacobian) + print(io, "CondensedKKTVectorFieldJacobian(") + print(io, CKKTvfJ.cmo) + return print(io, ", $(CKKTvfJ.μ), $(CKKTvfJ.s), $(CKKTvfJ.β))") end @doc """ @@ -533,8 +549,14 @@ function (KKTvf::KKTVectorField)(N, Y, q) (m > 0) && (Y4 .= μ .* s) return Y end -function show(io::IO, KKTvf::KKTVectorField) - return print(io, "KKTVectorField\nwith the objective\n\t$(KKTvf.cmo)") +function Base.show(io::IO, KKTvf::KKTVectorField) + print(io, 
"KKTVectorField(") + print(io, KKTvf.cmo) + return print(io, ")") +end +function status_summary(KKTvf::KKTVectorField; context::Symbol = :default) + _is_inline(context) && (return repr(KKTvf)) + return "The KKT vector field for the constrained objective\n$(_MANOPT_INDENT)$(status_summary(KKTvf.cmo; context = context))" end @doc """ @@ -611,8 +633,14 @@ function (KKTvfJ::KKTVectorFieldJacobian)(N, Z, q, Y) Z4 .= μ .* Y4 .+ s .* Y2 return Z end -function show(io::IO, KKTvfJ::KKTVectorFieldJacobian) - return print(io, "KKTVectorFieldJacobian\nwith the objective\n\t$(KKTvfJ.cmo)") +function Base.show(io::IO, KKTvfJ::KKTVectorFieldJacobian) + print(io, "KKTVectorFieldJacobian(") + print(io, KKTvfJ.cmo) + return print(io, ")") +end +function status_summary(KKTvfJ::KKTVectorFieldJacobian; context::Symbol = :default) + _is_inline(context) && (return repr(KKTvfJ)) + return "The Jacobian of the KKT vector field for the constrained objective\n$(_MANOPT_INDENT)$(status_summary(KKTvfJ.cmo; context = context))" end @doc """ @@ -688,8 +716,14 @@ function (KKTvfAdJ::KKTVectorFieldAdjointJacobian)(N, Z, q, Y) Z4 .= μ .* Y4 .+ Y2 return Z end -function show(io::IO, KKTvfAdJ::KKTVectorFieldAdjointJacobian) - return print(io, "KKTVectorFieldAdjointJacobian\nwith the objective\n\t$(KKTvfAdJ.cmo)") +function Base.show(io::IO, KKTvfAdJ::KKTVectorFieldAdjointJacobian) + print(io, "KKTVectorFieldAdjointJacobian(") + print(io, KKTvfAdJ.cmo) + return print(io, ")") +end +function status_summary(KKTvfAdJ::KKTVectorFieldAdjointJacobian; context::Symbol = :default) + _is_inline(context) && (return repr(KKTvfAdJ)) + return "The adjoint Jacobian of the KKT vector field for the constrained objective\n$(_MANOPT_INDENT)$(status_summary(KKTvfAdJ.cmo; context = context))" end @doc """ @@ -721,8 +755,14 @@ function (KKTvc::KKTVectorFieldNormSq)(N, q) KKTVectorField(KKTvc.cmo)(N, Y, q) return inner(N, q, Y, Y) end -function show(io::IO, KKTvfNSq::KKTVectorFieldNormSq) - return print(io, 
"KKTVectorFieldNormSq\nwith the objective\n\t$(KKTvfNSq.cmo)") +function Base.show(io::IO, KKTvfNSq::KKTVectorFieldNormSq) + print(io, "KKTVectorFieldNormSq(") + print(io, KKTvfNSq.cmo) + return print(io, ")") +end +function status_summary(KKTvfNSq::KKTVectorFieldNormSq; context::Symbol = :default) + _is_inline(context) && (return repr(KKTvfNSq)) + return "The squared norm of the KKT vector field for the constrained objective\n$(_MANOPT_INDENT)$(status_summary(KKTvfNSq.cmo; context = context))" end @doc """ @@ -789,12 +829,15 @@ function (KKTcfNG::KKTVectorFieldNormSqGradient)(N, Y, q) Y .*= 2 return Y end -function show(io::IO, KKTvfNSqGrad::KKTVectorFieldNormSqGradient) - return print( - io, "KKTVectorFieldNormSqGradient\nwith the objective\n\t$(KKTvfNSqGrad.cmo)" - ) +function Base.show(io::IO, KKTvfNSqGrad::KKTVectorFieldNormSqGradient) + print(io, "KKTVectorFieldNormSqGradient(") + print(io, KKTvfNSqGrad.cmo) + return print(io, ")") +end +function status_summary(KKTvfNSqGrad::KKTVectorFieldNormSqGradient; context::Symbol = :default) + _is_inline(context) && (return repr(KKTvfNSqGrad)) + return "The gradient of the squared norm of the KKT vector field for the constrained objective\n$(_MANOPT_INDENT)$(status_summary(KKTvfNSqGrad.cmo; context = context))" end - # # # A special linesearch for IP Newton @@ -974,14 +1017,14 @@ function get_reason(c::StopWhenKKTResidualLess) end return "" end -function status_summary(swrr::StopWhenKKTResidualLess) +function status_summary(swrr::StopWhenKKTResidualLess; context::Symbol = :default) has_stopped = (swrr.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "‖F(p, λ, μ)‖ < ε:\t$s" + return (_is_inline(context) ? 
"‖F(p, λ, μ)‖ < ε = $(swrr.ε):$(_MANOPT_INDENT)" : "Stop when the KKT residual is less than ε = $(swrr.ε)\n$(_MANOPT_INDENT)") * s end indicates_convergence(::StopWhenKKTResidualLess) = true -function show(io::IO, c::StopWhenKKTResidualLess) - return print(io, "StopWhenKKTResidualLess($(c.ε))\n $(status_summary(c))") +function Base.show(io::IO, c::StopWhenKKTResidualLess) + return print(io, "StopWhenKKTResidualLess($(c.ε))") end # An internal function to compute the new σ diff --git a/src/plans/mesh_adaptive_plan.jl b/src/plans/mesh_adaptive_plan.jl index 733ef0c024..90004a9d13 100644 --- a/src/plans/mesh_adaptive_plan.jl +++ b/src/plans/mesh_adaptive_plan.jl @@ -99,15 +99,8 @@ $(_fields([:retraction_method, :vector_transport_method])) $(_kwargs([:retraction_method, :vector_transport_method, :X])) """ mutable struct LowerTriangularAdaptivePoll{ - P, - T, - F <: Real, - V <: AbstractVector{F}, - M <: AbstractMatrix{F}, - I <: Int, - B, - VTM <: AbstractVectorTransportMethod, - RM <: AbstractRetractionMethod, + P, T, F <: Real, V <: AbstractVector{F}, M <: AbstractMatrix{F}, + I <: Int, B, VTM <: AbstractVectorTransportMethod, RM <: AbstractRetractionMethod, } <: AbstractMeshPollFunction base_point::P candidate::P @@ -120,33 +113,36 @@ mutable struct LowerTriangularAdaptivePoll{ last_poll_improved::Bool retraction_method::RM vector_transport_method::VTM + function LowerTriangularAdaptivePoll(; + base_point::P, candidate::P, poll_counter::I, random_vector::AbstractVector, + random_index::I, mesh::AbstractMatrix, basis::B, X::T, + last_poll_improved::Bool, retraction_method::RM, vector_transport_method::VTM, + ) where { + P, T, I <: Int, B, VTM <: AbstractVectorTransportMethod, RM <: AbstractRetractionMethod, + } + F = promote_type(eltype(random_vector), eltype(mesh)) + rv = convert.(Ref(F), random_vector) + m = convert.(Ref(F), mesh) + return new{P, T, F, typeof(rv), typeof(m), I, B, VTM, RM}( + base_point, candidate, poll_counter, rv, random_index, m, basis, + X, 
last_poll_improved, retraction_method, vector_transport_method + ) + end end - function LowerTriangularAdaptivePoll( - M::AbstractManifold, - p = rand(M); + M::AbstractManifold, p = rand(M); basis::AbstractBasis = default_basis(M, typeof(p)), - retraction_method::AbstractRetractionMethod = default_retraction_method(M), - vector_transport_method::AbstractVectorTransportMethod = default_vector_transport_method( - M - ), + retraction_method::AbstractRetractionMethod = default_retraction_method(M, typeof(p)), + vector_transport_method::AbstractVectorTransportMethod = default_vector_transport_method(M, typeof(p)), X = zero_vector(M, p), ) d = manifold_dimension(M) b_l = zeros(d) D_k = zeros(d, d + 1) - return LowerTriangularAdaptivePoll( - p, - copy(M, p), - 0, - b_l, - 0, - D_k, - basis, - X, - false, - retraction_method, - vector_transport_method, + return LowerTriangularAdaptivePoll(; + base_point = p, candidate = copy(M, p), poll_counter = 0, random_vector = b_l, random_index = 0, + mesh = D_k, basis = basis, X = X, last_poll_improved = false, + retraction_method = retraction_method, vector_transport_method = vector_transport_method, ) end """ @@ -198,14 +194,24 @@ function update_basepoint!(M, ltap::LowerTriangularAdaptivePoll{P}, p::P) where copyto!(M, ltap.candidate, ltap.base_point) return ltap end -function show(io::IO, ltap::LowerTriangularAdaptivePoll) - s = """LowerTriangularAdaptivePoll - with +function Base.show(io::IO, ltap::LowerTriangularAdaptivePoll) + print(io, "LowerTriangularAdaptivePoll(; base_point = ", ltap.base_point, ", candidate = ", ltap.candidate) + print(io, "poll_counter = ", ltap.poll_counter, ", random_vector = ", ltap.random_vector) + print(io, " random_index = ", ltap.random_index, ", mesh = ", ltap.mesh, "basis = ", ltap.basis) + print(io, "last_poll_improved = ", ltap.last_poll_improved, ", retraction_method = ", ltap.retraction_method) + print(io, "vector_transport_method = ", ltap.vector_transport_method) + return print(io, ")") 
+end +function status_summary(ltap::LowerTriangularAdaptivePoll; context::Symbol = :default) + (context === :short) && return repr(ltap) + (context === :inline) && return "A lower triangular adaptive poll using the $(ltap.retraction_method) and $(ltap.vector_transport_method)" + s = """A Lower triangular adaptive poll + + ## Parameters * basis on the tangent space: $(ltap.basis) * retraction_method: $(ltap.retraction_method) - * vector_transport_method: $(ltap.vector_transport_method) - """ - return print(io, s) + * vector_transport_method: $(ltap.vector_transport_method)""" + return s end function (ltap::LowerTriangularAdaptivePoll)( amp::AbstractManoptProblem, @@ -260,11 +266,7 @@ function (ltap::LowerTriangularAdaptivePoll)( i = i + 1 # runs for the last time for i=n+1 and hence the sum. # get vector – scale mesh get_vector!( - M, - ltap.X, - ltap.base_point, - mesh_size * scale_mesh .* ltap.mesh[:, i], - ltap.basis, + M, ltap.X, ltap.base_point, mesh_size * scale_mesh .* ltap.mesh[:, i], ltap.basis, ) # shorten if necessary ltap_X_norm = norm(M, ltap.base_point, ltap.X) @@ -313,21 +315,23 @@ $(_fields(:retraction_method)) $(_kwargs([:retraction_method, :X])) """ -mutable struct DefaultMeshAdaptiveDirectSearch{P, T, RM <: AbstractRetractionMethod} <: - AbstractMeshSearchFunction +mutable struct DefaultMeshAdaptiveDirectSearch{P, T, RM <: AbstractRetractionMethod} <: AbstractMeshSearchFunction p::P q::P X::T last_search_improved::Bool retraction_method::RM + function DefaultMeshAdaptiveDirectSearch(; + p::P, q::P, X::T, last_search_improved::Bool, retraction_method::RM + ) where {P, T, RM <: AbstractRetractionMethod} + return new{P, T, RM}(p, q, X, last_search_improved, retraction_method) + end end function DefaultMeshAdaptiveDirectSearch( - M::AbstractManifold, - p = rand(M); - X = zero_vector(M, p), - retraction_method::AbstractRetractionMethod = default_retraction_method(M), + M::AbstractManifold, p = rand(M); + X = zero_vector(M, p), 
retraction_method::AbstractRetractionMethod = default_retraction_method(M), ) - return DefaultMeshAdaptiveDirectSearch(p, copy(M, p), X, false, retraction_method) + return DefaultMeshAdaptiveDirectSearch(; p = p, q = copy(M, p), X = X, last_search_improved = false, retraction_method = retraction_method) end """ is_successful(dmads::DefaultMeshAdaptiveDirectSearch) @@ -345,20 +349,25 @@ Return the last candidate a [`DefaultMeshAdaptiveDirectSearch`](@ref) found function get_candidate(dmads::DefaultMeshAdaptiveDirectSearch) return dmads.p end -function show(io::IO, dmads::DefaultMeshAdaptiveDirectSearch) - s = """DefaultMeshAdaptiveDirectSearch - with - * retraction_method: $(dmads.retraction_method) - """ - return print(io, s) +function Base.show(io::IO, dmads::DefaultMeshAdaptiveDirectSearch) + print(io, "DefaultMeshAdaptiveDirectSearch(; p = ", dmads.p, ", q = ", dmads.q) + print(io, ", X = ", dmads.X, ", last_search_improved = ", dmads.last_search_improved) + print(io, ", retraction_method = ", dmads.retraction_method) + return print(io, ")") +end +function status_summary(dmads::DefaultMeshAdaptiveDirectSearch; context::Symbol = :default) + (context === :short) && return repr(dmads) + (context === :inline) && return "The default mesh adaptive direct search along a given direction using the $(dmads.retraction_method)" + return """The default mesh adaptive direct search + along one given direction X. + + ## Parameters + * retraction_method: $(dmads.retraction_method) + * last search did $(dmads.last_search_improved ?
"" : "not ")improve the cost""" end function (dmads::DefaultMeshAdaptiveDirectSearch)( - amp::AbstractManoptProblem, - mesh_size::Real, - p, - X; - scale_mesh::Real = 1.0, - max_stepsize::Real = Inf, + amp::AbstractManoptProblem, mesh_size::Real, p, X; + scale_mesh::Real = 1.0, max_stepsize::Real = Inf, ) M = get_manifold(amp) dmads.X .= (4 * mesh_size * scale_mesh) .* X @@ -391,11 +400,7 @@ $(_fields(:stopping_criterion; name = "stop")) """ mutable struct MeshAdaptiveDirectSearchState{ - P, - F <: Real, - PT <: AbstractMeshPollFunction, - ST <: AbstractMeshSearchFunction, - SC <: StoppingCriterion, + P, F <: Real, PT <: AbstractMeshPollFunction, ST <: AbstractMeshSearchFunction, SC <: StoppingCriterion, } <: AbstractManoptSolverState p::P mesh_size::F @@ -405,62 +410,68 @@ mutable struct MeshAdaptiveDirectSearchState{ stop::SC poll::PT search::ST + function MeshAdaptiveDirectSearchState(; + p::P, mesh_size::F, scale_mesh::F, max_stepsize::F, poll_size::F, stopping_criterion::SC, poll::PT, search::ST + ) where {P, F <: Real, PT <: AbstractMeshPollFunction, ST <: AbstractMeshSearchFunction, SC <: StoppingCriterion} + return new{P, F, PT, ST, SC}(p, mesh_size, scale_mesh, max_stepsize, poll_size, stopping_criterion, poll, search) + end end function MeshAdaptiveDirectSearchState( - M::AbstractManifold, - p::P = rand(M); - mesh_basis::B = default_basis(M, typeof(p)), - scale_mesh::F = injectivity_radius(M) / 2, - max_stepsize::F = injectivity_radius(M), + M::AbstractManifold, p::P = rand(M); + mesh_basis::B = default_basis(M, typeof(p)), scale_mesh::Real = injectivity_radius(M) / 2, + max_stepsize::Real = injectivity_radius(M), poll_size::Real = manifold_dimension(M), stopping_criterion::SC = StopAfterIteration(500) | StopWhenPollSizeLess(1.0e-7), retraction_method::AbstractRetractionMethod = default_retraction_method(M, typeof(p)), - vector_transport_method::AbstractVectorTransportMethod = default_vector_transport_method( - M, typeof(p) - ), + 
vector_transport_method::AbstractVectorTransportMethod = default_vector_transport_method(M, typeof(p)), poll::PT = LowerTriangularAdaptivePoll( - M, - copy(M, p); - basis = mesh_basis, - retraction_method = retraction_method, - vector_transport_method = vector_transport_method, + M, copy(M, p); + basis = mesh_basis, retraction_method = retraction_method, vector_transport_method = vector_transport_method, ), search::ST = DefaultMeshAdaptiveDirectSearch( M, copy(M, p); retraction_method = retraction_method ), ) where { - P, - F, - PT <: AbstractMeshPollFunction, - ST <: AbstractMeshSearchFunction, - SC <: StoppingCriterion, - B <: AbstractBasis, + P, PT <: AbstractMeshPollFunction, ST <: AbstractMeshSearchFunction, + SC <: StoppingCriterion, B <: AbstractBasis, } - poll_s = manifold_dimension(M) * 1.0 - return MeshAdaptiveDirectSearchState{P, F, PT, ST, SC}( - p, 1.0, scale_mesh, max_stepsize, poll_s, stopping_criterion, poll, search + R = promote_type(typeof(scale_mesh), typeof(max_stepsize)) + scale_mesh = convert(R, scale_mesh) + max_stepsize = convert(R, max_stepsize) + poll_size = convert(R, poll_size) + return MeshAdaptiveDirectSearchState(; + p = p, mesh_size = one(R), scale_mesh = scale_mesh, max_stepsize = max_stepsize, poll_size = poll_size, + stopping_criterion = stopping_criterion, poll = poll, search = search ) end get_iterate(mads::MeshAdaptiveDirectSearchState) = mads.p - -function show(io::IO, mads::MeshAdaptiveDirectSearchState) +function Base.show(io::IO, mads::MeshAdaptiveDirectSearchState) + print(io, "MeshAdaptiveDirectSearchState(; p = ", mads.p) + print(io, ", mesh_size = ", mads.mesh_size, ", scale_mesh = ", mads.scale_mesh, ", max_stepsize = ", mads.max_stepsize, ", poll_size = ", mads.poll_size) + return print(io, ", stopping_criterion = ", mads.stop, ", poll = ", mads.poll, ", search = ", mads.search, ")") +end +function status_summary(mads::MeshAdaptiveDirectSearchState; context::Symbol = :default) + (context === :short) && return repr(mads)
i = get_count(mads, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(mads.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the mesh adaptive direct search solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" + Conv = indicates_convergence(mads.stop) ? "Yes" : "No" + (context === :inline) && (return "A Mesh adaptive direct search state – $(Iter) $(has_converged(mads) ? "(converged)" : "")") s = """ # Solver state for `Manopt.jl`s mesh adaptive direct search $Iter - ## Parameters * mesh_size: $(mads.mesh_size) * scale_mesh: $(mads.scale_mesh) * max_stepsize: $(mads.max_stepsize) * poll_size: $(mads.poll_size) - * poll:\n $(replace(repr(mads.poll), "\n" => "\n ")[1:(end - 3)]) - * search:\n $(replace(repr(mads.search), "\n" => "\n ")[1:(end - 3)]) + * poll:\n $(_in_str(status_summary(mads.poll; context = context); indent = 1)) + * search:\n $(_in_str(status_summary(mads.search; context = context); indent = 1)) ## Stopping criterion - $(status_summary(mads.stop)) + $(_in_str(status_summary(mads.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv """ - return print(io, s) + return s end get_solver_result(ips::MeshAdaptiveDirectSearchState) = ips.p @@ -484,11 +495,9 @@ function get_reason(c::StopWhenPollSizeLess) end return "" end -function status_summary(c::StopWhenPollSizeLess) +function status_summary(c::StopWhenPollSizeLess; context::Symbol = :default) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "Poll step size s < $(c.threshold):\t$s" -end -function show(io::IO, c::StopWhenPollSizeLess) - return print(io, "StopWhenPollSizeLess($(c.threshold))\n $(status_summary(c))") + return (_is_inline(context) ?
"Poll step size s < $(c.threshold):$(_MANOPT_INDENT)" : "Stop when the poll step size is less than the threshold $(c.threshold)\n$(_MANOPT_INDENT)") * s end +show(io::IO, c::StopWhenPollSizeLess) = print(io, "StopWhenPollSizeLess($(c.threshold))") diff --git a/src/plans/nonlinear_least_squares_plan.jl b/src/plans/nonlinear_least_squares_plan.jl index e1c9cbba54..8c77def7f7 100644 --- a/src/plans/nonlinear_least_squares_plan.jl +++ b/src/plans/nonlinear_least_squares_plan.jl @@ -1,5 +1,5 @@ @doc """ - NonlinearLeastSquaresObjective{E<:AbstractEvaluationType} <: AbstractManifoldObjective{T} + ManifoldNonlinearLeastSquaresObjective{E<:AbstractEvaluationType} <: AbstractManifoldObjective{T} An objective to model the nonlinear least squares problem @@ -13,13 +13,13 @@ Specify a nonlinear least squares problem cost functions ``f_i`` (or a function returning a vector of costs) as well as their gradients ``$(_tex(:grad)) f_i`` (or Jacobian of the vector-valued function). -This `NonlinearLeastSquaresObjective` then has the same [`AbstractEvaluationType`](@ref) `T` +This `ManifoldNonlinearLeastSquaresObjective` then has the same [`AbstractEvaluationType`](@ref) `T` as the (inner) `objective`. # Constructors - NonlinearLeastSquaresObjective(f, jacobian, range_dimension::Integer; kwargs...) - NonlinearLeastSquaresObjective(vf::AbstractVectorGradientFunction) + ManifoldNonlinearLeastSquaresObjective(f, jacobian, range_dimension::Integer; kwargs...) 
+ ManifoldNonlinearLeastSquaresObjective(vf::AbstractVectorGradientFunction) # Arguments @@ -44,13 +44,13 @@ $(_kwargs(:evaluation)) [`LevenbergMarquardt`](@ref), [`LevenbergMarquardtState`](@ref) """ -struct NonlinearLeastSquaresObjective{ +struct ManifoldNonlinearLeastSquaresObjective{ E <: AbstractEvaluationType, F <: AbstractVectorGradientFunction{E}, } <: AbstractManifoldFirstOrderObjective{E, F} objective::F end -function NonlinearLeastSquaresObjective( +function ManifoldNonlinearLeastSquaresObjective( f, jacobian, range_dimension::Integer; @@ -68,13 +68,12 @@ function NonlinearLeastSquaresObjective( jacobian_type = jacobian_type, function_type = function_type, ) - return NonlinearLeastSquaresObjective(vgf; kwargs...) + return ManifoldNonlinearLeastSquaresObjective(vgf; kwargs...) end - # Cost function get_cost( M::AbstractManifold, - nlso::NonlinearLeastSquaresObjective{ + nlso::ManifoldNonlinearLeastSquaresObjective{ E, <:AbstractVectorFunction{E, <:ComponentVectorialType}, }, p; @@ -89,7 +88,7 @@ function get_cost( end function get_cost( M::AbstractManifold, - nlso::NonlinearLeastSquaresObjective{ + nlso::ManifoldNonlinearLeastSquaresObjective{ E, <:AbstractVectorFunction{E, <:FunctionVectorialType}, }, p; @@ -99,7 +98,7 @@ function get_cost( end function get_jacobian( - M::AbstractManifold, nlso::NonlinearLeastSquaresObjective, p; kwargs... + M::AbstractManifold, nlso::ManifoldNonlinearLeastSquaresObjective, p; kwargs... ) J = zeros(length(nlso.objective), manifold_dimension(M)) get_jacobian!(M, J, nlso, p; kwargs...) @@ -107,13 +106,13 @@ function get_jacobian( end # The jacobian is now just a pass-through function get_jacobian!( - M::AbstractManifold, J, nlso::NonlinearLeastSquaresObjective, p; kwargs... + M::AbstractManifold, J, nlso::ManifoldNonlinearLeastSquaresObjective, p; kwargs... ) get_jacobian!(M, J, nlso.objective, p; kwargs...) return J end function get_gradient( - M::AbstractManifold, nlso::NonlinearLeastSquaresObjective, p; kwargs... 
+ M::AbstractManifold, nlso::ManifoldNonlinearLeastSquaresObjective, p; kwargs... ) X = zero_vector(M, p) return get_gradient!(M, X, nlso, p; kwargs...) @@ -121,7 +120,7 @@ end function get_gradient!( M::AbstractManifold, X, - nlso::NonlinearLeastSquaresObjective, + nlso::ManifoldNonlinearLeastSquaresObjective, p; basis = get_basis(nlso.objective.jacobian_type), jacobian_cache = get_jacobian(M, nlso, p; basis = basis), @@ -134,30 +133,30 @@ end # # --- Residuals _doc_get_residuals_nlso = """ - get_residuals(M::AbstractManifold, nlso::NonlinearLeastSquaresObjective, p) - get_residuals!(M::AbstractManifold, V, nlso::NonlinearLeastSquaresObjective, p) + get_residuals(M::AbstractManifold, nlso::ManifoldNonlinearLeastSquaresObjective, p) + get_residuals!(M::AbstractManifold, V, nlso::ManifoldNonlinearLeastSquaresObjective, p) Compute the vector of residuals ``f_i(p)``, ``i=1,…,m`` given the manifold `M`, -the [`NonlinearLeastSquaresObjective`](@ref) `nlso` and a current point ``p`` on `M`. +the [`ManifoldNonlinearLeastSquaresObjective`](@ref) `nlso` and a current point ``p`` on `M`. """ @doc "$(_doc_get_residuals_nlso)" -get_residuals(M::AbstractManifold, nlso::NonlinearLeastSquaresObjective, p; kwargs...) +get_residuals(M::AbstractManifold, nlso::ManifoldNonlinearLeastSquaresObjective, p; kwargs...) function get_residuals( - M::AbstractManifold, nlso::NonlinearLeastSquaresObjective, p; kwargs... + M::AbstractManifold, nlso::ManifoldNonlinearLeastSquaresObjective, p; kwargs... ) V = zeros(length(nlso.objective)) return get_residuals!(M, V, nlso, p; kwargs...) end @doc "$(_doc_get_residuals_nlso)" -get_residuals!(M::AbstractManifold, V, nlso::NonlinearLeastSquaresObjective, p; kwargs...) +get_residuals!(M::AbstractManifold, V, nlso::ManifoldNonlinearLeastSquaresObjective, p; kwargs...) 
function get_residuals!( M::AbstractManifold, V, - nlso::NonlinearLeastSquaresObjective{ + nlso::ManifoldNonlinearLeastSquaresObjective{ E, <:AbstractVectorFunction{E, <:ComponentVectorialType}, }, p; @@ -171,7 +170,7 @@ end function get_residuals!( M::AbstractManifold, V, - nlso::NonlinearLeastSquaresObjective{ + nlso::ManifoldNonlinearLeastSquaresObjective{ E, <:AbstractVectorFunction{E, <:FunctionVectorialType}, }, p, @@ -318,10 +317,11 @@ mutable struct LevenbergMarquardtState{ end end -function show(io::IO, lms::LevenbergMarquardtState) +function status_summary(lms::LevenbergMarquardtState; context::Symbol = :default) i = get_count(lms, :Iterations) Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(lms.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(lms)) – $(Iter) $(has_converged(lms) ? "(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Levenberg Marquardt Algorithm $Iter @@ -333,8 +333,22 @@ function show(io::IO, lms::LevenbergMarquardtState) * retraction method: $(lms.retraction_method) ## Stopping criterion - - $(status_summary(lms.stop)) + $(_in_str(status_summary(lms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s +end + +function status_summary(mnlso::ManifoldNonlinearLeastSquaresObjective; context::Symbol = :default) + (context === :short) && (return repr(mnlso)) + (context === :inline) && (return "A nonlinear least squares objective with the internal vector function given by $(status_summary(mnlso.objective; context = context))") + return """ + A nonlinear least squares objective. 
+ + ## Vectorial objective + $(_in_str(status_summary(mnlso.objective; context = context); indent = 1))""" +end +function Base.show(io::IO, mnlso::ManifoldNonlinearLeastSquaresObjective) + print(io, "ManifoldNonlinearLeastSquaresObjective(") + print(io, mnlso.objective) + return print(io, ")") end diff --git a/src/plans/objective.jl b/src/plans/objective.jl index fee071919f..6f5c77d826 100644 --- a/src/plans/objective.jl +++ b/src/plans/objective.jl @@ -22,6 +22,11 @@ the type `T` indicates the global [`AbstractEvaluationType`](@ref). """ abstract type AbstractManifoldObjective{E <: AbstractEvaluationType} end +function Base.show(io::IO, ::MIME"text/plain", amo::AbstractManifoldObjective) + multiline = get(io, :multiline, true) + return multiline ? status_summary(io, amo) : show(io, amo) +end + @doc """ AbstractDecoratedManifoldObjective{E<:AbstractEvaluationType,O<:AbstractManifoldObjective} @@ -38,6 +43,7 @@ A parameter for a [`AbstractManoptProblem`](@ref) or a `Function` indicating tha the problem contains or the function(s) allocate memory for their result, they work out of place. """ struct AllocatingEvaluation <: AbstractEvaluationType end +_to_kw(::Type{AllocatingEvaluation}) = "evaluation = AllocatingEvaluation()" @doc """ InplaceEvaluation <: AbstractEvaluationType @@ -46,6 +52,7 @@ A parameter for a [`AbstractManoptProblem`](@ref) or a `Function` indicating tha the problem contains or the function(s) do not allocate memory but work on their input, in place. """ struct InplaceEvaluation <: AbstractEvaluationType end +_to_kw(::Type{InplaceEvaluation}) = "evaluation = InplaceEvaluation()" @doc """ ParentEvaluationType <: AbstractEvaluationType @@ -55,6 +62,7 @@ the problem contains or the function(s) do inherit their property from a parent [`AbstractManoptProblem`](@ref) or function. 
""" struct ParentEvaluationType <: AbstractEvaluationType end +_to_kw(::Type{ParentEvaluationType}) = "evaluation = ParentEvaluationType()" @doc """ AllocatingInplaceEvaluation <: AbstractEvaluationType @@ -64,6 +72,7 @@ the problem contains or the function(s) that provides both an allocating variant that does not allocate memory but work on their input, in place. """ struct AllocatingInplaceEvaluation <: AbstractEvaluationType end +_to_kw(::Type{AllocatingInplaceEvaluation}) = "evaluation = AllocatingInplaceEvaluation()" @doc """ ReturnManifoldObjective{E,O2,O1<:AbstractManifoldObjective{E}} <: @@ -92,7 +101,15 @@ function ReturnManifoldObjective( } return ReturnManifoldObjective{E, O2, O1}(o) end - +# The human readable version is “transparent” by default here +function status_summary(o::ReturnManifoldObjective; kwargs...) + return status_summary(o.objective; kwargs...) +end +function Base.show(io::IO, ro::ReturnManifoldObjective) + print(io, "ReturnManifoldObjective(") + show(io, ro.objective) + return print(io, ")") +end # # Internal converters if the variable in the high-level interface is a number. # @@ -209,25 +226,11 @@ function set_parameter!(amo::AbstractManifoldObjective, ::Val{:SubGradient}, arg return amo end -function show(io::IO, o::AbstractManifoldObjective{E}) where {E} - return print(io, "$(nameof(typeof(o))){$E}") -end -# Default: remove decorator for show -function show(io::IO, co::AbstractDecoratedManifoldObjective) - return show(io, get_objective(co, false)) -end -function show(io::IO, t::Tuple{<:AbstractManifoldObjective, P}) where {P} - s = "$(status_summary(t[1]))" - length(s) > 0 && (s = "$(s)\n\n") - return print( - io, "$(s)To access the solver result, call `get_solver_result` on this variable." - ) -end - -function status_summary(::AbstractManifoldObjective{E}) where {E} - return "" +# For decorators the human readable version is “transparent” by default, i.e. 
+# if no special addition is done, it just prints the human readable string from the child +function status_summary(io::IO, co::AbstractDecoratedManifoldObjective; context::Symbol = :default) + return status_summary(io, get_objective(co, false); context = context) end -# Default: remove decorator for status summary -function status_summary(co::AbstractDecoratedManifoldObjective) +function status_summary(co::AbstractDecoratedManifoldObjective; context::Symbol = :default) return status_summary(get_objective(co, false)) end diff --git a/src/plans/plan.jl b/src/plans/plan.jl index 76d32e8de7..b0438faeb8 100644 --- a/src/plans/plan.jl +++ b/src/plans/plan.jl @@ -1,14 +1,60 @@ +function status_summary end + +_doc_status_summary = """ + status_summary(io, e; context::Symbol = :default) + status_summary(e; context::Symbol = :default) + +Return a string reporting the current status of an element `e` defined in `Manopt.jl`, +which can also directly be printed to an `IO` stream `io`. +This method should generate a human readable summary of `e`. + +By default, the variant with an `IO` stream dispatches to the one without to generate +a string and prints it to the `IO` stream. +If you implement the variant with the stream `io`, remember to also provide the one without. +Similarly, the + +The summary is meant to be used in different contexts: +* `:default` should be the default and refers to a (multiline) context in the REPL where a + human should read a comprehensive summary of `e`. + This should also be the default +* `:inline` should be a shorter variant that can be used inline in other summaries, e.g. in lists +* `:short` should be a form even shorter than or equal to `:inline`, for example when in a list, + a certain element, like a [`DebugAction`](@ref), can be represented by a symbol. + The short variant should by default fall back to `:inline` """ - status_summary(e) -Return a string reporting about the current status of `e`, -where `e` is a type from Manopt.
- -This method is similar to `show` but just returns a string. -It might also be more verbose in explaining, or hide internal information. -""" -status_summary(e) = "$(e)" +@doc "$(_doc_status_summary)" +status_summary(e; context::Symbol = :default) +@doc "$(_doc_status_summary)" +function status_summary(io::IO, e; context::Symbol = :default) + return print(io, status_summary(e; context = context)) +end +# +# +# status_summary string format helper +# --- +# check whether a context is inline or less +_is_inline(c) = (c == :inline || c == :short) +# _in_str - indent a string for use within another one +# * `indent = 0` number of times to raise the indentation by `indent_str` (`_MANOPT_INDENT` by default) +# * `headers = 1` number of levels to increase headers by, also for headers that are indented with `indent_str` +# * `indent_str = _MANOPT_INDENT` string to use for indent +# * `indent_end = ""` a string to end the indentation, for example a `"| "` for visual distinction +function _in_str(s::String; indent = 0, headers = 1, indent_str = _MANOPT_INDENT, indent_end = "") + t = s + #add start + t = replace("$(indent_end)$t", "\n" => "\n$(indent_end)") + #add indent iteratively + for _ in 1:indent + t = replace("$(indent_str)$t", "\n" => "\n$(indent_str)") + end + # increase headers iteratively + for _ in 1:headers + t = replace(t, Regex("(?m)^($(indent_str)*)(#+)") => s"\1#\2") + end + return t +end """ set_parameter!(f, element::Symbol , args...)
@@ -162,6 +208,8 @@ include("higher_order_primal_dual_plan.jl") include("stochastic_gradient_plan.jl") +include("box_plan.jl") + include("embedded_objective.jl") include("scaled_objective.jl") diff --git a/src/plans/primal_dual_plan.jl b/src/plans/primal_dual_plan.jl index 772d085120..b379c7b333 100644 --- a/src/plans/primal_dual_plan.jl +++ b/src/plans/primal_dual_plan.jl @@ -19,6 +19,25 @@ _get_manifold(tmp::TwoManifoldProblem, ::Val{2}) = tmp.second_manifold get_objective(tmo::TwoManifoldProblem) = tmo.objective +function show(io::IO, tmp::TwoManifoldProblem) + print(io, "TwoManifoldProblem("); show(io, tmp.first_manifold) + print(io, ", "); show(io, tmp.second_manifold) + print(io, ", "); show(io, tmp.objective) + return print(io, ")") +end +function status_summary(tmp::TwoManifoldProblem; context::Symbol = :default) + _is_inline(context) && return "An optimization problem to minimize $(tmp.objective) using a primal manifold $(tmp.first_manifold) and a dual manifold $(tmp.second_manifold)." 
+ return """ + An optimization problem for Manopt.jl requiring a primal and a dual manifold + + ## Manifolds + * $(_in_str(repr(tmp.first_manifold); indent = 1)) + * $(_in_str(repr(tmp.second_manifold); indent = 1)) + + ## Objective + $(_in_str(status_summary(tmp.objective, context = context); indent = 1))""" +end + @doc """ AbstractPrimalDualManifoldObjective{E<:AbstractEvaluationType,C,P} <: AbstractManifoldCostObjective{E,C} @@ -70,29 +89,18 @@ mutable struct PrimalDualManifoldObjective{ Λ!!::L end function PrimalDualManifoldObjective( - cost, - prox_f, - prox_g_dual, - adjoint_linearized_operator; + cost::C, + prox_f::F, + prox_g_dual::G, + adjoint_linearized_operator::A; linearized_forward_operator::Union{Function, Missing} = missing, Λ::Union{Function, Missing} = missing, - evaluation::AbstractEvaluationType = AllocatingEvaluation(), - ) + evaluation::E = AllocatingEvaluation(), + ) where {E <: AbstractEvaluationType, C, F, G, A} return PrimalDualManifoldObjective{ - typeof(evaluation), - typeof(cost), - typeof(prox_f), - typeof(prox_g_dual), - typeof(linearized_forward_operator), - typeof(adjoint_linearized_operator), - typeof(Λ), + E, C, F, G, typeof(linearized_forward_operator), A, typeof(Λ), }( - cost, - prox_f, - prox_g_dual, - linearized_forward_operator, - adjoint_linearized_operator, - Λ, + cost, prox_f, prox_g_dual, linearized_forward_operator, adjoint_linearized_operator, Λ, ) end @@ -119,10 +127,7 @@ function get_primal_prox!(tmp::TwoManifoldProblem, q, σ, p) end function get_primal_prox( - M::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - σ, - p, + M::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, σ, p, ) return apdmo.prox_f!!(M, σ, p) end @@ -139,21 +144,13 @@ function get_primal_prox( end function get_primal_prox!( - M::AbstractManifold, - q, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - σ, - p, + M::AbstractManifold, q, 
apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, σ, p, ) copyto!(M, q, apdmo.prox_f!!(M, σ, p)) return q end function get_primal_prox!( - M::AbstractManifold, - q, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - σ, - p, + M::AbstractManifold, q, apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, σ, p, ) apdmo.prox_f!!(M, q, σ, p) return q @@ -187,20 +184,12 @@ function get_dual_prox!(tmp::TwoManifoldProblem, Y, n, τ, X) end function get_dual_prox( - M::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - n, - τ, - X, + M::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, n, τ, X, ) return apdmo.prox_g_dual!!(M, n, τ, X) end function get_dual_prox( - M::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - n, - τ, - X, + M::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, n, τ, X, ) Y = allocate_result(M, get_dual_prox, X) apdmo.prox_g_dual!!(M, Y, n, τ, X) @@ -213,23 +202,13 @@ function get_dual_prox( end function get_dual_prox!( - M::AbstractManifold, - Y, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - n, - τ, - X, + M::AbstractManifold, Y, apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, n, τ, X, ) copyto!(M, Y, apdmo.prox_g_dual!!(M, n, τ, X)) return Y end function get_dual_prox!( - M::AbstractManifold, - Y, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - n, - τ, - X, + M::AbstractManifold, Y, apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, n, τ, X, ) apdmo.prox_g_dual!!(M, Y, n, τ, X) return Y @@ -264,70 +243,42 @@ function linearized_forward_operator!(tmp::TwoManifoldProblem, Y, m, X, n) end function linearized_forward_operator( - M::AbstractManifold, - ::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - m, - X, - ::Any, + M::AbstractManifold, ::AbstractManifold, 
apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, m, X, ::Any, ) return apdmo.linearized_forward_operator!!(M, m, X) end function linearized_forward_operator( - M::AbstractManifold, - N::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - m, - X, - n, + M::AbstractManifold, N::AbstractManifold, + apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, m, X, n, ) Y = zero_vector(N, n) apdmo.linearized_forward_operator!!(M, Y, m, X) return Y end function linearized_forward_operator( - M::AbstractManifold, - N::AbstractManifold, - admo::AbstractDecoratedManifoldObjective, - m, - X, - n, + M::AbstractManifold, N::AbstractManifold, + admo::AbstractDecoratedManifoldObjective, m, X, n, ) return linearized_forward_operator(M, N, get_objective(admo, false), m, X, n) end function linearized_forward_operator!( - M::AbstractManifold, - N::AbstractManifold, - Y, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - m, - X, - n, + M::AbstractManifold, N::AbstractManifold, + Y, apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, m, X, n, ) copyto!(N, Y, n, apdmo.linearized_forward_operator!!(M, m, X)) return Y end function linearized_forward_operator!( - M::AbstractManifold, - ::AbstractManifold, - Y, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - m, - X, - ::Any, + M::AbstractManifold, ::AbstractManifold, + Y, apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, m, X, ::Any, ) apdmo.linearized_forward_operator!!(M, Y, m, X) return Y end function linearized_forward_operator!( - M::AbstractManifold, - N::AbstractManifold, - Y, - admo::AbstractDecoratedManifoldObjective, - m, - X, - n, + M::AbstractManifold, N::AbstractManifold, + Y, admo::AbstractDecoratedManifoldObjective, m, X, n, ) return linearized_forward_operator!(M, N, Y, get_objective(admo, false), m, X, n) end @@ -353,18 +304,14 @@ function forward_operator!(tmp::TwoManifoldProblem, q, p) end function 
forward_operator( - M::AbstractManifold, - ::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, ::AbstractManifold, + apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, p, ) return apdmo.Λ!!(M, p) end function forward_operator( - M::AbstractManifold, - N::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - p, + M::AbstractManifold, N::AbstractManifold, + apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, p, ) q = rand(N) apdmo.Λ!!(M, q, p) @@ -377,21 +324,15 @@ function forward_operator( end function forward_operator!( - M::AbstractManifold, - N::AbstractManifold, - q, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - p, + M::AbstractManifold, N::AbstractManifold, + q, apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, p, ) copyto!(N, q, apdmo.Λ!!(M, p)) return q end function forward_operator!( - M::AbstractManifold, - ::AbstractManifold, - q, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - p, + M::AbstractManifold, ::AbstractManifold, + q, apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, p, ) apdmo.Λ!!(M, q, p) return q @@ -426,74 +367,70 @@ function adjoint_linearized_operator!(tmp::TwoManifoldProblem, X, m, n, Y) ) end function adjoint_linearized_operator( - ::AbstractManifold, - N::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - m, - n, - Y, + ::AbstractManifold, N::AbstractManifold, + apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, m, n, Y, ) return apdmo.adjoint_linearized_operator!!(N, m, n, Y) end function adjoint_linearized_operator( - M::AbstractManifold, - N::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - m, - n, - Y, + M::AbstractManifold, N::AbstractManifold, + apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, m, n, Y, ) X = zero_vector(M, m) 
apdmo.adjoint_linearized_operator!!(N, X, m, n, Y) return X end function adjoint_linearized_operator( - M::AbstractManifold, - N::AbstractManifold, - admo::AbstractDecoratedManifoldObjective, - m, - n, - Y, + M::AbstractManifold, N::AbstractManifold, + admo::AbstractDecoratedManifoldObjective, m, n, Y, ) return adjoint_linearized_operator(M, N, get_objective(admo, false), m, n, Y) end function adjoint_linearized_operator!( - M::AbstractManifold, - N::AbstractManifold, - X, - apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, - m, - n, - Y, + M::AbstractManifold, N::AbstractManifold, + X, apdmo::AbstractPrimalDualManifoldObjective{AllocatingEvaluation}, m, n, Y, ) copyto!(M, X, apdmo.adjoint_linearized_operator!!(N, m, n, Y)) return X end function adjoint_linearized_operator!( - ::AbstractManifold, - N::AbstractManifold, - X, - apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, - m, - n, - Y, + ::AbstractManifold, N::AbstractManifold, + X, apdmo::AbstractPrimalDualManifoldObjective{InplaceEvaluation}, m, n, Y, ) apdmo.adjoint_linearized_operator!!(N, X, m, n, Y) return X end function adjoint_linearized_operator!( - M::AbstractManifold, - N::AbstractManifold, - X, - admo::AbstractDecoratedManifoldObjective, - m, - n, - Y, + M::AbstractManifold, N::AbstractManifold, + X, admo::AbstractDecoratedManifoldObjective, m, n, Y, ) return adjoint_linearized_operator!(M, N, X, get_objective(admo, false), m, n, Y) end +function status_summary(pdmo::PrimalDualManifoldObjective; context::Symbol = :default) + both_missing = ismissing(pdmo.Λ!!) && ismissing(pdmo.linearized_forward_operator!!) + _is_inline(context) && (return "A primal dual objective with a cost of f+g, a prox for f, a prox for the dual of g, as well as $(!ismissing(pdmo.Λ!!) ? "an operator Λ," : "") $(!ismissing(pdmo.linearized_forward_operator!!) ? "DΛ, " : "")$(!both_missing ? "and " : "")an adjoint D^*Λ") + + maybe_line1 = ismissing(pdmo.Λ!!) ?
"" : "\n* Λ: $(pdmo.Λ!!)" + maybe_line2 = ismissing(pdmo.linearized_forward_operator!!) ? "" : "\n* DΛ: $(pdmo.linearized_forward_operator!!)" + return """ + A primal dual objective with + + * cost: $(pdmo.cost) + * prox_f: $(pdmo.prox_f!!) + * prox_g*: $(pdmo.prox_g_dual!!) + * D^*Λ: $(pdmo.adjoint_linearized_operator!!)$(maybe_line1)$(maybe_line2)""" +end +function show(io::IO, pdmo::PrimalDualManifoldObjective{E}) where {E} + print(io, "PrimalDualManifoldObjective(", pdmo.cost, ", ", pdmo.prox_f!!, ", ") + print(io, pdmo.prox_g_dual!!, ", ", pdmo.adjoint_linearized_operator!!, "; ") + print(io, _to_kw(E)) + !ismissing(pdmo.Λ!!) && (print(io, ", Λ = ", pdmo.Λ!!)) + !ismissing(pdmo.linearized_forward_operator!!) && (print(io, ", linearized_forward_operator = ", pdmo.linearized_forward_operator!!)) + return print(io, ")") +end + @doc """ AbstractPrimalDualSolverState @@ -522,23 +459,12 @@ function primal_residual( tmp::TwoManifoldProblem, apds::AbstractPrimalDualSolverState, p_old, X_old, n_old ) return primal_residual( - get_manifold(tmp, 1), - get_manifold(tmp, 2), - get_objective(tmp), - apds, - p_old, - X_old, - n_old, + get_manifold(tmp, 1), get_manifold(tmp, 2), get_objective(tmp), apds, p_old, X_old, n_old, ) end function primal_residual( - M::AbstractManifold, - N::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective, - apds::AbstractPrimalDualSolverState, - p_old, - X_old, - n_old, + M::AbstractManifold, N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, + apds::AbstractPrimalDualSolverState, p_old, X_old, n_old, ) return norm( M, @@ -546,20 +472,12 @@ function primal_residual( 1 / apds.primal_stepsize * inverse_retract(M, apds.p, p_old, apds.inverse_retraction_method) - vector_transport_to( - M, - apds.m, + M, apds.m, adjoint_linearized_operator( - M, - N, - apdmo, - apds.m, - apds.n, - vector_transport_to( - N, n_old, X_old, apds.n, apds.vector_transport_method_dual - ) - apds.X, + M, N, apdmo, apds.m, apds.n, + 
vector_transport_to(N, n_old, X_old, apds.n, apds.vector_transport_method_dual) - apds.X, ), - apds.p, - apds.vector_transport_method, + apds.p, apds.vector_transport_method, ), ) end @@ -597,24 +515,13 @@ function dual_residual( tmp::TwoManifoldProblem, apds::AbstractPrimalDualSolverState, p_old, X_old, n_old ) return dual_residual( - get_manifold(tmp, 1), - get_manifold(tmp, 2), - get_objective(tmp), - apds, - p_old, - X_old, - n_old, + get_manifold(tmp, 1), get_manifold(tmp, 2), get_objective(tmp), apds, p_old, X_old, n_old, ) end function dual_residual( - M::AbstractManifold, - N::AbstractManifold, - apdmo::AbstractPrimalDualManifoldObjective, - apds::AbstractPrimalDualSolverState, - p_old, - X_old, - n_old, + M::AbstractManifold, N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, + apds::AbstractPrimalDualSolverState, p_old, X_old, n_old, ) if apds.variant === :linearized return norm( @@ -625,16 +532,11 @@ function dual_residual( N, n_old, X_old, apds.n, apds.vector_transport_method_dual ) - apds.X ) - linearized_forward_operator( - M, - N, - apdmo, - apds.m, + M, N, apdmo, apds.m, vector_transport_to( - M, - apds.p, + M, apds.p, inverse_retract(M, apds.p, p_old, apds.inverse_retraction_method), - apds.m, - apds.vector_transport_method, + apds.m, apds.vector_transport_method, ), apds.n, ), @@ -648,23 +550,17 @@ function dual_residual( N, n_old, X_old, apds.n, apds.vector_transport_method_dual ) - apds.n ) - inverse_retract( - N, - apds.n, + N, apds.n, forward_operator( - M, - N, - apdmo, + M, N, apdmo, retract( - M, - apds.m, + M, apds.m, vector_transport_to( - M, - apds.p, + M, apds.p, inverse_retract( M, apds.p, p_old, apds.inverse_retraction_method ), - apds.m, - apds.vector_transport_method, + apds.m, apds.vector_transport_method, ), apds.retraction_method, ), @@ -675,8 +571,7 @@ function dual_residual( else throw( DomainError( - apds.variant, - "Unknown Chambolle—Pock variant, allowed are `:exact` or `:linearized`.", + apds.variant, "Unknown 
Chambolle—Pock variant, allowed are `:exact` or `:linearized`.", ), ) end @@ -705,25 +600,22 @@ mutable struct DebugDualResidual <: DebugAction io::IO format::String storage::StoreStateAction + at_init::Bool function DebugDualResidual(; storage::StoreStateAction = StoreStateAction([:Iterate, :X, :n]), - io::IO = stdout, - prefix = "Dual Residual: ", - format = "$prefix%s", + io::IO = stdout, prefix = "Dual Residual: ", format = "$prefix%s", at_init::Bool = false, ) - return new(io, format, storage) + return new(io, format, storage, at_init) end function DebugDualResidual( initial_values::Tuple{P, T, Q}; storage::StoreStateAction = StoreStateAction([:Iterate, :X, :n]), - io::IO = stdout, - prefix = "Dual Residual: ", - format = "$prefix%s", + io::IO = stdout, prefix = "Dual Residual: ", format = "$prefix%s", at_init::Bool = false, ) where {P, T, Q} update_storage!( storage, Dict(k => v for (k, v) in zip((:Iterate, :X, :n), initial_values)) ) - return new(io, format, storage) + return new(io, format, storage, at_init) end end function (d::DebugDualResidual)( @@ -732,19 +624,25 @@ function (d::DebugDualResidual)( M = get_manifold(tmp, 1) N = get_manifold(tmp, 2) apdmo = get_objective(tmp) - if all(has_storage.(Ref(d.storage), [:Iterate, :X, :n])) && k > 0 # all values stored + if all(has_storage.(Ref(d.storage), [:Iterate, :X, :n])) && (k >= (d.at_init ? 
0 : 1)) # all values stored #fetch p_old = get_storage(d.storage, :Iterate) X_old = get_storage(d.storage, :X) n_old = get_storage(d.storage, :n) Printf.format( - d.io, - Printf.Format(d.format), + d.io, Printf.Format(d.format), dual_residual(M, N, apdmo, apds, p_old, X_old, n_old), ) end return d.storage(tmp, apds, k) end +function show(io::IO, d::DebugDualResidual) + return print(io, "DebugDualResidual(; io = ", d.io, ", format=\"$(escape_string(d.format))\", at_init=$(d.at_init))") +end +function status_summary(d::DebugDualResidual; context::Symbol = :default) + (context === :short) && return repr(d) + return "A DebugAction to print the dual residual with format \"$(escape_string(d.format))\"" +end @doc """ DebugPrimalResidual <: DebugAction @@ -767,23 +665,20 @@ mutable struct DebugPrimalResidual <: DebugAction io::IO format::String storage::StoreStateAction + at_init::Bool function DebugPrimalResidual(; storage::StoreStateAction = StoreStateAction([:Iterate, :X, :n]), - io::IO = stdout, - prefix = "Primal Residual: ", - format = "$prefix%s", + io::IO = stdout, prefix = "Primal Residual: ", format = "$prefix%s", at_init::Bool = false ) - return new(io, format, storage) + return new(io, format, storage, at_init) end function DebugPrimalResidual( values::Tuple{P, T, Q}; storage::StoreStateAction = StoreStateAction([:Iterate, :X, :n]), - io::IO = stdout, - prefix = "Primal Residual: ", - format = "$prefix%s", + io::IO = stdout, prefix = "Primal Residual: ", format = "$prefix%s", at_init::Bool = false, ) where {P, T, Q} update_storage!(storage, Dict(k => v for (k, v) in zip((:Iterate, :X, :n), values))) - return new(io, format, storage) + return new(io, format, storage, at_init) end end function (d::DebugPrimalResidual)( @@ -792,19 +687,25 @@ function (d::DebugPrimalResidual)( M = get_manifold(tmp, 1) N = get_manifold(tmp, 2) apdmo = get_objective(tmp) - if all(has_storage.(Ref(d.storage), [:Iterate, :X, :n])) && k > 0 # all values stored + if 
all(has_storage.(Ref(d.storage), [:Iterate, :X, :n])) && (k >= (d.at_init ? 0 : 1)) # all values stored #fetch p_old = get_storage(d.storage, :Iterate) X_old = get_storage(d.storage, :X) n_old = get_storage(d.storage, :n) Printf.format( - d.io, - Printf.Format(d.format), + d.io, Printf.Format(d.format), primal_residual(M, N, apdmo, apds, p_old, X_old, n_old), ) end return d.storage(tmp, apds, k) end +function show(io::IO, d::DebugPrimalResidual) + return print(io, "DebugPrimalResidual(; io = ", d.io, ", format=\"$(escape_string(d.format))\", at_init=$(d.at_init))") +end +function status_summary(d::DebugPrimalResidual; context::Symbol = :default) + (context === :short) && return repr(d) + return "A DebugAction to print the primal residual with format \"$(escape_string(d.format))\"" +end @doc """ DebugPrimalDualResidual <: DebugAction @@ -829,23 +730,20 @@ mutable struct DebugPrimalDualResidual <: DebugAction io::IO format::String storage::StoreStateAction + at_init::Bool function DebugPrimalDualResidual(; storage::StoreStateAction = StoreStateAction([:Iterate, :X, :n]), - io::IO = stdout, - prefix = "PD Residual: ", - format = "$prefix%s", + io::IO = stdout, prefix = "PD Residual: ", format = "$prefix%s", at_init::Bool = false, ) - return new(io, format, storage) + return new(io, format, storage, at_init) end function DebugPrimalDualResidual( values::Tuple{P, T, Q}; storage::StoreStateAction = StoreStateAction([:Iterate, :X, :n]), - io::IO = stdout, - prefix = "PD Residual: ", - format = "$prefix%s", + io::IO = stdout, prefix = "PD Residual: ", format = "$prefix%s", at_init::Bool = false, ) where {P, Q, T} update_storage!(storage, Dict(k => v for (k, v) in zip((:Iterate, :X, :n), values))) - return new(io, format, storage) + return new(io, format, storage, at_init) end end function (d::DebugPrimalDualResidual)( @@ -854,19 +752,23 @@ function (d::DebugPrimalDualResidual)( M = get_manifold(tmp, 1) N = get_manifold(tmp, 2) apdmo = get_objective(tmp) - if 
all(has_storage.(Ref(d.storage), [:Iterate, :X, :n])) && k > 0 # all values stored + if all(has_storage.(Ref(d.storage), [:Iterate, :X, :n])) && (k >= (d.at_init ? 0 : 1)) # all values stored #fetch p_old = get_storage(d.storage, :Iterate) X_old = get_storage(d.storage, :X) n_old = get_storage(d.storage, :n) - v = - primal_residual(M, N, apdmo, apds, p_old, X_old, n_old) + - dual_residual(tmp, apds, p_old, X_old, n_old) + v = primal_residual(M, N, apdmo, apds, p_old, X_old, n_old) + dual_residual(tmp, apds, p_old, X_old, n_old) Printf.format(d.io, Printf.Format(d.format), v / manifold_dimension(M)) end return d.storage(tmp, apds, k) end - +function show(io::IO, d::DebugPrimalDualResidual) + return print(io, "DebugPrimalDualResidual(; io = ", d.io, ", format=\"$(escape_string(d.format))\", at_init=$(d.at_init))") +end +function status_summary(d::DebugPrimalDualResidual; context::Symbol = :default) + (context === :short) && return repr(d) + return "A DebugAction to print the primal dual residual with format \"$(escape_string(d.format))\"" +end # # Debugs # @@ -877,9 +779,7 @@ Print the change of the primal variable by using [`DebugChange`](@ref), see their constructors for detail. """ function DebugPrimalChange(; - storage::StoreStateAction = StoreStateAction([:Iterate]), - prefix = "Primal Change: ", - kwargs..., + storage::StoreStateAction = StoreStateAction([:Iterate]), prefix = "Primal Change: ", kwargs..., ) return DebugChange(; storage = storage, prefix = prefix, kwargs...) 
end @@ -912,20 +812,17 @@ mutable struct DebugDualChange <: DebugAction io::IO format::String storage::StoreStateAction + at_init::Bool function DebugDualChange(; storage::StoreStateAction = StoreStateAction([:X, :n]), - io::IO = stdout, - prefix = "Dual Change: ", - format = "$prefix%s", + io::IO = stdout, prefix = "Dual Change: ", format = "$prefix%s", at_init::Bool = false, ) return new(io, format, storage) end function DebugDualChange( values::Tuple{T, P}; storage::StoreStateAction = StoreStateAction([:X, :n]), - io::IO = stdout, - prefix = "Dual Change: ", - format = "$prefix%s", + io::IO = stdout, prefix = "Dual Change: ", format = "$prefix%s", ) where {P, T} update_storage!( storage, Dict{Symbol, Any}(k => v for (k, v) in zip((:X, :n), values)) @@ -942,16 +839,22 @@ function (d::DebugDualChange)( X_old = get_storage(d.storage, :X) n_old = get_storage(d.storage, :n) v = norm( - N, - apds.n, + N, apds.n, vector_transport_to( N, n_old, X_old, apds.n, apds.vector_transport_method_dual ) - apds.X, ) - Printf.format(d.io, Printf.Format(d.format), v) + (k >= (d.at_init ? 0 : 1)) && Printf.format(d.io, Printf.Format(d.format), v) end return d.storage(tmp, apds, k) end +function show(io::IO, ddc::DebugDualChange) + return print(io, "DebugDualChange(; io = ", ddc.io, ", format =\"$(escape_string(ddc.format))\", at_init=$(ddc.at_init))") +end +function status_summary(ddc::DebugDualChange; context::Symbol = :default) + (context === :short) && return repr(ddc) + return "A DebugAction to print the change of the dual variable with format \"$(escape_string(ddc.format))\"" +end """ DebugDualBaseIterate(io::IO=stdout) @@ -972,12 +875,8 @@ function DebugDualBaseChange(; storage::StoreStateAction = StoreStateAction([:n]), prefix = "Dual Base Change:", kwargs... 
) return DebugEntryChange( - :n, - (p, o, x, y) -> - distance(get_manifold(p, 2), x, y, o.inverse_retraction_method_dual); - storage = storage, - prefix = prefix, - kwargs..., + :n, (p, o, x, y) -> distance(get_manifold(p, 2), x, y, o.inverse_retraction_method_dual); + storage = storage, prefix = prefix, kwargs..., ) end @@ -998,11 +897,8 @@ see their constructors for detail, on `o.n`. """ function DebugPrimalBaseChange(opts...; prefix = "Primal Base Change:", kwargs...) return DebugEntryChange( - :m, - (p, o, x, y) -> distance(get_manifold(p, 1), x, y), - opts...; - prefix = prefix, - kwargs..., + :m, (p, o, x, y) -> distance(get_manifold(p, 1), x, y), + opts...; prefix = prefix, kwargs..., ) end @@ -1070,6 +966,7 @@ Create an [`RecordAction`](@ref) that records the primal base point, an [`RecordEntry`](@ref) of `o.m`. """ RecordPrimalBaseIterate(m) = RecordEntry(m, :m) + """ RecordPrimalBaseChange() diff --git a/src/plans/problem.jl b/src/plans/problem.jl index 8c521615cc..dc4758d5af 100644 --- a/src/plans/problem.jl +++ b/src/plans/problem.jl @@ -17,6 +17,11 @@ Usually the cost should be within an [`AbstractManifoldObjective`](@ref). """ abstract type AbstractManoptProblem{M <: AbstractManifold} end +function Base.show(io::IO, ::MIME"text/plain", amp::AbstractManoptProblem) + multiline = get(io, :multiline, true) + return multiline ? 
status_summary(io, amp) : show(io, amp) +end + @doc """ DefaultManoptProblem{TM <: AbstractManifold, Objective <: AbstractManifoldObjective} @@ -29,6 +34,26 @@ struct DefaultManoptProblem{TM <: AbstractManifold, O <: AbstractManifoldObjecti objective::O end +function show(io::IO, dmp::DefaultManoptProblem) + print(io, "DefaultManoptProblem(") + show(io, dmp.manifold) + print(io, ", ") + show(io, dmp.objective) + return print(io, ")") +end + +function status_summary(dmp::DefaultManoptProblem; context::Symbol = :default) + _is_inline(context) && return "An optimization problem to minimize $(dmp.objective) on the manifold $(dmp.manifold)" + return """ + An optimization problem for Manopt.jl + + ## Manifold + $(_MANOPT_INDENT)$(replace(repr(dmp.manifold), "\n#" => "\n$(_MANOPT_INDENT)##", "\n" => "\n$(_MANOPT_INDENT)")) + + ## Objective + $(_in_str(status_summary(dmp.objective, context = context); indent = 1))""" +end + """ evaluation_type(mp::AbstractManoptProblem) diff --git a/src/plans/proximal_gradient_plan.jl b/src/plans/proximal_gradient_plan.jl index 4d3da05f8f..7e5848dafc 100644 --- a/src/plans/proximal_gradient_plan.jl +++ b/src/plans/proximal_gradient_plan.jl @@ -30,8 +30,7 @@ Generate the proximal gradient objective given the total cost ``f = g + h``, smo * `evaluation=`[`AllocatingEvaluation`](@ref): whether the gradient and proximal map is given as an allocation function or an in-place ([`InplaceEvaluation`](@ref)). 
""" -struct ManifoldProximalGradientObjective{E <: AbstractEvaluationType, TC, TG, TGG, TP} <: - AbstractManifoldCostObjective{E, TC} +struct ManifoldProximalGradientObjective{E <: AbstractEvaluationType, TC, TG, TGG, TP} <: AbstractManifoldCostObjective{E, TC} cost::TC # f = g + h cost_smooth::TG # smooth part gradient_g!!::TGG @@ -77,6 +76,26 @@ function get_gradient!( return X end +function Base.show(io::IO, mpgo::ManifoldProximalGradientObjective{E}) where {E} + print(io, "ManifoldProximalGradientObjective(", mpgo.cost, ", ", mpgo.cost_smooth, ", ") + print(io, mpgo.gradient_g!!, ", ", mpgo.proximal_map_h!!, "; ", _to_kw(E)) + return print(io, ")") +end + +function status_summary(mpgo::ManifoldProximalGradientObjective{E}; context::Symbol = :default) where {E} + (context === :short) && return repr(mpgo) + s = "A proximal gradient objective `f = g + h`, where `g` is smooth and `h` is possibly nonsmooth." + (context === :inline) && (return s) + e = (E === AllocatingEvaluation ? " (allocating)" : " (in-place)") + return """ + $s + + # Components + * `f`: $(mpgo.cost) + * `g`: $(mpgo.cost_smooth) + * `gradient_g`: $(mpgo.gradient_g!!)$e + * `prox_h`: $(mpgo.proximal_map_h!!)$e""" +end """ get_cost_smooth(M::AbstractManifold, objective, p) @@ -102,11 +121,7 @@ function get_proximal_map( end function get_proximal_map!( - M::AbstractManifold, - q, - mpgo::ManifoldProximalGradientObjective{AllocatingEvaluation}, - λ, - p, + M::AbstractManifold, q, mpgo::ManifoldProximalGradientObjective{AllocatingEvaluation}, λ, p, ) copyto!(M, q, mpgo.proximal_map_h!!(M, λ, p)) return q @@ -272,31 +287,36 @@ $(_kwargs(:sub_state; default = _glossary[:Variable][:evaluation][:default])) $(_kwargs(:X; add_properties = [:as_Memory])) """ mutable struct ProximalGradientMethodState{ - P, - T, - Pr <: Union{<:AbstractManoptProblem, F, Nothing} where {F}, - St <: Union{<:AbstractManoptSolverState, Nothing}, - A, - S <: StoppingCriterion, - TStepsize <: Stepsize, - RM <: 
AbstractRetractionMethod, - IRM <: AbstractInverseRetractionMethod, - R, + P, T, Pr <: Union{<:AbstractManoptProblem, F, Nothing} where {F}, St <: Union{<:AbstractManoptSolverState, Nothing}, + A, SC <: StoppingCriterion, S <: Stepsize, RM <: AbstractRetractionMethod, IRM <: AbstractInverseRetractionMethod, R, } <: AbstractManoptSolverState a::P acceleration::A - stepsize::TStepsize + stepsize::S last_stepsize::R p::P q::P - stop::S + stop::SC X::T retraction_method::RM inverse_retraction_method::IRM sub_problem::Pr sub_state::St + function ProximalGradientMethodState( + sub_problem::Pr, sub_state::St; + a::P, acceleration::A, stepsize::S, last_stepsize::R, p::P, q::P, + stopping_criterion::SC, X::T, retraction_method::RM, inverse_retraction_method::IRM, + ) where { + P, T, Pr <: Union{<:AbstractManoptProblem, F, Nothing} where {F}, St <: Union{<:AbstractManoptSolverState, Nothing}, + A, SC <: StoppingCriterion, S <: Stepsize, RM <: AbstractRetractionMethod, IRM <: AbstractInverseRetractionMethod, R, + } + return new{P, T, Pr, St, A, SC, S, RM, IRM, R}( + a, acceleration, stepsize, last_stepsize, p, q, stopping_criterion, X, + retraction_method, inverse_retraction_method, sub_problem, sub_state + ) + end end - +ProximalGradientMethodState(M::AbstractManifold, st::AbstractManoptSolverState; kwargs...) 
= error("Proximal Gradient Method state can not be constructed based on $M and the sub state $st, a sub_problem is missing") function ProximalGradientMethodState( M::AbstractManifold; p::P = rand(M), @@ -304,48 +324,25 @@ function ProximalGradientMethodState( copyto!(get_manifold(pr), st.a, st.p) return st end, - stepsize::TS = default_stepsize(M, ProximalGradientMethodState), - stopping_criterion::S = StopWhenGradientMappingNormLess(1.0e-2) | - StopAfterIteration(5000) | - StopWhenChangeLess(M, 1.0e-9), + stepsize::S = default_stepsize(M, ProximalGradientMethodState), + stopping_criterion::SC = StopWhenGradientMappingNormLess(1.0e-2) | StopAfterIteration(5000) | StopWhenChangeLess(M, 1.0e-9), X::T = zero_vector(M, p), retraction_method::RM = default_retraction_method(M, typeof(p)), inverse_retraction_method::IRM = default_inverse_retraction_method(M, typeof(p)), sub_problem::Pr = nothing, sub_state::St = nothing, #AllocatingEvaluation(), ) where { - P, - T, - S <: StoppingCriterion, - A, - Pr <: Union{<:AbstractManoptProblem, F, Nothing} where {F}, - St <: Union{<:AbstractManoptSolverState, <:AbstractEvaluationType, Nothing}, - RM <: AbstractRetractionMethod, - IRM <: AbstractInverseRetractionMethod, - TS <: Stepsize, + P, T, SC <: StoppingCriterion, A, + Pr <: Union{<:AbstractManoptProblem, F, Nothing} where {F}, St <: Union{<:AbstractManoptSolverState, <:AbstractEvaluationType, Nothing}, + RM <: AbstractRetractionMethod, IRM <: AbstractInverseRetractionMethod, S <: Stepsize, } - _sub_state = if sub_state isa AbstractEvaluationType - ClosedFormSubSolverState(; evaluation = sub_state) - else - sub_state - end - - last_stepsize = zero(number_eltype(p)) - return ProximalGradientMethodState{ - P, T, Pr, typeof(_sub_state), A, S, TS, RM, IRM, typeof(last_stepsize), - }( - copy(M, p), - acceleration, - stepsize, - last_stepsize, - p, - copy(M, p), - stopping_criterion, - X, - retraction_method, - inverse_retraction_method, - sub_problem, - _sub_state, + return 
ProximalGradientMethodState( + sub_problem, maybe_wrap_evaluation_type(sub_state); + a = copy(M, p), acceleration = acceleration, + stepsize = stepsize, last_stepsize = zero(number_eltype(p)), + p = p, q = copy(M, p), + stopping_criterion = stopping_criterion, X = X, + retraction_method = retraction_method, inverse_retraction_method = inverse_retraction_method, ) end @@ -356,26 +353,31 @@ function set_iterate!(pgms::ProximalGradientMethodState, M, p) copyto!(M, pgms.p, p) return pgms end - -function show(io::IO, pgms::ProximalGradientMethodState) +function Base.show(io::IO, pgms::ProximalGradientMethodState) + print(io, "ProximalGradientMethodState(", pgms.sub_problem, ", ", pgms.sub_state, ";") + print(io, " a = ", pgms.a, ", acceleration = ", pgms.acceleration, ", stepsize = ", pgms.stepsize) + print(io, ", last_stepsize = ", pgms.last_stepsize, ", p = ", pgms.p, ", q = ", pgms.q) + print(io, ", stopping_criterion = ", pgms.stop, ", X = ", pgms.X) + print(io, ", retraction_method = ", pgms.retraction_method, ", inverse_retraction_method = ", pgms.inverse_retraction_method) + return print(io, ")") +end +function status_summary(pgms::ProximalGradientMethodState; context::Symbol = :default) i = get_count(pgms, :Iterations) Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(pgms.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(pgms)) – $(Iter) $(has_converged(pgms) ?
"(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Proximal Gradient Method $Iter - ## Parameters - * retraction_method: $(pgms.retraction_method) * stepsize: $(typeof(pgms.stepsize)) * acceleration: $(typeof(pgms.acceleration)) ## Stopping criterion - - $(status_summary(pgms.stop)) + $(_in_str(status_summary(pgms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end # # Stepsize @@ -413,42 +415,33 @@ mutable struct ProximalGradientMethodBacktrackingStepsize{P, T} <: Stepsize last_stepsize::T stop_when_stepsize_less::T warm_start_factor::T - + function ProximalGradientMethodBacktrackingStepsize(; + initial_stepsize::T, sufficient_decrease::T, contraction_factor::T, strategy::Symbol, + candidate_point::P, last_stepsize::T, stop_when_stepsize_less::T, warm_start_factor::T + ) where {P, T <: Real} + return new{P, T}( + initial_stepsize, sufficient_decrease, contraction_factor, strategy, + candidate_point, last_stepsize, stop_when_stepsize_less, warm_start_factor + ) + end function ProximalGradientMethodBacktrackingStepsize( M::AbstractManifold; - initial_stepsize::T = 1.0, - sufficient_decrease::T = 0.5, - contraction_factor::T = 0.5, - strategy::Symbol = :nonconvex, - stop_when_stepsize_less::T = 1.0e-8, - warm_start_factor::T = 1.0, - ) where {T} - 0 < sufficient_decrease < 1 || - throw(DomainError(sufficient_decrease, "sufficient_decrease must be in (0, 1)")) - 0 < contraction_factor < 1 || - throw(DomainError(contraction_factor, "contraction_factor must be in (0, 1)")) - initial_stepsize > 0 || - throw(DomainError(initial_stepsize, "initial_stepsize must be positive")) - strategy in [:convex, :nonconvex] || - throw(DomainError(strategy, "strategy must be either :convex or :nonconvex")) - stop_when_stepsize_less > 0 || throw( - DomainError( - stop_when_stepsize_less, "stop_when_stepsize_less must be positive" - ), + p = rand(M), + initial_stepsize::Real = 1.0, 
sufficient_decrease::Real = 0.5, contraction_factor::Real = 0.5, + strategy::Symbol = :nonconvex, stop_when_stepsize_less::Real = 1.0e-8, warm_start_factor::Real = 1.0, ) - warm_start_factor > 0 || - throw(DomainError(warm_start_factor, "warm_start_factor must be positive")) - - p = rand(M) - return new{typeof(p), T}( - initial_stepsize, - sufficient_decrease, - contraction_factor, - strategy, - p, - initial_stepsize, - stop_when_stepsize_less, - warm_start_factor, + T = promote_type(typeof(initial_stepsize), typeof(sufficient_decrease), typeof(contraction_factor), typeof(stop_when_stepsize_less), typeof(warm_start_factor)) + 0 < sufficient_decrease < 1 || throw(DomainError(sufficient_decrease, "sufficient_decrease ($(sufficient_decrease)) must be in (0, 1)")) + 0 < contraction_factor < 1 || throw(DomainError(contraction_factor, "contraction_factor ($(contraction_factor)) must be in (0, 1)")) + initial_stepsize > 0 || throw(DomainError(initial_stepsize, "initial_stepsize ($(initial_stepsize)) must be positive")) + strategy in [:convex, :nonconvex] || throw(DomainError(strategy, "strategy (:$(strategy)) must be either :convex or :nonconvex")) + stop_when_stepsize_less > 0 || throw(DomainError(stop_when_stepsize_less, "stop_when_stepsize_less ($(stop_when_stepsize_less)) must be positive")) + warm_start_factor > 0 || throw(DomainError(warm_start_factor, "warm_start_factor ($(warm_start_factor)) must be positive")) + return ProximalGradientMethodBacktrackingStepsize(; + initial_stepsize = convert(T, initial_stepsize), sufficient_decrease = convert(T, sufficient_decrease), + contraction_factor = convert(T, contraction_factor), strategy = strategy, candidate_point = p, + last_stepsize = convert(T, initial_stepsize), stop_when_stepsize_less = convert(T, stop_when_stepsize_less), + warm_start_factor = convert(T, warm_start_factor), ) end end @@ -457,19 +450,28 @@ get_initial_stepsize(s::ProximalGradientMethodBacktrackingStepsize) = s.initial_ 
get_last_stepsize(s::ProximalGradientMethodBacktrackingStepsize) = s.last_stepsize function Base.show(io::IO, pgb::ProximalGradientMethodBacktrackingStepsize) - s = """ - ProximalGradientMethodBacktrackingStepsize(; - contraction_factor=$(pgb.contraction_factor), - initial_stepsize=$(pgb.initial_stepsize), - stop_when_stepsize_less=$(pgb.stop_when_stepsize_less), - sufficient_decrease=$(pgb.sufficient_decrease), - strategy=$(pgb.strategy), - warm_start_factor=$(pgb.warm_start_factor), - ) + print(io, "ProximalGradientMethodBacktrackingStepsize(; initial_stepsize = ", pgb.initial_stepsize) + print(io, ", sufficient_decrease = ", pgb.sufficient_decrease, ", contraction_factor = ", pgb.contraction_factor) + print(io, ", strategy = :$(pgb.strategy), candidate_point = ", pgb.candidate_point) + print(io, ", last_stepsize = ", pgb.last_stepsize, ", stop_when_stepsize_less = ", pgb.stop_when_stepsize_less) + print(io, ", warm_start_factor = ", pgb.warm_start_factor) + return print(io, ")") +end +function status_summary(pgb::ProximalGradientMethodBacktrackingStepsize; context::Symbol = :default) + (context === :short) && return (repr(pgb)) + (context === :inline) && return "A proximal gradient backtracking step size (last step size: $(pgb.last_stepsize))" + return """ + A backtracking method tailored for the proximal gradient method + (last step size: $(pgb.last_stepsize)) + + ## Parameters + * contraction factor: $(_MANOPT_INDENT)$(pgb.contraction_factor) + * sufficient decrease: $(_MANOPT_INDENT)$(pgb.sufficient_decrease) + * strategy: $(_MANOPT_INDENT):$(pgb.strategy) + * stop when step size less: $(_MANOPT_INDENT)$(pgb.stop_when_stepsize_less) + * warm start factor: $(_MANOPT_INDENT)$(pgb.warm_start_factor) """ - return print(io, s) end - function (s::ProximalGradientMethodBacktrackingStepsize)( mp::AbstractManoptProblem, st::ProximalGradientMethodState, i::Int, args...; kwargs... 
) @@ -492,12 +494,9 @@ function (s::ProximalGradientMethodBacktrackingStepsize)( # Temporary state for backtracking that doesn't affect the main state pgm_temp = ProximalGradientMethodState( M; - p = copy(M, p), # Start from current (possibly) accelerated point - X = zero_vector(M, p), - sub_problem = st.sub_problem, - sub_state = st.sub_state, - retraction_method = st.retraction_method, - inverse_retraction_method = st.inverse_retraction_method, + p = copy(M, p), X = zero_vector(M, p), + sub_problem = st.sub_problem, sub_state = st.sub_state, + retraction_method = st.retraction_method, inverse_retraction_method = st.inverse_retraction_method, ) while λ > s.stop_when_stepsize_less @@ -616,11 +615,16 @@ $(_args(:M)) * `p` - initial point * `X` - initial tangent vector """ -mutable struct ProximalGradientMethodAcceleration{P, T, F, ITR} +mutable struct ProximalGradientMethodAcceleration{P, T, F, ITR <: AbstractInverseRetractionMethod} β::F inverse_retraction_method::ITR p::P X::T + function ProximalGradientMethodAcceleration(; + β::F, inverse_retraction_method::ITR, p::P, X::T + ) where {P, T, F, ITR <: AbstractInverseRetractionMethod} + return new{P, T, F, ITR}(β, inverse_retraction_method, p, X) + end end function ProximalGradientMethodAcceleration( @@ -630,7 +634,9 @@ function ProximalGradientMethodAcceleration( β::F = (k) -> (k - 1) / (k + 2), inverse_retraction_method::I = default_inverse_retraction_method(M, typeof(p)), ) where {P, T, F, I <: AbstractInverseRetractionMethod} - return ProximalGradientMethodAcceleration{P, T, F, I}(β, inverse_retraction_method, p, X) + return ProximalGradientMethodAcceleration( + β = β, inverse_retraction_method = inverse_retraction_method, p = p, X = X + ) end function (pga::ProximalGradientMethodAcceleration)( @@ -648,14 +654,9 @@ function (pga::ProximalGradientMethodAcceleration)( end function Base.show(io::IO, pga::ProximalGradientMethodAcceleration) - s = """ - ProximalGradientMethodAcceleration with parameters - * 
p=$(pga.p) - * X=$(pga.X) - * β=$(pga.β) - * inverse_retraction_method=$(pga.inverse_retraction_method) - """ - return print(io, s) + print(io, "ProximalGradientMethodAcceleration(; p = ", pga.p, ", X = ", pga.X) + print(io, ", β = ", pga.β, ", inverse_retraction_method = ", pga.inverse_retraction_method) + return print(io, ")") end """ @@ -713,18 +714,16 @@ function get_reason(c::StopWhenGradientMappingNormLess) return "" end -function status_summary(c::StopWhenGradientMappingNormLess) +function status_summary(c::StopWhenGradientMappingNormLess; context::Symbol = :default) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "|G| < $(c.threshold): $s" + return (_is_inline(context) ? "|G| < $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the gradient mapping norm is less than a tolerance.\n$(_MANOPT_INDENT)") * s end indicates_convergence(c::StopWhenGradientMappingNormLess) = true -function show(io::IO, c::StopWhenGradientMappingNormLess) - return print( - io, "StopWhenGradientMappingNormLess($(c.threshold))\n $(status_summary(c))" - ) +function Base.show(io::IO, c::StopWhenGradientMappingNormLess) + return print(io, "StopWhenGradientMappingNormLess($(c.threshold))") end # If we are running on a prox grad backtrack, ignore the threshold from the DEbug and take the one from the stepsize function (d::DebugWarnIfStepsizeCollapsed)( @@ -738,7 +737,7 @@ function (d::DebugWarnIfStepsizeCollapsed)( if s.last_stepsize ≤ s.stop_when_stepsize_less @warn "Backtracking stopped because the stepsize fell below the threshold $(s.stop_when_stepsize_less)." if d.status === :Once - @warn "Further warnings will be suppressed, use DebugWarnIfLagrangeMultiplierIncreases(:Always) to get all warnings." + @warn "Further warnings will be suppressed, use DebugWarnIfStepsizeCollapsed(:Always) to get all warnings."
d.status = :No end end diff --git a/src/plans/proximal_plan.jl b/src/plans/proximal_plan.jl index dc1c4c3235..fb41010595 100644 --- a/src/plans/proximal_plan.jl +++ b/src/plans/proximal_plan.jl @@ -112,9 +112,7 @@ end function get_proximal_map( M::AbstractManifold, mpo::ManifoldProximalMapObjective{AllocatingEvaluation, F, <:Union{<:Tuple, <:Vector}}, - λ, - p, - i, + λ, p, i, ) where {F} check_prox_number(mpo.proximal_maps!!, i) return mpo.proximal_maps!![i](M, λ, p) @@ -128,9 +126,7 @@ function get_proximal_map!( M::AbstractManifold, q, mpo::ManifoldProximalMapObjective{AllocatingEvaluation, F, <:Union{<:Tuple, <:Vector}}, - λ, - p, - i, + λ, p, i, ) where {F} check_prox_number(mpo.proximal_maps!!, i) copyto!(M, q, mpo.proximal_maps!![i](M, λ, p)) @@ -144,9 +140,7 @@ end function get_proximal_map( M::AbstractManifold, mpo::ManifoldProximalMapObjective{InplaceEvaluation, F, <:Union{<:Tuple, <:Vector}}, - λ, - p, - i, + λ, p, i, ) where {F} check_prox_number(mpo.proximal_maps!!, i) q = allocate_result(M, get_proximal_map, p) @@ -154,12 +148,9 @@ function get_proximal_map( return q end function get_proximal_map!( - M::AbstractManifold, - q, + M::AbstractManifold, q, mpo::ManifoldProximalMapObjective{InplaceEvaluation, F, <:Union{<:Tuple, <:Vector}}, - λ, - p, - i, + λ, p, i, ) where {F} check_prox_number(mpo.proximal_maps!!, i) mpo.proximal_maps!![i](M, q, λ, p) @@ -192,6 +183,16 @@ function get_proximal_map!( mpo.proximal_maps!!(M, q, λ, p) return q end +function status_summary(mpo::ManifoldProximalMapObjective; context::Symbol = :default) + (context === :short) && (return repr(mpo)) + return "A proximal map objective for a cost with $(mpo.number_of_proxes) proximal maps" +end +function Base.show(io::IO, mpo::ManifoldProximalMapObjective{E}) where {E} + print(io, "ManifoldProximalMapObjective(", mpo.cost, ", ", mpo.proximal_maps!!, ", ") + print(io, mpo.number_of_proxes, "; ", _to_kw(E)) + return print(io, ")") +end + # # # Proximal based State @@ -231,13 +232,17 
@@ $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(2000)") [`cyclic_proximal_point`](@ref) """ -mutable struct CyclicProximalPointState{P, TStop <: StoppingCriterion, Tλ} <: - AbstractManoptSolverState +mutable struct CyclicProximalPointState{P, SC <: StoppingCriterion, Tλ, A <: AbstractVector{<:Int}} <: AbstractManoptSolverState p::P - stop::TStop + stop::SC λ::Tλ order_type::Symbol - order::AbstractVector{Int} + order::A + function CyclicProximalPointState(; + p::P, stopping_criterion::SC, λ::Tλ, order_type::Symbol, order::A, + ) where {P, SC <: StoppingCriterion, Tλ, A <: AbstractVector{<:Int}} + return new{P, SC, Tλ, A}(p, stopping_criterion, λ, order_type, order) + end end function CyclicProximalPointState( @@ -247,13 +252,40 @@ function CyclicProximalPointState( λ::F = (i) -> 1.0 / i, evaluation_order::Symbol = :LinearOrder, ) where {P, S, F} - return CyclicProximalPointState{P, S, F}(p, stopping_criterion, λ, evaluation_order, []) + return CyclicProximalPointState(; p = p, stopping_criterion = stopping_criterion, λ = λ, order_type = evaluation_order, order = Int[]) end get_iterate(cpps::CyclicProximalPointState) = cpps.p function set_iterate!(cpps::CyclicProximalPointState, p) cpps.p = p return p end +function Base.show(io::IO, cpps::CyclicProximalPointState) + print(io, "CyclicProximalPointState(; ") + print(io, "p = "); print(io, cpps.p); print(io, ", ") + print(io, "stopping_criterion = "); print(io, cpps.stop); print(io, ", ") + print(io, "λ = "); print(io, cpps.λ); print(io, ", ") + print(io, "order = "); print(io, cpps.order); print(io, ", ") + print(io, "order_type = "); print(io, cpps.order_type) + return print(io, ")") +end +function status_summary(cpps::CyclicProximalPointState; context::Symbol = :default) + (context === :short) && return repr(cpps) + i = get_count(cpps, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(cpps.stop) ? 
" (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the cyclic proximal point algorithm$(conv_inl)" + Iter = (i > 0) ? "After $i iterations\n" : "" + Conv = indicates_convergence(cpps.stop) ? "Yes" : "No" + s = """ + # Solver state for `Manopt.jl`s Cyclic Proximal Point Algorithm + $Iter + ## Parameters + * evaluation order of the proximal maps: :$(cpps.order_type) + + ## Stopping criterion + $(_in_str(status_summary(cpps.stop; context = context); indent = 0, headers = 1)) + This indicates convergence: $Conv""" + return s +end # # Debug @@ -286,21 +318,38 @@ function (d::DebugProximalParameter)( (k >= (d.at_init ? 0 : 1)) && Printf.format(d.io, Printf.Format(d.format), cpps.λ(k)) return nothing end - +function Base.show(io::IO, d::DebugProximalParameter) + return print( + io, "DebugProximalParameter(; io = ", d.io, ", format=\"$(escape_string(d.format))\", at_init = $(d.at_init))", + ) +end +function status_summary(d::DebugProximalParameter; context::Symbol = :default) + (context === :short) && (return "(:ProxParameter, \"$(escape_string(d.format))\")") + # Inline and default + return "A DebugAction printing the proximal parameter as “$(escape_string(d.format))”" +end # # Record @doc """ - RecordProximalParameter <: RecordAction + RecordProximalParameter{R <: Real} <: RecordAction record the current iterates proximal point algorithm parameter given by in [`AbstractManoptSolverState`](@ref)s `o.λ`. 
+ +## Constructor + RecordProximalParameter(r::Type{<:Real}=Float64) """ -mutable struct RecordProximalParameter <: RecordAction - recorded_values::Array{Float64, 1} - RecordProximalParameter() = new(Array{Float64, 1}()) +mutable struct RecordProximalParameter{R <: Real} <: RecordAction + recorded_values::Array{R, 1} + RecordProximalParameter(r::Type{<:Real} = Float64) = new{r}(Array{r, 1}()) end function (r::RecordProximalParameter)( ::AbstractManoptProblem, cpps::CyclicProximalPointState, k::Int ) return record_or_reset!(r, cpps.λ(k), k) end +show(io::IO, ::RecordProximalParameter{R}) where {R} = print(io, "RecordProximalParameter($R)") +function status_summary(rg::RecordProximalParameter{R}; context::Symbol = :default) where {R} + (context === :short) && return ":ProximalParameter" + return "A RecordAction to record the current proximal parameter (of type $R)" +end diff --git a/src/plans/quasi_newton_plan.jl b/src/plans/quasi_newton_plan.jl index b167b01b4b..db4686a74f 100644 --- a/src/plans/quasi_newton_plan.jl +++ b/src/plans/quasi_newton_plan.jl @@ -409,8 +409,8 @@ space ``T_{p_{k+1}} $(_tex(:Cal, "M"))``, preferably with an isometric vector tr # Provided functors -* `(mp::AbstractManoptproblem, st::QuasiNewtonState) -> η` to compute the update direction -* `(η, mp::AbstractManoptproblem, st::QuasiNewtonState) -> η` to compute the update direction in-place of `η` +* `(mp::AbstractManoptProblem, st::QuasiNewtonState) -> η` to compute the update direction +* `(η, mp::AbstractManoptProblem, st::QuasiNewtonState) -> η` to compute the update direction in-place of `η` # Fields @@ -511,6 +511,55 @@ function initialize_update!(d::QuasiNewtonMatrixDirectionUpdate) copyto!(d.matrix, I) return d end +""" + hessian_value_diag(d::QuasiNewtonMatrixDirectionUpdate, M, p, X) + +Evaluate the quadratic form associated with the stored quasi-Newton matrix. 
+Returns the scalar ``c^{\top} B c`` where ``c`` are the coordinates of the +tangent vector `X` at `p` (in the basis `d.basis`) and ``B`` is `d.matrix`. +""" +function hessian_value_diag(d::QuasiNewtonMatrixDirectionUpdate{T}, M::AbstractManifold, p, X) where {T <: Union{BFGS, DFP, SR1, Broyden}} + c = get_coordinates(M, p, X, d.basis) + return dot(c, d.matrix, c) +end + +""" + UnitVector{TB} + +A type representing a unit tangent vector on a `Hyperrectangle`-like manifold with corners, +or a product of it with a standard manifold. +The field `index` stores the index of the element equal to 1. +All other elements are equal to 0. +`its` stores the overall iterator over all bounds. +""" +struct UnitVector{TB} + index::TB +end + +""" + hessian_value_diag(d::QuasiNewtonMatrixDirectionUpdate, M, p, X::UnitVector) + +Evaluate the quadratic form associated with the stored quasi-Newton matrix. +Returns the scalar ``c^{\top} B c`` where ``c`` are the coordinates of the +[`UnitVector`](@ref) `X` at `p` (in the basis `d.basis`) and ``B`` is `d.matrix`. +""" +function hessian_value_diag(d::QuasiNewtonMatrixDirectionUpdate{T}, M::AbstractManifold, p, X::UnitVector) where {T <: Union{BFGS, DFP, SR1, Broyden}} + b = to_coordinate_index(M, X, d.basis) + return d.matrix[b, b] +end +""" + hessian_value(d::QuasiNewtonMatrixDirectionUpdate, M, p, X::UnitVector, Y) + +Evaluate the quadratic form associated with the stored quasi-Newton matrix. +Returns the scalar ``c_b^{\top} B c`` where ``c_b`` are the coordinates of the +[`UnitVector`](@ref) `X` at `p` (assumed to correspond to the basis `d.basis`), +``c`` are the coordinates of the tangent vector `Y` at `p` (in the basis `d.basis`) +and ``B`` is `d.matrix`. 
+""" +function hessian_value(d::QuasiNewtonMatrixDirectionUpdate{T}, M::AbstractManifold, p, X::UnitVector, Y) where {T <: Union{BFGS, DFP, SR1, Broyden}} + b = to_coordinate_index(M, X, d.basis) + return dot(d.matrix[b, :], get_coordinates(M, p, Y, d.basis)) +end _doc_QN_B = """ ```math @@ -562,6 +611,19 @@ function is always included and the old, probably no longer relevant, informatio $(_fields(:vector_transport_method)) * `message`: a string containing a potential warning that might have appeared * `project!`: a function to stabilize the update by projecting on the tangent space +* `vector_transport_method`: method for transporting stored s and y directions to the new point +* `nonpositive_curvature_behavior`: how non-positive-definite pairs (s, y) are detected and handled in vector transport. + Allowed values are: + - `:ignore` (default): pairs whose inner product is zero are + omitted from the current Hessian approximation but are + retained in memory for further iterations. This may lead + to non-positive-definite Hessians and non-descent directions + being selected and thus needs to be handled elsewhere. + - `:byrd`: pairs such that `inner(M, p, X_s, Y_s) <= iszero_abstol * norm(M, p, Y_s)^2` + are removed from memory (see [ByrdLuNocedalZhu:1995](@cite), + Eq. (3.9) and its discussion). +* `sy_tol`: tolerance for detecting non-positive-definite pairs (X_s, X_y). + The pairs may lose positive-definiteness after vector transport. 
# Constructor @@ -597,6 +659,8 @@ mutable struct QuasiNewtonLimitedMemoryDirectionUpdate{ initial_scale::G project!::Proj vector_transport_method::VT + nonpositive_curvature_behavior::Symbol + sy_tol::F message::String end function QuasiNewtonLimitedMemoryDirectionUpdate( @@ -608,6 +672,8 @@ function QuasiNewtonLimitedMemoryDirectionUpdate( initial_scale::G = 1.0, (project!)::Proj = (copyto!), vector_transport_method::VTM = default_vector_transport_method(M, typeof(p)), + nonpositive_curvature_behavior::Symbol = :ignore, + sy_tol::Real = 1.0e-8, ) where { NT <: AbstractQuasiNewtonUpdateRule, T, @@ -630,6 +696,8 @@ function QuasiNewtonLimitedMemoryDirectionUpdate( _initial_state, project!, vector_transport_method, + nonpositive_curvature_behavior, + sy_tol, "", ) end @@ -661,20 +729,7 @@ function (d::QuasiNewtonLimitedMemoryDirectionUpdate{InverseBFGS})( end # backward pass for i in m:-1:1 - # what if division by zero happened here, setting to zero ignores this in the next step - # pre-compute in case inner is expensive - v = inner(M, p, d.memory_s[i], d.memory_y[i]) - if iszero(v) - d.ρ[i] = zero(eltype(d.ρ)) - if length(d.message) > 0 - d.message = replace(d.message, " i=" => " i=$i,") - d.message = replace(d.message, "summand in" => "summands in") - else - d.message = "The inner products ⟨s_i,y_i⟩ ≈ 0, i=$i, ignoring summand in approximation." 
- end - else - d.ρ[i] = 1 / v - end + # d.ρ is precomputed in the Hessian update d.ξ[i] = inner(M, p, d.memory_s[i], r) * d.ρ[i] r .-= d.ξ[i] .* d.memory_y[i] end @@ -726,13 +781,17 @@ function initialize_update!(d::QuasiNewtonLimitedMemoryDirectionUpdate) return d end +function show(io::IO, qns::QuasiNewtonLimitedMemoryDirectionUpdate) + return print(io, "QuasiNewtonLimitedMemoryDirectionUpdate with memory size $(length(qns.memory_s)) and $(qns.vector_transport_method) as vector transport.") +end + @doc """ QuasiNewtonCautiousDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate These [`AbstractQuasiNewtonDirectionUpdate`](@ref)s represent any quasi-Newton update rule, which are based on the idea of a so-called cautious update. The search direction is calculated as given in [`QuasiNewtonMatrixDirectionUpdate`](@ref) or [`QuasiNewtonLimitedMemoryDirectionUpdate`](@ref), -butut the update then is only executed if +but the update then is only executed if ```math $(_tex(:frac, "g_{x_{k+1}}(y_k,s_k)", "$(_tex(:norm, "s_k"; index = "x_{k+1}"))^{2}")) ≥ θ $(_tex(:norm, "$(_tex(:grad))f(p_k)"; index = "p_k")), diff --git a/src/plans/record.jl b/src/plans/record.jl index 9c52aae41b..c4d3d35050 100644 --- a/src/plans/record.jl +++ b/src/plans/record.jl @@ -74,21 +74,30 @@ end function RecordSolverState(s::S, symbol::Symbol) where {S <: AbstractManoptSolverState} return RecordSolverState{S}(s; RecordFactory(get_state(s), symbol)...) 
end -function status_summary(rst::RecordSolverState) +function status_summary(rst::RecordSolverState; context::Symbol = :default) + (context === :short) && return repr(rst) + (context === :inline) && (return "A RecordSolverState for $(status_summary(rst.state; context = context))") if length(rst.recordDictionary) > 0 return """ - $(rst.state) + $(status_summary(rst.state; context = context)) ## Record $(rst.recordDictionary) """ - else - return "RecordSolverState($(rst.state), $(rst.recordDictionary))" + else # We indicate there is a record but no registered recordings + return """ + $(status_summary(rst.state; context = context)) + + ## Record + No recordings registered. + """ end end -function show(io::IO, rst::RecordSolverState) - return print(io, status_summary(rst)) +# 2-argument show, used by Array show, print(obj) and repr(obj), keep it short +function Base.show(io::IO, obj::RecordSolverState) + return print(io, "RecordSolverState($(obj.state), $(obj.recordDictionary))") end + dispatch_state_decorator(::RecordSolverState) = Val(true) @doc """ @@ -251,17 +260,29 @@ function (re::RecordEvery)( ) return nothing end -function show(io::IO, re::RecordEvery) +function Base.show(io::IO, re::RecordEvery) return print(io, "RecordEvery($(re.record), $(re.every), $(re.always_update))") end -function status_summary(re::RecordEvery) - s = "" - if re.record isa RecordGroup - s = status_summary(re.record)[3:(end - 2)] - else - s = "$(re.record)" +function status_summary(re::RecordEvery; context::Symbol = :default) + if context === :short + s = "" + if re.record isa RecordGroup + s = status_summary(re.record; context = context)[2:(end - 1)] + else + s = "$(status_summary(re.record; context = context))" + end + return "[$s, $(re.every)]" end - return "[$s, $(re.every)]" + s = "" + (re.every % 10 == 2) && (s = "every $(re.every)nd") + (re.every % 10 == 3) && (s = "every $(re.every)rd") + (re.every % 10 ∉ [2, 3]) && (s = "every $(re.every)th") + (re.every == 1) && (s = "every") 
+ (context === :inline) && return "A RecordAction that records its inner action $s iteration" + return """ + A RecordAction that records $s iteration with + $(_MANOPT_INDENT)$(_in_str(status_summary(re.record; context = context); indent = 1)) + """ end get_record(r::RecordEvery) = get_record(r.record) get_record(r::RecordEvery, k) = get_record(r.record, k) @@ -339,10 +360,12 @@ function (d::RecordGroup)(p::AbstractManoptProblem, s::AbstractManoptSolverState end return end -function status_summary(rg::RecordGroup) - return "[ $(join(["$(status_summary(ri))" for ri in rg.group], ", ")) ]" +function status_summary(rg::RecordGroup; context::Symbol = :default) + (context === :short) && (return "[$(join(["$(status_summary(ri; context = context))" for ri in rg.group], ", "))]") + (context === :inline) && (return "A group of $(length(rg.group)) RecordActions") + return "A group of $(length(rg.group)) RecordActions:\n $(join(["* $(status_summary(ri; context = context))" for ri in rg.group], "\n"))\n" end -function show(io::IO, rg::RecordGroup) +function Base.show(io::IO, rg::RecordGroup) s = join(["$(ri)" for ri in rg.group], ", ") return print(io, "RecordGroup([$s])") end @@ -418,19 +441,28 @@ function (rsr::RecordSubsolver)( record_or_reset!(rsr, get_record(get_sub_state(ams), rsr.record...), k) return nothing end -function show(io::IO, rsr::RecordSubsolver{R}) where {R} +function Base.show(io::IO, rsr::RecordSubsolver{R}) where {R} return print(io, "RecordSubsolver(; record=$(rsr.record), record_type=$R)") end -status_summary(::RecordSubsolver) = ":Subsolver" +function status_summary(rsr::RecordSubsolver{R}; context::Symbol = :default) where {R} + (context === :short) && return ":Subsolver" + (context === :inline) && return "A RecordAction to specify something to record from each subsolver run" + return """ + A RecordAction to record elements of type $R from each subsolver run. 
+ + ## Recorded values + The following recorded symbols from the sub state are recorded in every iteration of the (outer) solver + $(join([ " * :$(s)" for s in rsr.record], "\n")) + """ +end @doc """ RecordWhenActive <: RecordAction record action that only records if the `active` boolean is set to true. -This can be set from outside and is for example triggered by |`RecordEvery`](@ref) -on recordings of the subsolver. -While this is for sub solvers maybe not completely necessary, recording values that -are never accessible, is not that useful. +This can be set from outside and is for example triggered by [`RecordEvery`](@ref) +on recordings of a subsolver. While this is for sub solvers maybe not completely necessary, +recording values that are never accessible, is not that useful. # Fields @@ -451,7 +483,6 @@ mutable struct RecordWhenActive{R <: RecordAction} <: RecordAction return new{R}(r, active, always_update) end end - function (rwa::RecordWhenActive)( amp::AbstractManoptProblem, ams::AbstractManoptSolverState, k::Int ) @@ -461,11 +492,16 @@ function (rwa::RecordWhenActive)( rwa.record(amp, ams, k) end end -function show(io::IO, rwa::RecordWhenActive) +function Base.show(io::IO, rwa::RecordWhenActive) return print(io, "RecordWhenActive($(rwa.record), $(rwa.active), $(rwa.always_update))") end -function status_summary(rwa::RecordWhenActive) - return repr(rwa) +function status_summary(rwa::RecordWhenActive; context::Symbol = :default) + (context === :short) && (return repr(rwa)) + (context === :inline) && (return "A RecordAction that only records its inner action when active (currently: $(rwa.active ? "" : "in")active)") + return """ + Record the following only, when active (currently: $(rwa.active ? "" : "in")active) + $(_in_str(status_summary(rwa.record; context = context), indent = 1, headers = 0)) + """ end function set_parameter!(rwa::RecordWhenActive, v::Val, args...) set_parameter!(rwa.record, v, args...) 
@@ -479,30 +515,6 @@ get_record(r::RecordWhenActive, args...) = get_record(r.record, args...) # # Specific Record types # - -@doc """ - RecordCost <: RecordAction - -Record the current cost function value, see [`get_cost`](@ref). - -# Fields - -* `recorded_values` : to store the recorded values - -# Constructor - - RecordCost() -""" -mutable struct RecordCost <: RecordAction - recorded_values::Array{Float64, 1} - RecordCost() = new(Array{Float64, 1}()) -end -function (r::RecordCost)(amp::AbstractManoptProblem, s::AbstractManoptSolverState, k::Int) - return record_or_reset!(r, get_cost(amp, get_iterate(s)), k) -end -show(io::IO, ::RecordCost) = print(io, "RecordCost()") -status_summary(di::RecordCost) = ":Cost" - @doc """ RecordChange <: RecordAction @@ -549,12 +561,9 @@ mutable struct RecordChange{ return new{typeof(irm), typeof(storage)}(Vector{Float64}(), storage, irm) end function RecordChange( - p, - a::StoreStateAction = StoreStateAction([:Iterate]); + p, a::StoreStateAction = StoreStateAction([:Iterate]); manifold::AbstractManifold = DefaultManifold(1), - inverse_retraction_method::IRT = default_inverse_retraction_method( - manifold, typeof(p) - ), + inverse_retraction_method::IRT = default_inverse_retraction_method(manifold, typeof(p)), ) where {IRT <: AbstractInverseRetractionMethod} update_storage!(a, Dict(:Iterate => p)) return new{IRT, typeof(a)}(Vector{Float64}(), a, inverse_retraction_method) @@ -567,8 +576,7 @@ function (r::RecordChange)(amp::AbstractManoptProblem, s::AbstractManoptSolverSt if has_storage(r.storage, PointStorageKey(:Iterate)) distance( M, - get_iterate(s), - get_storage(r.storage, PointStorageKey(:Iterate)), + get_iterate(s), get_storage(r.storage, PointStorageKey(:Iterate)), r.inverse_retraction_method, ) else @@ -579,12 +587,41 @@ function (r::RecordChange)(amp::AbstractManoptProblem, s::AbstractManoptSolverSt r.storage(amp, s, k) return r.recorded_values end -function show(io::IO, rc::RecordChange) +function Base.show(io::IO, 
rc::RecordChange) return print( io, "RecordChange(; inverse_retraction_method=$(rc.inverse_retraction_method))" ) end -status_summary(rc::RecordChange) = ":Change" +function status_summary(::RecordChange; context::Symbol = :default) + (context === :short) && return ":Change" + return "A RecordAction to record the change of the iterate" +end + +@doc """ + RecordCost <: RecordAction + +Record the current cost function value, see [`get_cost`](@ref). + +# Fields + +* `recorded_values` : to store the recorded values + +# Constructor + + RecordCost() +""" +mutable struct RecordCost <: RecordAction + recorded_values::Array{Float64, 1} + RecordCost() = new(Array{Float64, 1}()) +end +function (r::RecordCost)(amp::AbstractManoptProblem, s::AbstractManoptSolverState, k::Int) + return record_or_reset!(r, get_cost(amp, get_iterate(s)), k) +end +show(io::IO, ::RecordCost) = print(io, "RecordCost()") +function status_summary(::RecordCost; context::Symbol = :default) + (context === :short) && return ":Cost" + return "A RecordAction to record the cost value" +end @doc """ RecordEntry{T} <: RecordAction @@ -620,8 +657,12 @@ function (r::RecordEntry{T})( ) where {T} return record_or_reset!(r, getfield(s, r.field), i) end -function show(io::IO, di::RecordEntry) - return print(io, "RecordEntry(:$(di.field))") +function Base.show(io::IO, ra::RecordEntry) + return print(io, "RecordEntry(:$(ra.field))") +end +function status_summary(ra::RecordEntry; context::Symbol = :default) + (context === :short) && return ":$(ra.field)" + return "A RecordAction to record the solver state field :$(ra.field)" end @doc """ @@ -661,8 +702,12 @@ function (r::RecordEntryChange)( r.storage(amp, ams, k) return record_or_reset!(r, value, k) end -function show(io::IO, rec::RecordEntryChange) - return print(io, "RecordEntryChange(:$(rec.field), $(rec.distance))") +function Base.show(io::IO, ra::RecordEntryChange) + return print(io, "RecordEntryChange(:$(ra.field), $(ra.distance))") +end +function 
status_summary(ra::RecordEntryChange; context::Symbol = :default) + (context === :short) && return repr(ra) + return "A RecordAction to record the solver state field's :$(ra.field) change using the function $(ra.distance)" end @doc """ @@ -694,10 +739,13 @@ function (r::RecordIterate{T})( ) where {T} return record_or_reset!(r, get_iterate(s), i) end -function show(io::IO, ri::RecordIterate) +function Base.show(io::IO, ri::RecordIterate) return print(io, "RecordIterate($(eltype(ri.recorded_values)))") end -status_summary(di::RecordIterate) = ":Iterate" +function status_summary(di::RecordIterate; context::Symbol = :default) + (context === :short) && return ":Iterate" + return "A RecordAction to record the current iterate" +end @doc """ RecordIteration <: RecordAction @@ -712,8 +760,10 @@ function (r::RecordIteration)(::AbstractManoptProblem, ::AbstractManoptSolverSta return record_or_reset!(r, k, k) end show(io::IO, ::RecordIteration) = print(io, "RecordIteration()") -status_summary(::RecordIteration) = ":Iteration" - +function status_summary(::RecordIteration; context::Symbol = :default) + (context === :short) && return ":Iteration" + return "A RecordAction to record the current iteration number" +end @doc """ RecordStoppingReason <: RecordAction @@ -730,7 +780,10 @@ function (rsr::RecordStoppingReason)( return (length(s) > 0) && record_or_reset!(rsr, s, k) end show(io::IO, ::RecordStoppingReason) = print(io, "RecordStoppingReason()") -status_summary(di::RecordStoppingReason) = ":Stop" +function status_summary(::RecordStoppingReason; context::Symbol = :default) + (context === :short) && return ":Stop" + return "A RecordAction to record the stopping reason" +end @doc """ RecordTime <: RecordAction @@ -768,15 +821,18 @@ function (r::RecordTime)(p::AbstractManoptProblem, s::AbstractManoptSolverState, return record_or_reset!(r, t, k) end end -function show(io::IO, ri::RecordTime) +function Base.show(io::IO, ri::RecordTime) return print(io, "RecordTime(; 
mode=:$(ri.mode))") end -status_summary(ri::RecordTime) = (ri.mode === :iterative ? ":IterativeTime" : ":Time") +function status_summary(ri::RecordTime; context::Symbol = :default) + (context == :short) && return (ri.mode === :iterative ? ":IterativeTime" : ":Time") + # Inline and Default: + return "A RecordAction for recording times" * (ri.mode == :iterative ? " iteratively" : ".") +end # # Factory # - @doc """ RecordFactory(s::AbstractManoptSolverState, a) @@ -899,15 +955,15 @@ create a [`RecordAction`](@ref) where * a [`RecordAction`](@ref) is passed through * a [`Symbol`] creates * `:Change` to record the change of the iterates, see [`RecordChange`](@ref) + * `:Cost` to record the current cost function value * `:Gradient` to record the gradient, see [`RecordGradient`](@ref) * `:GradientNorm to record the norm of the gradient, see [`RecordGradientNorm`](@ref) * `:Iterate` to record the iterate * `:Iteration` to record the current iteration number - * `IterativeTime` to record the time iteratively - * `:Cost` to record the current cost function value + * `:IterativeTime` to record the times taken for each iteration. + * `:ProximalParameter` to record the proximal parameter, see [`RecordProximalParameter`](@ref) * `:Stepsize` to record the current step size * `:Time` to record the total time taken after every iteration - * `:IterativeTime` to record the times taken for each iteration. and every other symbol is passed to [`RecordEntry`](@ref), which results in recording the field of the state with the symbol indicating the field of the solver to record. 
@@ -922,6 +978,7 @@ function RecordActionFactory(s::AbstractManoptSolverState, symbol::Symbol) (symbol == :Iterate) && return RecordIterate(get_iterate(s)) (symbol == :Iteration) && return RecordIteration() (symbol == :IterativeTime) && return RecordTime(; mode = :iterative) + (symbol == :ProximalParameter) && return RecordProximalParameter() (symbol == :Stepsize) && return RecordStepsize() (symbol == :Stop) && return RecordStoppingReason() (symbol == :Subsolver) && return RecordSubsolver() diff --git a/src/plans/scaled_objective.jl b/src/plans/scaled_objective.jl index 5b105c6302..699f2f7a56 100644 --- a/src/plans/scaled_objective.jl +++ b/src/plans/scaled_objective.jl @@ -20,12 +20,19 @@ For now the functions rescaled are # Constructors ScaledManifoldObjective(objective, scale::Real=1) - -objective - scale*objective Generate a scaled manifold objective based on `objective` with `scale` being `1` by default in the first, `scale=-1` in the second case. The multiplication from the left with a scalar is also overloaded. + + -objective + +The single-parameter minus is overloaded to have a short notation turning a maximization problem +into a minimization one, which would fit the framework provided within Manopt.jl + + scale * objective + +Equivalent to the first constructor, but might be nicer to write in a few places. 
""" struct ScaledManifoldObjective{ E <: AbstractEvaluationType, O2, O1 <: AbstractManifoldObjective{E}, F, @@ -126,9 +133,15 @@ function get_hessian_function( return (M, Y, p, X) -> get_hessian!(M, Y, scaled_objective, p, X) end -function show(io::IO, scaled_objective::ScaledManifoldObjective{P, T}) where {P, T} +function Base.show(io::IO, scaled_objective::ScaledManifoldObjective) return print( - io, - "ScaledManifoldObjective based on a $(scaled_objective.objective) with scale $(scaled_objective.scale)", + io, "ScaledManifoldObjective($(repr(scaled_objective.objective)), $(scaled_objective.scale))", ) end +function status_summary(scaled_objective::ScaledManifoldObjective; context::Symbol = :default) + # short and inline + (context === :short) && (return "$(scaled_objective.scale) * $(status_summary(scaled_objective.objective; context = context))") + (context === :inline) && (return "$(status_summary(scaled_objective.objective; context = context)) scaled by a factor of $(scaled_objective.scale)") + # default + return "A scaled version of the objective\n$(status_summary(scaled_objective.objective; context = context))\nscaled by a factor of $(scaled_objective.scale)" +end diff --git a/src/plans/solver_state.jl b/src/plans/solver_state.jl index b25c4c7b5c..872d6c99c3 100644 --- a/src/plans/solver_state.jl +++ b/src/plans/solver_state.jl @@ -15,6 +15,11 @@ $(_fields(:stopping_criterion; name = "stop")) """ abstract type AbstractManoptSolverState end +function Base.show(io::IO, ::MIME"text/plain", ams::AbstractManoptSolverState) + multiline = get(io, :multiline, true) + return multiline ? 
status_summary(io, ams) : show(io, ams) +end + """ ClosedFormSubSolverState{E<:AbstractEvaluationType} <: AbstractManoptSolverState @@ -34,8 +39,11 @@ function ClosedFormSubSolverState(; ) where {E <: AbstractEvaluationType} return ClosedFormSubSolverState(evaluation) end +Base.show(io::IO, cfss::ClosedFormSubSolverState{E}) where {E} = print(io, "ClosedFormSubSolverState(; $(_to_kw(E)))") +status_summary(cfss::ClosedFormSubSolverState; context::Symbol = :default) = repr(cfss) maybe_wrap_evaluation_type(s::AbstractManoptSolverState) = s +maybe_wrap_evaluation_type(n::Nothing) = n function maybe_wrap_evaluation_type(::E) where {E <: AbstractEvaluationType} return ClosedFormSubSolverState{E}() end @@ -116,8 +124,8 @@ should be returned at the end of a solver instead of the usual minimizer. struct ReturnSolverState{S <: AbstractManoptSolverState} <: AbstractManoptSolverState state::S end -status_summary(rst::ReturnSolverState) = status_summary(rst.state) -show(io::IO, rst::ReturnSolverState) = print(io, "ReturnSolverState($(rst.state))") +status_summary(rst::ReturnSolverState; context::Symbol = :default) = status_summary(rst.state; context = context) +show(io::IO, rst::ReturnSolverState) = print(io, "ReturnSolverState(", rst.state, ")") dispatch_state_decorator(::ReturnSolverState) = Val(true) """ @@ -304,6 +312,13 @@ for example within the [`DebugSolverState`](@ref) or within the [`RecordSolverSt """ abstract type AbstractStateAction end +status_summary(asa::AbstractStateAction; context::Symbol = :default) = repr(asa) + +function Base.show(io::IO, ::MIME"text/plain", asa::AbstractStateAction) + multiline = get(io, :multiline, true) + return multiline ? 
status_summary(io, asa) : show(io, asa) +end + mutable struct StorageRef{T} x::T end diff --git a/src/plans/stepsize/initial_guess.jl b/src/plans/stepsize/initial_guess.jl index 4c9788b6a8..be6d67b7d1 100644 --- a/src/plans/stepsize/initial_guess.jl +++ b/src/plans/stepsize/initial_guess.jl @@ -145,6 +145,7 @@ function (hzi::HagerZhangInitialGuess{TF})( k::Int, last_stepsize::Real, η; lf0 = get_cost(mp, get_iterate(s)), Dlf0 = get_differential(mp, get_iterate(s), η), + kwargs... ) where {TF <: Real} M = get_manifold(mp) p = get_iterate(s) @@ -152,6 +153,13 @@ function (hzi::HagerZhangInitialGuess{TF})( alphamax = min(hzi.alphamax, max_stepsize(M, p)) + if :stop_when_stepsize_exceeds in keys(kwargs) + alphamax = min( + kwargs[:stop_when_stepsize_exceeds], + alphamax, + ) + end + if k == 1 point_d = hzi.point_distance(M, p) # Step I0 diff --git a/src/plans/stepsize/linesearch.jl b/src/plans/stepsize/linesearch.jl index 931aa91738..d993b9e072 100644 --- a/src/plans/stepsize/linesearch.jl +++ b/src/plans/stepsize/linesearch.jl @@ -23,6 +23,24 @@ abstract type Stepsize end get_message(::S) where {S <: Stepsize} = "" +function Base.show(io::IO, ::MIME"text/plain", ams::Stepsize) + multiline = get(io, :multiline, true) + return multiline ? status_summary(io, ams) : show(io, ams) +end + +""" + initialize_stepsize!(sm::Stepsize) + +Initialize the state of a stepsize functor. This function should be called in the +`initialize_solver!` function for solvers that do possess a stepsize and can be used to +set up internal state of the stepsize functor that is preserved between line searches in +the same optimization, for example adaptive thresholds for Wolfe criteria in Hager-Zhang +line search. + +By default it does nothing. 
+""" +initialize_stepsize!(sm::Stepsize) = sm + """ default_stepsize(M::AbstractManifold, ams::AbstractManoptSolverState) @@ -46,21 +64,40 @@ function max_stepsize(M::AbstractManifold, p) injectivity_radius(M, p) catch is_tutorial_mode() && - @warn "`max_stepsize was called, but there seems to not be an `injectivity_raidus` available on $M." + @warn "`max_stepsize was called, but there seems to not be an `injectivity_radius` available on $M." Inf end return s end +function max_stepsize(M::ProductManifold, p) + return min(map(max_stepsize, M.manifolds, submanifold_components(M, p))...) +end +function max_stepsize(M::AbstractPowerManifold, p) + stepsize = number_eltype(p)(Inf) + rep_size = representation_size(M.manifold) + for i in get_iterator(M) + cur_stepsize = max_stepsize(M.manifold, _read(M, rep_size, p, i)) + stepsize = min(cur_stepsize, stepsize) + end + return stepsize +end function max_stepsize(M::AbstractManifold) s = try injectivity_radius(M) catch is_tutorial_mode() && - @warn "`max_stepsize was called, but there seems to not be an `injectivity_raidus` available on $M." + @warn "`max_stepsize was called, but there seems to not be an `injectivity_radius` available on $M." Inf end return s end +function max_stepsize(M::ProductManifold) + return min(map(max_stepsize, M.manifolds)...) +end +function max_stepsize(M::AbstractPowerManifold) + return max_stepsize(M.manifold) +end + """ Linesearch <: Stepsize @@ -129,8 +166,8 @@ $(_kwargs(:retraction_method)) * `gradient = nothing`: precomputed gradient at point `p` * `report_messages_in::NamedTuple = (; )`: a named tuple of [`StepsizeMessage`](@ref)s to report messages in. 
currently supported keywords are `:non_descent_direction`, `:stepsize_exceeds`, `:stepsize_less`, `:stop_increasing`, `:stop_decreasing` -* `stop_when_stepsize_less=0.0`: to avoid numerical underflow -* `stop_when_stepsize_exceeds=`[`max_stepsize`](@ref)`(M, p) / norm(M, p, η)`) to avoid leaving the injectivity radius on a manifold +* `stop_when_stepsize_less::Real=0.0`: to avoid numerical underflow +* `stop_when_stepsize_exceeds::Real=`[`max_stepsize`](@ref)`(M, p) / norm(M, p, η)`: to avoid leaving the injectivity radius on a manifold or exceeding boundaries on a manifold with corners * `stop_increasing_at_step=100`: stop the initial increase of step size after these many steps * `stop_decreasing_at_step=`1000`: stop the decreasing search after these many steps @@ -151,22 +188,13 @@ end @doc "$_doc_linesearch_backtrack" function linesearch_backtrack!( - M::AbstractManifold, - q, - f::TF, - p, - s, - decrease, - contract, - η::T; - lf0 = f(M, p), - gradient = nothing, + M::AbstractManifold, q, f::TF, p, s, decrease, contract, η::T; + lf0::Real = f(M, p), gradient = nothing, Dlf0 = isnothing(gradient) ? 
nothing : real(inner(M, p, gradient, η)), retraction_method::AbstractRetractionMethod = default_retraction_method(M, typeof(p)), additional_increase_condition = (M, p) -> true, additional_decrease_condition = (M, p) -> true, - stop_when_stepsize_less = 0.0, - stop_when_stepsize_exceeds = max_stepsize(M, p) / norm(M, p, η), + stop_when_stepsize_less::Real = 0.0, stop_when_stepsize_exceeds::Real = max_stepsize(M, p) / norm(M, p, η), stop_increasing_at_step = 100, stop_decreasing_at_step = 1000, report_messages_in::NamedTuple = (;), @@ -182,14 +210,14 @@ function linesearch_backtrack!( while f_q < lf0 + decrease * s * Dlf0 || !additional_increase_condition(M, q) (stop_increasing_at_step == 0) && break i = i + 1 - s = s / contract + s = min(s / contract, stop_when_stepsize_exceeds) ManifoldsBase.retract_fused!(M, q, p, η, s, retraction_method) f_q = f(M, q) if i == stop_increasing_at_step set_message!(report_messages_in, :stop_increasing, at = i, bound = stop_increasing_at_step, value = s) break end - if s > stop_when_stepsize_exceeds + if s >= stop_when_stepsize_exceeds set_message!(report_messages_in, :stepsize_exceeds, at = i, bound = stop_when_stepsize_exceeds, value = s) break end diff --git a/src/plans/stepsize/stepsize.jl b/src/plans/stepsize/stepsize.jl index 4f573bf7b0..40ef290ec6 100644 --- a/src/plans/stepsize/stepsize.jl +++ b/src/plans/stepsize/stepsize.jl @@ -41,10 +41,10 @@ with the fields keyword arguments and the retraction is set to the default retra $(_kwargs(:retraction_method)) * `contraction_factor=0.95` * `sufficient_decrease=0.1` -* `last_stepsize=initialstepsize` +* `last_stepsize=initial_stepsize` * `initial_guess=`[`ArmijoInitialGuess`](@ref)`()` * `stop_when_stepsize_less=0.0`: stop when the stepsize decreased below this version. -* `stop_when_stepsize_exceeds=[`max_step`](@ref)`(M)`: provide an absolute maximal step size. +* `stop_when_stepsize_exceeds=`[`max_stepsize`](@ref)`(M)`: provide an absolute maximal step size. 
* `stop_increasing_at_step=100`: for the initial increase test, stop after these many steps * `stop_decreasing_at_step=1000`: in the backtrack, stop after these many steps """ @@ -64,80 +64,83 @@ mutable struct ArmijoLinesearchStepsize{TRM <: AbstractRetractionMethod, P, I, F additional_decrease_condition::DF additional_increase_condition::IF messages::MSGS + function ArmijoLinesearchStepsize(; + additional_decrease_condition::DF, additional_increase_condition::IF, + candidate_point::P, contraction_factor::F, initial_stepsize::F, last_stepsize::F, + initial_guess::IGF, retraction_method::TRM, + stop_when_stepsize_less::F, stop_when_stepsize_exceeds::F, sufficient_decrease::F, + stop_increasing_at_step::I, stop_decreasing_at_step::I, messages::MSGS + ) where {TRM <: AbstractRetractionMethod, P, I <: Integer, F <: Real, IGF, DF, IF, MSGS} + return new{TRM, P, I, F, IGF, DF, IF, MSGS}( + candidate_point, contraction_factor, initial_guess, initial_stepsize, + last_stepsize, retraction_method, sufficient_decrease, + stop_when_stepsize_less, stop_when_stepsize_exceeds, stop_increasing_at_step, stop_decreasing_at_step, + additional_decrease_condition, additional_increase_condition, messages, + ) + end function ArmijoLinesearchStepsize( M::AbstractManifold; - additional_decrease_condition::DF = (M, p) -> true, - additional_increase_condition::IF = (M, p) -> true, + additional_decrease_condition::DF = (M, p) -> true, additional_increase_condition::IF = (M, p) -> true, candidate_point::P = allocate_result(M, rand), - contraction_factor::F = 0.95, - initial_stepsize::F = 1.0, - initial_guess::IGF = ArmijoInitialGuess(), - retraction_method::TRM = default_retraction_method(M), - stop_when_stepsize_less::F = 0.0, - stop_when_stepsize_exceeds::Real = max_stepsize(M), - stop_increasing_at_step::I = 100, - stop_decreasing_at_step::I = 1000, - sufficient_decrease = 0.1, - ) where {TRM <: AbstractRetractionMethod, P, I, F <: Real, IGF, DF, IF} + contraction_factor::Real = 0.95, 
initial_stepsize::Real = 1.0, last_stepsize::Real = initial_stepsize, + initial_guess::IGF = ArmijoInitialGuess(), retraction_method::TRM = default_retraction_method(M), + stop_when_stepsize_less::Real = 0.0, stop_when_stepsize_exceeds::Real = max_stepsize(M), + stop_increasing_at_step::Integer = 100, stop_decreasing_at_step::Integer = 1000, + sufficient_decrease::Real = 0.1, + ) where {TRM <: AbstractRetractionMethod, P, IGF, DF, IF} + R = promote_type( + typeof(contraction_factor), typeof(initial_stepsize), typeof(last_stepsize), + typeof(stop_when_stepsize_exceeds), typeof(stop_when_stepsize_less), typeof(sufficient_decrease), + ) + cf = convert(R, contraction_factor); is = convert(R, initial_stepsize); ls = convert(R, last_stepsize) + swse = convert(R, stop_when_stepsize_exceeds); swsl = convert(R, stop_when_stepsize_less) + sd = convert(R, sufficient_decrease) + I = promote_type(typeof(stop_increasing_at_step), typeof(stop_decreasing_at_step)) + sias = convert(I, stop_increasing_at_step); sdas = convert(I, stop_decreasing_at_step) msgs = (; - non_descent_direction = StepsizeMessage{F, F}(), - stop_decreasing = StepsizeMessage{Int, F}(), - stop_increasing = StepsizeMessage{Int, F}(), - stepsize_less = StepsizeMessage{F, F}(), - stepsize_exceeds = StepsizeMessage{F, F}(), + non_descent_direction = StepsizeMessage{R, R}(), + stop_decreasing = StepsizeMessage{I, R}(), stop_increasing = StepsizeMessage{I, R}(), + stepsize_less = StepsizeMessage{R, R}(), stepsize_exceeds = StepsizeMessage{R, R}(), ) - return new{TRM, P, I, F, IGF, DF, IF, typeof(msgs)}( - candidate_point, - contraction_factor, - initial_guess, - initial_stepsize, - initial_stepsize, - retraction_method, - sufficient_decrease, - stop_when_stepsize_less, - stop_when_stepsize_exceeds, - stop_increasing_at_step, - stop_decreasing_at_step, - additional_decrease_condition, - additional_increase_condition, - msgs, + return ArmijoLinesearchStepsize(; + additional_decrease_condition = 
additional_decrease_condition, + additional_increase_condition = additional_increase_condition, + candidate_point = candidate_point, contraction_factor = cf, initial_stepsize = is, last_stepsize = ls, + initial_guess = initial_guess, retraction_method = retraction_method, + stop_when_stepsize_less = swsl, stop_when_stepsize_exceeds = swse, sufficient_decrease = sd, + stop_increasing_at_step = sias, stop_decreasing_at_step = sdas, messages = msgs ) end end function ArmijoLinesearchStepsize(M::AbstractManifold, p; kwargs...) return ArmijoLinesearchStepsize(M; candidate_point = allocate(p), kwargs...) end - function (a::ArmijoLinesearchStepsize)( - mp::AbstractManoptProblem, - s::AbstractManoptSolverState, - k::Int, - η = (-get_gradient(mp, get_iterate(s))); - gradient = nothing, - kwargs..., + mp::AbstractManoptProblem, s::AbstractManoptSolverState, k::Int, η = (-get_gradient(mp, get_iterate(s))); + gradient = nothing, kwargs..., ) p = get_iterate(s) grad = isnothing(gradient) ? get_gradient(mp, get_iterate(s)) : gradient - return a(mp, p, grad, η; initial_guess = a.initial_guess(mp, s, k, a.last_stepsize, η)) + return a(mp, p, grad, η; initial_guess = a.initial_guess(mp, s, k, a.last_stepsize, η), kwargs...) end function (a::ArmijoLinesearchStepsize)( - mp::AbstractManoptProblem, p, X, η; initial_guess = 1.0, kwargs... + mp::AbstractManoptProblem, p, X, η; initial_guess::Real = 1.0, + stop_when_stepsize_exceeds = nothing, kwargs... 
) reset_messages!(a.messages) l = norm(get_manifold(mp), p, η) + swse = if isnothing(stop_when_stepsize_exceeds) + (a.stop_when_stepsize_exceeds / l) + else + stop_when_stepsize_exceeds + end a.last_stepsize = linesearch_backtrack!( - get_manifold(mp), - a.candidate_point, - (M, p) -> get_cost_function(get_objective(mp))(M, p), - p, - initial_guess, - a.sufficient_decrease, - a.contraction_factor, - η; - gradient = X, - retraction_method = a.retraction_method, + get_manifold(mp), a.candidate_point, + (M, p) -> get_cost_function(get_objective(mp))(M, p), p, + initial_guess, a.sufficient_decrease, a.contraction_factor, η; + gradient = X, retraction_method = a.retraction_method, stop_when_stepsize_less = (a.stop_when_stepsize_less / l), - stop_when_stepsize_exceeds = (a.stop_when_stepsize_exceeds / l), + stop_when_stepsize_exceeds = swse, stop_increasing_at_step = a.stop_increasing_at_step, stop_decreasing_at_step = a.stop_decreasing_at_step, additional_decrease_condition = a.additional_decrease_condition, @@ -147,20 +150,31 @@ function (a::ArmijoLinesearchStepsize)( return a.last_stepsize end get_initial_stepsize(a::ArmijoLinesearchStepsize) = a.initial_stepsize -function show(io::IO, armijo_ls::ArmijoLinesearchStepsize) - return print( - io, - """ - ArmijoLinesearch(; - initial_stepsize=$(armijo_ls.initial_stepsize), - retraction_method=$(armijo_ls.retraction_method), - contraction_factor=$(armijo_ls.contraction_factor), - sufficient_decrease=$(armijo_ls.sufficient_decrease), - )""", - ) -end -function status_summary(armijo_ls::ArmijoLinesearchStepsize) - return "$(armijo_ls)\nand a computed last stepsize of $(armijo_ls.last_stepsize)" +function Base.show(io::IO, a_ls::ArmijoLinesearchStepsize) + print(io, "ArmijoLinesearch(; additional_decrease_condition = ", a_ls.additional_decrease_condition) + print(io, ", additional_increase_condition = ", a_ls.additional_increase_condition) + print(io, ", candidate_point = ", a_ls.candidate_point, ", contraction_factor = ", 
a_ls.contraction_factor) + print(io, ", initial_stepsize = ", a_ls.initial_stepsize, ", initial_guess = ", a_ls.initial_guess) + print(io, ", last_stepsize = ", a_ls.last_stepsize) + print(io, ", retraction_method = ", a_ls.retraction_method, ", stop_when_stepsize_less = ", a_ls.stop_when_stepsize_less) + print(io, ", stop_when_stepsize_exceeds = ", a_ls.stop_when_stepsize_exceeds, ", sufficient_decrease = ", a_ls.sufficient_decrease) + print(io, ", stop_increasing_at_step = ", a_ls.stop_increasing_at_step, ", stop_decreasing_at_step = ", a_ls.stop_decreasing_at_step) + return print(io, ", messages = ", a_ls.messages, ")") +end +function status_summary(a_ls::ArmijoLinesearchStepsize; context::Symbol = :default) + (context === :short) && return repr(a_ls) + (context === :inline) && return "An Armijo backtracking line search (last stepsize: $(a_ls.last_stepsize))" + return """ + Armijo backtracking line search + A line search based on sufficient decrease backtracking (last stepsize: $(a_ls.last_stepsize)) + + ## Parameters + * contraction_factor: $(_MANOPT_INDENT)$(a_ls.contraction_factor) + * initial guess: $(_MANOPT_INDENT)$(a_ls.initial_guess) + * initial stepsize: $(_MANOPT_INDENT)$(a_ls.initial_stepsize) + * retraction method: $(_MANOPT_INDENT)$(a_ls.retraction_method) + * sufficient decrease: $(_MANOPT_INDENT)$(a_ls.sufficient_decrease) + """ end function get_message(a::ArmijoLinesearchStepsize) s = [get_message(kv[1], kv[2]) for kv in pairs(a.messages)] @@ -268,45 +282,41 @@ mutable struct AdaptiveWNGradientStepsize{I <: Integer, R <: Real, F <: Function gradient_bound::R weight::R count::I + function AdaptiveWNGradientStepsize(; + count_threshold::I, minimal_bound::R, alternate_bound::F, gradient_reduction::R, + gradient_bound::R, weight::R, count::I + ) where {I <: Integer, R <: Real, F} + return new{I, R, F}( + count_threshold, minimal_bound, alternate_bound, gradient_reduction, gradient_bound, weight, count + ) + end end + function 
AdaptiveWNGradientStepsize( M::AbstractManifold; - p = rand(M), - X = zero_vector(M, p), - adaptive::Bool = true, + p = rand(M), X = zero_vector(M, p), adaptive::Bool = true, count_threshold::I = 4, - minimal_bound::R = 1.0e-4, - gradient_reduction::R = adaptive ? 0.9 : 0.0, - gradient_bound::R = norm(M, p, X), - alternate_bound::F = (bk, hat_c) -> min( + minimal_bound::Real = 1.0e-4, + gradient_reduction::Real = adaptive ? 0.9 : 0.0, + gradient_bound::Real = norm(M, p, X), + alternate_bound = (bk, hat_c) -> min( gradient_bound == 0 ? 1.0 : gradient_bound, max(minimal_bound, bk / (3 * hat_c)) - ), - kwargs..., - ) where {I <: Integer, R <: Real, F <: Function} - if gradient_bound == 0 - # If the gradient bound defaults to zero, set it to 1 - gradient_bound = 1.0 - end - return AdaptiveWNGradientStepsize{I, R, F}( - count_threshold, - minimal_bound, - alternate_bound, - gradient_reduction, - gradient_bound, - gradient_bound, - 0, + ), kwargs..., + ) where {I <: Integer} + R = promote_type(typeof(minimal_bound), typeof(gradient_reduction), typeof(gradient_bound)) + g = gradient_bound == 0 ? one(R) : convert(R, gradient_bound) + return AdaptiveWNGradientStepsize(; + count_threshold = count_threshold, count = zero(I), + minimal_bound = convert(R, minimal_bound), alternate_bound = alternate_bound, + gradient_reduction = convert(R, gradient_reduction), gradient_bound = g, weight = g, ) end function AdaptiveWNGradientStepsize(M::AbstractManifold, p; kwargs...) return AdaptiveWNGradientStepsize(M; p = p, kwargs...) end function (awng::AdaptiveWNGradientStepsize)( - mp::AbstractManoptProblem, - s::AbstractGradientSolverState, - i, - args...; - gradient = nothing, - kwargs..., + mp::AbstractManoptProblem, s::AbstractGradientSolverState, i, args...; + gradient = nothing, kwargs..., ) grad = isnothing(gradient) ? 
get_gradient(mp, get_iterate(s)) : gradient M = get_manifold(mp) @@ -340,19 +350,25 @@ function (awng::AdaptiveWNGradientStepsize)( end get_initial_stepsize(awng::AdaptiveWNGradientStepsize) = 1 / awng.gradient_bound get_last_stepsize(awng::AdaptiveWNGradientStepsize) = 1 / awng.gradient_bound -function show(io::IO, awng::AdaptiveWNGradientStepsize) - s = """ - AdaptiveWNGradient(; - count_threshold = $(awng.count_threshold), - minimal_bound = $(awng.minimal_bound), - alternate_bound = $(awng.alternate_bound), - gradient_reduction = $(awng.gradient_reduction), - gradient_bound = $(awng.gradient_bound) - ) - - as well as internally the weight ω_k = $(awng.weight) and current count c_k = $(awng.count). +function Base.show(io::IO, awng::AdaptiveWNGradientStepsize) + print(io, "AdaptiveWNGradientStepsize(; count_threshold = ", awng.count_threshold, ", count = ", awng.count) + print(io, ", minimal_bound = ", awng.minimal_bound, ", alternate_bound = ", awng.alternate_bound) + print(io, ", gradient_reduction = ", awng.gradient_reduction, ", gradient_bound = ", awng.gradient_bound) + print(io, ", weight = ", awng.weight) + return print(io, ")") +end +function status_summary(awng::AdaptiveWNGradientStepsize; context::Symbol = :default) + (context === :short) && return repr(awng) + (context === :inline) && return "An adaptive WN gradient step size" + return """ + An adaptive WN gradient step size + (last step size: $(1 / awng.gradient_bound)) + + ## Parameters + * count threshold: $(_MANOPT_INDENT)$(awng.count_threshold) + * minimal bound: $(_MANOPT_INDENT)$(awng.minimal_bound) + * gradient reduction: $(_MANOPT_INDENT)$(awng.gradient_reduction) """ - return print(io, s) end """ AdaptiveWNGradient(; kwargs...) 
@@ -463,9 +479,14 @@ function (cs::ConstantStepsize)( return s end get_initial_stepsize(s::ConstantStepsize) = s.length -function show(io::IO, cs::ConstantStepsize) +function Base.show(io::IO, cs::ConstantStepsize) return print(io, "ConstantLength($(cs.length); type=:$(cs.type))") end +function status_summary(s::ConstantStepsize; context::Symbol = :default) + (context === :short) && return repr(s) + r = (s.type === :absolute ? "absolute" : "relative") + return "A $r constant step size of length $(s.length)" +end """ ConstantLength(s; kwargs...) @@ -671,18 +692,20 @@ end Returns the extremal of the quadratic polynomial ``p`` with ``p'(a.t)=a.df``, ``p'(b.t)=b.df``. +The result is algebraically equivalent to `(a.t * b.df - b.t * a.df) / (b.df - a.df)`, +but the formula used here is numerically more stable. + # Input * `a::UnivariateTriple{R}`: triple of bracket value `a` * `b::UnivariateTriple{R}`: triple bracket value `b` """ function secant(a::UnivariateTriple{R}, b::UnivariateTriple{R}) where {R} - return (a.t * b.df - b.t * a.df) / (b.df - a.df) + return (a.t + b.t) / 2 + (b.t - a.t) * (a.df + b.df) / (2 * (a.df - b.df)) end """ cubic_stepsize_update_step(a::Real, b::Real, c::Real, τ::Real) - Step function to determine the stepsize update `c` described in [Hager:1989](@cite). @@ -706,6 +729,8 @@ function cubic_stepsize_update_step(a::Real, b::Real, c::Real, τ::Real) end """ + get_univariate_triple!(mp::AbstractManoptProblem, cbls::CubicBracketingLinesearchStepsize, p, η, t::Real) + Get the `UnivariateTriple` of the problem `mp` related to the step with stepsize ``t`` from ``p`` in direction ``η``. @@ -716,7 +741,7 @@ stepsize ``t`` from ``p`` in direction ``η``. 
* `η`: search direction at `p` * `t::Real`: step size """ -function get_univariate_triple!(mp::AbstractManoptProblem, cbls::CubicBracketingLinesearchStepsize, p, η, t) +function get_univariate_triple!(mp::AbstractManoptProblem, cbls::CubicBracketingLinesearchStepsize, p, η, t::Real) M = get_manifold(mp) cbls.last_stepsize = t ManifoldsBase.retract_fused!(M, cbls.candidate_point, p, η, t, cbls.retraction_method) @@ -740,7 +765,11 @@ function (cbls::CubicBracketingLinesearchStepsize)( check_curvature(c::UnivariateTriple) = abs(c.df) < cbls.sufficient_curvature * abs(init.df) n_iter = 0 - t = cbls.last_stepsize + max_step = cbls.max_stepsize + if :stop_when_stepsize_exceeds in keys(kwargs) + max_step = min(max_step, kwargs[:stop_when_stepsize_exceeds]) + end + t = min(cbls.last_stepsize, max_step) c_old = init c = get_univariate_triple!(mp, cbls, p, η, t) a, b = nothing, nothing @@ -755,9 +784,9 @@ function (cbls::CubicBracketingLinesearchStepsize)( (a, b) = c, c_old break end - (t == cbls.max_stepsize) && return t + (t == max_step) && return t t *= cbls.stepsize_increase - t = min(t, cbls.max_stepsize) + t = min(t, max_step) c_old = c c = get_univariate_triple!(mp, cbls, p, η, t) end @@ -798,20 +827,10 @@ function (cbls::CubicBracketingLinesearchStepsize)( end return t end -function show(io::IO, cbls::CubicBracketingLinesearchStepsize) +function Base.show(io::IO, cbls::CubicBracketingLinesearchStepsize) return print( io, - """ - CubicBracketingLinesearch(; - initial_stepsize = $(cbls.initial_stepsize), - stepsize_increase = $(cbls.stepsize_increase), - sufficient_curvature = $(cbls.sufficient_curvature), - min_bracket_width = $(cbls.min_bracket_width), - hybrid = $(cbls.hybrid), - retraction_method = $(cbls.retraction_method), - vector_transport_method = $(cbls.vector_transport_method), - max_stepsize = $(cbls.max_stepsize) - )""", + "CubicBracketingLinesearch(; initial_stepsize = $(cbls.initial_stepsize), stepsize_increase = $(cbls.stepsize_increase), 
sufficient_curvature = $(cbls.sufficient_curvature), min_bracket_width = $(cbls.min_bracket_width), hybrid = $(cbls.hybrid), retraction_method = $(cbls.retraction_method), vector_transport_method = $(cbls.vector_transport_method), max_stepsize = $(cbls.max_stepsize))", ) end function status_summary(cbls::CubicBracketingLinesearchStepsize) @@ -883,10 +902,10 @@ A functor `(problem, state, ...) -> s` to provide a constant step size `s`. * `type`: a symbol that indicates whether the stepsize is relatively (:relative), with respect to the gradient norm, or absolutely (:absolute) constant. -In total the complete formulae reads for the ``i``th iterate as +In total the complete formula reads for the ``k``th iterate as ```math -s_i = $(_tex(:frac, "(l - i a)f^i", "(i+s)^e")) +s_k = $(_tex(:frac, "(l - k a)f^k", "(k + s)^e")) ``` and hence the default simplifies to just ``s_i = \frac{l}{i}`` @@ -912,17 +931,23 @@ mutable struct DecreasingStepsize{R <: Real} <: Stepsize exponent::R shift::R type::Symbol + function DecreasingStepsize(; + length::R, factor::R, subtrahend::R, exponent::R, shift::R, type::Symbol + ) where {R} + return new{R}(length, factor, subtrahend, exponent, shift, type) + end end function DecreasingStepsize( M::AbstractManifold; - length::R = isinf(manifold_dimension(M)) ? 1.0 : manifold_dimension(M) / 2, - factor::R = 1.0, - subtrahend::R = 0.0, - exponent::R = 1.0, - shift::R = 0.0, + length::Real = isinf(manifold_dimension(M)) ? 
1.0 : manifold_dimension(M) / 2, + factor::Real = 1.0, subtrahend::Real = 0.0, exponent::Real = 1.0, shift::Real = 0.0, type::Symbol = :relative, - ) where {R} - return DecreasingStepsize(length, factor, subtrahend, exponent, shift, type) + ) + R = promote_type(typeof(length), typeof(factor), typeof(subtrahend), typeof(exponent), typeof(shift)) + l = convert(R, length); f = convert(R, factor); s = convert(R, subtrahend); e = convert(R, exponent); t = convert(R, shift) + return DecreasingStepsize(; + length = l, factor = f, subtrahend = s, exponent = e, shift = t, type = type + ) end function (s::DecreasingStepsize)( amp::P, ams::O, k::Int, args...; kwargs... @@ -937,11 +962,27 @@ function (s::DecreasingStepsize)( return ds end get_initial_stepsize(s::DecreasingStepsize) = s.length -function show(io::IO, s::DecreasingStepsize) - return print( - io, - "DecreasingLength(; length=$(s.length), factor=$(s.factor), subtrahend=$(s.subtrahend), shift=$(s.shift), type=$(s.type))", - ) +function Base.show(io::IO, s::DecreasingStepsize) + print(io, "DecreasingStepsize(; length = ", s.length, ", exponent = ", s.exponent, ", factor = ", s.factor) + return print(io, ", subtrahend = ", s.subtrahend, ", shift = ", s.shift, ", type = :$(s.type))") +end +function status_summary(s::DecreasingStepsize; context::Symbol = :default) + (context === :short) && return repr(s) + (context === :inline) && return "A decreasing stepsize (($(s.length) - k*$(s.subtrahend)) * $(s.factor)^k) / (k + $(s.shift))^$(s.exponent)" + return """ + A decreasing step size + For the `k`th iterate compute + + ((l - k*a)f^k) / (k + s)^e + + ## Parameters + * length l: $(_MANOPT_INDENT)$(s.length) + * subtrahend a: $(_MANOPT_INDENT)$(s.subtrahend) + * factor f: $(_MANOPT_INDENT)$(s.factor) + * shift s: $(_MANOPT_INDENT)$(s.shift) + * exponent e: $(_MANOPT_INDENT)$(s.exponent) + * type : $(_MANOPT_INDENT):$(s.type) + """ end """ DegreasingLength(; kwargs...) 
@@ -1004,26 +1045,26 @@ mutable struct DistanceOverGradientsStepsize{R <: Real, P} <: Stepsize use_curvature::Bool sectional_curvature_bound::R last_stepsize::R + function DistanceOverGradientsStepsize(; + initial_distance::R, max_distance::R, gradient_sum::R, initial_point::P, + use_curvature::Bool, sectional_curvature_bound::R, last_stepsize::R + ) where {R <: Real, P} + return new{R, P}( + initial_distance, max_distance, gradient_sum, initial_point, use_curvature, + sectional_curvature_bound, last_stepsize, + ) + end end - function DistanceOverGradientsStepsize( - M::AbstractManifold, - p; - initial_distance::R1 = 1.0e-3, - use_curvature::Bool = false, - sectional_curvature_bound::R2 = 0.0, + M::AbstractManifold, p; + initial_distance::R1 = 1.0e-3, use_curvature::Bool = false, sectional_curvature_bound::R2 = 0.0, ) where {R1 <: Real, R2 <: Real} R = promote_type(R1, R2) id = convert(R, initial_distance) κ = convert(R, sectional_curvature_bound) - return DistanceOverGradientsStepsize{R, typeof(p)}( - id, - id, # max_distance starts at initial_distance - zero(R), # gradient_sum starts at 0 - copy(M, p), # store initial point - use_curvature, - κ, - NaN, # last_stepsize + return DistanceOverGradientsStepsize(; + initial_distance = id, max_distance = id, gradient_sum = zero(R), initial_point = copy(M, p), + use_curvature = use_curvature, sectional_curvature_bound = κ, last_stepsize = zero(R) ) end @@ -1053,17 +1094,12 @@ function geometric_curvature_function(κ::Real, d::Real) end function (rdog::DistanceOverGradientsStepsize{R, P})( - mp::AbstractManoptProblem, - s::AbstractManoptSolverState, - i, - args...; - gradient = nothing, - kwargs..., + mp::AbstractManoptProblem, s::AbstractManoptSolverState, i, args...; + gradient = nothing, kwargs..., ) where {R, P} M = get_manifold(mp) p = get_iterate(s) grad = isnothing(gradient) ? 
get_gradient(mp, p) : gradient - # Compute gradient norm grad_norm_sq = clamp(norm(M, p, grad)^2, eps(R), typemax(R)) if i == 0 @@ -1099,7 +1135,6 @@ function (rdog::DistanceOverGradientsStepsize{R, P})( stepsize = rdog.max_distance / sqrt(rdog.gradient_sum) end end - rdog.last_stepsize = stepsize return stepsize end @@ -1107,22 +1142,28 @@ end get_initial_stepsize(rdog::DistanceOverGradientsStepsize) = rdog.last_stepsize get_last_stepsize(rdog::DistanceOverGradientsStepsize) = rdog.last_stepsize -function show(io::IO, rdog::DistanceOverGradientsStepsize) - s = """ - DistanceOverGradients(; - initial_distance = $(rdog.initial_distance), - use_curvature = $(rdog.use_curvature), - sectional_curvature_bound = $(rdog.sectional_curvature_bound) - ) - - Current state: - max_distance = $(rdog.max_distance) - gradient_sum = $(rdog.gradient_sum) - last_stepsize = $(rdog.last_stepsize) +function Base.show(io::IO, rdog::DistanceOverGradientsStepsize) + print(io, "DistanceOverGradientsStepsize(; initial_distance = ", rdog.initial_distance) + print(io, ", use_curvature = ", rdog.use_curvature, ", sectional_curvature_bound = ", rdog.sectional_curvature_bound) + print(io, ", max_distance = ", rdog.max_distance, ", gradient_sum = ", rdog.gradient_sum) + print(io, ", initial_point = ", rdog.initial_point, ", last_stepsize = ", rdog.last_stepsize) + return print(io, ")") +end +function status_summary(rdog::DistanceOverGradientsStepsize; context::Symbol = :default) + (context === :short) && return repr(rdog) + s = rdog.use_curvature ? " including a curvature correction" : "" + (context === :inline) && return "A distance over gradients step size$s (last stepsize: $(rdog.last_stepsize))" + s2 = !rdog.use_curvature ? 
"" : "\n* sectional curvature bound: $(_MANOPT_INDENT)$(rdog.sectional_curvature_bound)" + return """ + A distance over gradients step size + (last stepsize: $(rdog.last_stepsize)) + + ## Parameters + * use curvature correction: $(_MANOPT_INDENT)$(rdog.use_curvature)$(s2) + * sum of gradients: $(_MANOPT_INDENT)$(rdog.gradient_sum) + * maximal distance r_t: $(_MANOPT_INDENT)$(rdog.max_distance) """ - return print(io, s) end - doc_DoG_main = raw""" DistanceOverGradients(; kwargs...) DistanceOverGradients(M::AbstractManifold; kwargs...) @@ -1268,7 +1309,7 @@ mutable struct NonmonotoneLinesearchStepsize{ retraction_method::TRM = default_retraction_method(M), stepsize_reduction::R = 0.5, stop_when_stepsize_less::R = 0.0, - stop_when_stepsize_exceeds = real(max_stepsize(M)), + stop_when_stepsize_exceeds::R = real(max_stepsize(M)), stop_increasing_at_step::I = 100, stop_decreasing_at_step::I = 1000, storage::Union{Nothing, StoreStateAction} = StoreStateAction( @@ -1362,7 +1403,8 @@ function (a::NonmonotoneLinesearchStepsize)( η, p_old, X_old, - k, + k; + kwargs..., ) end function (a::NonmonotoneLinesearchStepsize)( @@ -1428,6 +1470,13 @@ function (a::NonmonotoneLinesearchStepsize)( end #compute the new step size with the help of the Barzilai-Borwein step size + l = norm(M, p, η) + local swse # COV_EXCL_LINE + if :stop_when_stepsize_exceeds in keys(kwargs) + swse = kwargs[:stop_when_stepsize_exceeds] + else + swse = (a.stop_when_stepsize_exceeds / l) + end a.last_stepsize = linesearch_backtrack!( M, a.candidate_point, @@ -1437,32 +1486,21 @@ function (a::NonmonotoneLinesearchStepsize)( a.sufficient_decrease, a.stepsize_reduction, η; - lf0 = maximum([a.old_costs[j] for j in 1:min(iter, memory_size)]), + lf0 = maximum(view(a.old_costs, 1:min(iter, memory_size))), gradient = X, retraction_method = a.retraction_method, - stop_when_stepsize_less = (a.stop_when_stepsize_less / norm(M, p, η)), - stop_when_stepsize_exceeds = (a.stop_when_stepsize_exceeds / norm(M, p, η)), + 
stop_when_stepsize_less = (a.stop_when_stepsize_less / l), + stop_when_stepsize_exceeds = swse, stop_increasing_at_step = a.stop_increasing_at_step, stop_decreasing_at_step = a.stop_decreasing_at_step, report_messages_in = a.messages, ) return a.last_stepsize end -function show(io::IO, a::NonmonotoneLinesearchStepsize) +function Base.show(io::IO, a::NonmonotoneLinesearchStepsize) return print( io, - """ - NonmonotoneLinesearch(; - last_stepsize = $(a.last_stepsize), - bb_max_stepsize = $(a.bb_max_stepsize), - bb_min_stepsize = $(a.bb_min_stepsize), - memory_size = $(length(a.old_costs)), - stepsize_reduction = $(a.stepsize_reduction), - strategy = :$(a.strategy), - sufficient_decrease = $(a.sufficient_decrease), - retraction_method = $(a.retraction_method), - vector_transport_method = $(a.vector_transport_method) - )""", + "NonmonotoneLinesearch(; last_stepsize = $(a.last_stepsize), bb_max_stepsize = $(a.bb_max_stepsize), bb_min_stepsize = $(a.bb_min_stepsize), memory_size = $(length(a.old_costs)), stepsize_reduction = $(a.stepsize_reduction), strategy = :$(a.strategy), sufficient_decrease = $(a.sufficient_decrease), retraction_method = $(a.retraction_method), vector_transport_method = $(a.vector_transport_method))", ) end function get_message(a::NonmonotoneLinesearchStepsize) @@ -1563,10 +1601,7 @@ A functor `(problem, state, ...) -> s` to provide a step size due to Polyak, cf. # Constructor - PolyakStepsize(; - γ = i -> 1/i, - initial_cost_estimate=0.0 - ) + PolyakStepsize(; γ = i -> 1/i, initial_cost_estimate=0.0) Construct a stepsize of Polyak type. 
@@ -1577,8 +1612,8 @@ mutable struct PolyakStepsize{F, R} <: Stepsize γ::F best_cost_value::R end -function PolyakStepsize(; γ::F = (i) -> 1 / i, initial_cost_estimate::R = 0.0) where {F, R} - return PolyakStepsize{F, R}(γ, initial_cost_estimate) +function PolyakStepsize(; γ = (i) -> 1 / i, initial_cost_estimate = 0.0) + return PolyakStepsize(γ, initial_cost_estimate) end function (ps::PolyakStepsize)( amp::AbstractManoptProblem, ams::AbstractManoptSolverState, k::Int, args...; kwargs... @@ -1592,15 +1627,12 @@ function (ps::PolyakStepsize)( α = (c - ps.best_cost_value + ps.γ(k)) / (norm(M, p, X)^2) return α end -function show(io::IO, ps::PolyakStepsize) - return print( - io, - """ - Polyak() - A stepsize with keyword parameters - * initial_cost_estimate = $(ps.best_cost_value) - """, - ) +function Base.show(io::IO, ps::PolyakStepsize) + return print(io, "Polyak(; γ = $(ps.γ))") +end +function status_summary(ps::PolyakStepsize; context::Symbol = :default) + (context === :short) && return repr(ps) + return "Polyak step size with γ = $(ps.γ) and current best minimum estimate $(ps.best_cost_value)" end """ Polyak(; kwargs...) @@ -1631,9 +1663,7 @@ initialize the Polyak stepsize to a certain sequence and an initial estimate of $(_note(:ManifoldDefaultFactory, "PolyakStepsize")) """ function Polyak(args...; kwargs...) - return ManifoldDefaultsFactory( - Manopt.PolyakStepsize, args...; requires_manifold = false, kwargs... - ) + return ManifoldDefaultsFactory(Manopt.PolyakStepsize, args...; requires_manifold = false, kwargs...) 
end @doc """ @@ -1686,42 +1716,54 @@ mutable struct WolfePowellLinesearchStepsize{ retraction_method::TRM stop_when_stepsize_less::R vector_transport_method::VTM - stop_increasing_at_step::Int - stop_decreasing_at_step::Int + stop_increasing_at_step::I + stop_decreasing_at_step::I messages::TMSG + function WolfePowellLinesearchStepsize(; + sufficient_decrease::R, sufficient_curvature::R, candidate_direction::T, candidate_point::P, + last_stepsize::R, max_stepsize::R, retraction_method::TRM, stop_when_stepsize_less::R, + vector_transport_method::VTM, stop_increasing_at_step::I, stop_decreasing_at_step::I, + messages::TMSG + ) where {R <: Real, TRM <: AbstractRetractionMethod, VTM <: AbstractVectorTransportMethod, P, T, I <: Integer, TMSG} + return new{R, TRM, VTM, P, T, I, TMSG}( + sufficient_decrease, sufficient_curvature, + candidate_direction, candidate_point, last_stepsize, max_stepsize, retraction_method, + stop_when_stepsize_less, vector_transport_method, stop_increasing_at_step, stop_decreasing_at_step, messages + ) + end function WolfePowellLinesearchStepsize( M::AbstractManifold; p::P = allocate_result(M, rand), X::T = zero_vector(M, p), max_stepsize::Real = max_stepsize(M), retraction_method::TRM = default_retraction_method(M), - sufficient_decrease::R = 1.0e-4, - sufficient_curvature::R = 0.999, + sufficient_decrease::Real = 1.0e-4, + sufficient_curvature::Real = 0.999, vector_transport_method::VTM = default_vector_transport_method(M), - stop_when_stepsize_less::R = 0.0, - stop_increasing_at_step::I = 100, - stop_decreasing_at_step::I = 1000, - ) where {TRM, VTM, P, T, R, I} + stop_when_stepsize_less::Real = 0.0, + stop_increasing_at_step::Integer = 100, + stop_decreasing_at_step::Integer = 1000, + ) where {TRM, VTM, P, T} + R = promote_type( + typeof(max_stepsize), typeof(sufficient_curvature), typeof(sufficient_decrease), + typeof(stop_when_stepsize_less), + ) + I = promote_type(typeof(stop_decreasing_at_step), typeof(stop_increasing_at_step)) msgs = (; 
non_descent_direction = StepsizeMessage{R, R}(), - stop_decreasing = StepsizeMessage{Int, R}(), - stop_increasing = StepsizeMessage{Int, R}(), + stop_decreasing = StepsizeMessage{I, R}(), + stop_increasing = StepsizeMessage{I, R}(), stepsize_less = StepsizeMessage{R, R}(), stepsize_exceeds = StepsizeMessage{R, R}(), ) - return new{R, TRM, VTM, P, T, I, typeof(msgs)}( - sufficient_decrease, - sufficient_curvature, - X, - p, - 0.0, - max_stepsize, - retraction_method, - stop_when_stepsize_less, - vector_transport_method, - stop_increasing_at_step, - stop_decreasing_at_step, - msgs, + return WolfePowellLinesearchStepsize(; + sufficient_decrease = convert(R, sufficient_decrease), sufficient_curvature = convert(R, sufficient_curvature), + candidate_direction = X, candidate_point = p, last_stepsize = convert(R, 0.0), + max_stepsize = convert(R, max_stepsize), retraction_method = retraction_method, + stop_when_stepsize_less = convert(R, stop_when_stepsize_less), + vector_transport_method = vector_transport_method, + stop_increasing_at_step = convert(I, stop_increasing_at_step), stop_decreasing_at_step = convert(I, stop_decreasing_at_step), + messages = msgs ) end end @@ -1733,10 +1775,7 @@ function WolfePowellLinesearchStepsize(M::AbstractManifold, p; kwargs...) 
) end function (a::WolfePowellLinesearchStepsize)( - mp::AbstractManoptProblem, - ams::AbstractManoptSolverState, - k::Int, - η = (-get_gradient(mp, get_iterate(ams))); + mp::AbstractManoptProblem, ams::AbstractManoptSolverState, k::Int, η = (-get_gradient(mp, get_iterate(ams))); kwargs..., ) # For readability extract a few variables @@ -1747,7 +1786,11 @@ function (a::WolfePowellLinesearchStepsize)( max_step_increase = ifelse( isfinite(a.max_stepsize), min(1.0e9, a.max_stepsize / grad_norm), 1.0e9 ) + if :stop_when_stepsize_exceeds in keys(kwargs) + max_step_increase = min(max_step_increase, kwargs[:stop_when_stepsize_exceeds]) + end step = ifelse(isfinite(a.max_stepsize), min(1.0, a.max_stepsize / grad_norm), 1.0) + step = min(step, max_step_increase) s_plus = step s_minus = step # clear messages @@ -1774,14 +1817,14 @@ function (a::WolfePowellLinesearchStepsize)( break end end - s_plus = 2.0 * s_minus + s_plus = min(2.0 * s_minus, max_step_increase) else vector_transport_to!(M, a.candidate_direction, p, η, a.candidate_point, a.vector_transport_method) if get_differential(mp, a.candidate_point, a.candidate_direction; Y = Y) < a.sufficient_curvature * l i = 0 while fNew <= f0 + a.sufficient_decrease * step * l && (s_plus < max_step_increase) # increase - s_plus = s_plus * 2.0 + s_plus = min(s_plus * 2.0, max_step_increase) step = s_plus ManifoldsBase.retract_fused!(M, a.candidate_point, p, η, step, a.retraction_method) fNew = get_cost(mp, a.candidate_point) @@ -1816,24 +1859,29 @@ function (a::WolfePowellLinesearchStepsize)( a.last_stepsize = step return step end -function show(io::IO, a::WolfePowellLinesearchStepsize) - return print( - io, - """ - WolfePowellLinesearch(; - sufficient_decrease = $(a.sufficient_decrease), - sufficient_curvature = $(a.sufficient_curvature), - retraction_method = $(a.retraction_method), - vector_transport_method = $(a.vector_transport_method), - stop_when_stepsize_less = $(a.stop_when_stepsize_less), - stop_increasing_at_step = 
$(a.stop_increasing_at_step), - stop_decreasing_at_step = $(a.stop_decreasing_at_step), - )""", - ) -end -function status_summary(a::WolfePowellLinesearchStepsize) - s = (a.last_stepsize > 0) ? "\nand the last stepsize used was $(a.last_stepsize)." : "" - return "$a$s" +function Base.show(io::IO, a::WolfePowellLinesearchStepsize) + print(io, "WolfePowellLinesearchStepsize(; sufficient_decrease = ", a.sufficient_decrease) + print(io, ", sufficient_curvature = ", a.sufficient_curvature, ", candidate_direction = ", a.candidate_direction, ", candidate_point = ", a.candidate_point) + print(io, ", last_stepsize = ", a.last_stepsize, ", max_stepsize = ", a.max_stepsize) + print(io, ", retraction_method = ", a.retraction_method, ", stop_when_stepsize_less = ", a.stop_when_stepsize_less) + print(io, ", vector_transport_method = ", a.vector_transport_method) + print(io, ", stop_increasing_at_step = ", a.stop_increasing_at_step, ", stop_decreasing_at_step = ", a.stop_decreasing_at_step) + return print(io, ", messages = ", a.messages, ")") +end +function status_summary(a::WolfePowellLinesearchStepsize; context::Symbol = :default) + (context === :short) && return repr(a) + (context === :inline) && return "A Wolfe Powell step size (last stepsize: $(a.last_stepsize))" + return """ + A Wolfe Powell line search based step size + (last stepsize: $(a.last_stepsize)) + + ## Parameters + * maximal step size: $(_MANOPT_INDENT)$(a.max_stepsize) + * retraction method: $(_MANOPT_INDENT)$(a.retraction_method) + * vector transport method: $(_MANOPT_INDENT)$(a.vector_transport_method) + * sufficient decrease: $(_MANOPT_INDENT)$(a.sufficient_decrease) + * sufficient curvature: $(_MANOPT_INDENT)$(a.sufficient_curvature) + """ end function get_message(a::WolfePowellLinesearchStepsize) s = [get_message(kv[1], kv[2]) for kv in pairs(a.messages)] @@ -1916,22 +1964,20 @@ mutable struct WolfePowellBinaryLinesearchStepsize{ sufficient_curvature::F last_stepsize::F stop_when_stepsize_less::F - function
WolfePowellBinaryLinesearchStepsize( - M::AbstractManifold = DefaultManifold(); - sufficient_decrease::F = 10^(-4), - sufficient_curvature::F = 0.999, + M::AbstractManifold; + sufficient_decrease::Real = 10^(-4), + sufficient_curvature::Real = 0.999, retraction_method::RTM = default_retraction_method(M), vector_transport_method::VTM = default_vector_transport_method(M), - stop_when_stepsize_less::F = 0.0, - ) where {VTM <: AbstractVectorTransportMethod, RTM <: AbstractRetractionMethod, F} + stop_when_stepsize_less::Real = 0.0, + last_stepsize::Real = 0.0, + ) where {VTM <: AbstractVectorTransportMethod, RTM <: AbstractRetractionMethod} + F = promote_type(typeof(sufficient_decrease), typeof(sufficient_curvature), typeof(stop_when_stepsize_less), typeof(last_stepsize)) return new{RTM, VTM, F}( - retraction_method, - vector_transport_method, - sufficient_decrease, - sufficient_curvature, - 0.0, - stop_when_stepsize_less, + retraction_method, vector_transport_method, + convert(F, sufficient_decrease), convert(F, sufficient_curvature), + convert(F, last_stepsize), convert(F, stop_when_stepsize_less), ) end end @@ -1977,22 +2023,27 @@ function (a::WolfePowellBinaryLinesearchStepsize)( a.last_stepsize = t return t end -function show(io::IO, a::WolfePowellBinaryLinesearchStepsize) - return print( - io, - """ - WolfePowellBinaryLinesearch(; - sufficient_decrease = $(a.sufficient_decrease), - sufficient_curvature = $(a.sufficient_curvature), - retraction_method = $(a.retraction_method), - vector_transport_method = $(a.vector_transport_method), - stop_when_stepsize_less = $(a.stop_when_stepsize_less), - )""", - ) -end -function status_summary(a::WolfePowellBinaryLinesearchStepsize) - s = (a.last_stepsize > 0) ? "\nand the last stepsize used was $(a.last_stepsize)." 
: "" - return "$a$s" +function Base.show(io::IO, a::WolfePowellBinaryLinesearchStepsize) + print(io, "WolfePowellBinaryLinesearchStepsize(; sufficient_decrease = ", a.sufficient_decrease) + print(io, ", sufficient_curvature = ", a.sufficient_curvature) + print(io, ", last_stepsize = ", a.last_stepsize) + print(io, ", retraction_method = ", a.retraction_method, ", stop_when_stepsize_less = ", a.stop_when_stepsize_less) + print(io, ", vector_transport_method = ", a.vector_transport_method) + return print(io, ")") +end +function status_summary(a::WolfePowellBinaryLinesearchStepsize; context::Symbol = :default) + (context === :short) && return repr(a) + (context === :inline) && return "A Wolfe Powell bisection step size (last stepsize: $(a.last_stepsize))" + return """ + A Wolfe Powell bisection line search based step size + (last stepsize: $(a.last_stepsize)) + + ## Parameters + * retraction method: $(_MANOPT_INDENT)$(a.retraction_method) + * vector transport method: $(_MANOPT_INDENT)$(a.vector_transport_method) + * sufficient decrease: $(_MANOPT_INDENT)$(a.sufficient_decrease) + * sufficient curvature: $(_MANOPT_INDENT)$(a.sufficient_curvature) + """ end _doc_WPBL_algorithm = """With @@ -2134,3 +2185,632 @@ end function get_last_stepsize(step::WolfePowellBinaryLinesearchStepsize, ::Any...) return step.last_stepsize end + + +#### Hager-Zhang Linesearch + + +@doc """ + HagerZhangLinesearchStepsize{P,T,R<:Real} <: Linesearch + +Do a bracketing line search to find a step size ``α`` that finds a +local minimum along the search direction ``X`` starting from ``p``, +utilizing cubic polynomial interpolation using the method described in +[HagerZhang:2006:2](@cite). The function [`secant`](@ref) is used to find the minimum of the +cubic polynomial fitted to values of the cost function and its derivative at the endpoints +of the current interval. +See [`HagerZhangLinesearch`](@ref) for the mathematical details.
+ +# Fields + +$(_fields(:p; name = "candidate_point")) + as temporary storage for candidates +* `initial_stepsize::R`: the step size to start the search with +$(_fields(:retraction_method)) +$(_fields(:vector_transport_method)) +* `initial_guess`: see keyword arguments of [`HagerZhangLinesearch`](@ref) for details. +* `stepsize_limit`: see keyword arguments of [`HagerZhangLinesearch`](@ref) for details. +* `max_bracket_iterations`: see keyword arguments of [`HagerZhangLinesearch`](@ref) for details. +* `start_enforcing_wolfe_conditions_at_bracketing_iteration`: see keyword arguments of + [`HagerZhangLinesearch`](@ref) for details. +* `allow_early_maxstep_termination`: see keyword arguments of [`HagerZhangLinesearch`](@ref) for details. +* `wolfe_condition_mode`: see keyword arguments of [`HagerZhangLinesearch`](@ref) for details. +* `ϵ`, `δ`, `σ`, `ω`, `θ`, `γ`, `ρ`, `Δ`: see keyword arguments of [`HagerZhangLinesearch`](@ref) for details. +* `secant_acceptance_ratio`: see keyword arguments of [`HagerZhangLinesearch`](@ref) for details. +* `candidate_direction`, `temporary_tangent`: as temporary storage for tangent vectors +* `triples`: temporary storage for function and derivative evaluations +* `last_evaluation_index`: to keep track of the number of evaluations performed so far; + points at the last filled entry of `triples`. +* `Qₖ`, `Cₖ`: to keep track of the parameters of the Wolfe condition when in adaptive mode +* `current_mode`: to keep track of the current Wolfe condition mode when in adaptive mode +* `last_stepsize`: last stepsize computed since reset +* `last_cost`: last cost value computed since reset +* `ϵₖ`: the current ϵ parameter used in the approximate Wolfe condition and bracketing + +# Constructor + + HagerZhangLinesearchStepsize(M::AbstractManifold; kwargs...) 
+""" +mutable struct HagerZhangLinesearchStepsize{ + TF <: Real, + TIG <: AbstractInitialLinesearchGuess, + TRM <: AbstractRetractionMethod, + TVTM <: AbstractVectorTransportMethod, + TP, + TX, + } <: Linesearch + # parameters + initial_guess::TIG + retraction_method::TRM + vector_transport_method::TVTM + stepsize_limit::TF + max_bracket_iterations::Int + start_enforcing_wolfe_conditions_at_bracketing_iteration::Int + allow_early_maxstep_termination::Bool + wolfe_condition_mode::Symbol # :standard, :approximate, :adaptive + ϵ::TF # approximate Wolfe termination parameter + δ::TF # used in approximate Wolfe condition + σ::TF # used in curvature condition + ω::TF + θ::TF # update rule parameter + γ::TF + ρ::TF + Δ::TF + secant_acceptance_ratio::TF + # storage for candidates + candidate_point::TP + candidate_direction::TX + temporary_tangent::TX + # storage for function evaluations + triples::Vector{UnivariateTriple{TF}} + last_evaluation_index::Int + # storage to be kept between outer solver iterations + Qₖ::TF + Cₖ::TF + current_mode::Symbol + # other storage + last_stepsize::TF + last_cost::TF + ϵₖ::TF + function HagerZhangLinesearchStepsize( + M::AbstractManifold; + initial_guess::TIG = HagerZhangInitialGuess(), + retraction_method::TRM = default_retraction_method(M), + vector_transport_method::TVTM = default_vector_transport_method(M), + initial_last_stepsize::TF = NaN, + initial_last_cost::TF = NaN, + stepsize_limit::TF = Inf, + candidate_point = allocate_result(M, rand), + candidate_direction = zero_vector(M, candidate_point), + max_bracket_iterations::Int = 10, + start_enforcing_wolfe_conditions_at_bracketing_iteration::Int = initial_guess isa ConstantStepsize ? 
2 : 1, + max_function_evaluations::Int = 20, + wolfe_condition_mode::Symbol = :adaptive, + allow_early_maxstep_termination::Bool = true, + ϵ::TF = 1.0e-6, + δ::TF = 0.1, + σ::TF = 0.9, + ω::TF = 1.0e-3, + θ::TF = 0.5, + γ::TF = 0.66, + ρ::TF = 5.0, + Δ::TF = 0.7, + secant_acceptance_ratio::TF = 1.0e-8, + ) where { + TIG <: AbstractInitialLinesearchGuess, TRM <: AbstractRetractionMethod, + TVTM <: AbstractVectorTransportMethod, TF <: Real, + } + + # check parameters + @assert δ > 0 && δ < 0.5 + @assert δ <= σ + @assert σ < 1 + @assert ϵ >= 0 + @assert ω >= 0 && ω <= 1 + @assert Δ >= 0 && Δ <= 1 + @assert θ > 0 && θ < 1 + @assert γ > 0 && γ < 1 + @assert ρ > 1 + @assert stepsize_limit > 0 + @assert wolfe_condition_mode in (:standard, :approximate, :adaptive) + @assert secant_acceptance_ratio >= 0 + + # allocate storage + triples = Vector{UnivariateTriple{TF}}(undef, max_function_evaluations) + + initial_wolfe_mode = wolfe_condition_mode == :adaptive ? :standard : wolfe_condition_mode + + return new{TF, TIG, TRM, TVTM, typeof(candidate_point), typeof(candidate_direction)}( + initial_guess, retraction_method, vector_transport_method, stepsize_limit, + max_bracket_iterations, start_enforcing_wolfe_conditions_at_bracketing_iteration, + allow_early_maxstep_termination, wolfe_condition_mode, + ϵ, δ, σ, ω, θ, γ, ρ, Δ, secant_acceptance_ratio, + candidate_point, candidate_direction, zero_vector(M, candidate_point), + triples, 0, + 0.0, 0.0, # Qₖ, Cₖ + initial_wolfe_mode, + initial_last_stepsize, initial_last_cost, ϵ, + ) + end +end + +function initialize_stepsize!(hzls::HagerZhangLinesearchStepsize) + hzls.Qₖ = 0.0 + hzls.Cₖ = 0.0 + hzls.last_stepsize = NaN + hzls.last_cost = NaN + hzls.ϵₖ = hzls.ϵ + hzls.current_mode = hzls.wolfe_condition_mode + if hzls.current_mode === :adaptive + hzls.current_mode = :standard + end + hzls.last_evaluation_index = 0 + return hzls +end + +""" + _hz_evaluate_next_step( + hzls::HagerZhangLinesearchStepsize, M::AbstractManifold, + 
mp::AbstractManoptProblem, p, η, α::Real + ) + +Evaluate and store the next trial step for the Hager-Zhang linesearch. + +Given the current iterate `p`, search direction `η` (in the tangent space at `p`), and a +candidate step size `α`, this function + +1. Retracts from `p` along `η` by step `α` into `hzls.candidate_point` (using + `hzls.retraction_method`), +2. Vector-transports `η` to the candidate point into `hzls.candidate_direction` (using + `hzls.vector_transport_method`), +3. Evaluates the objective and directional derivative via + `get_cost_and_differential(mp, hzls.candidate_point, hzls.candidate_direction)`, +4. Stores the resulting triple `(α, f, df)` in `hzls.triples` and increments + `hzls.last_evaluation_index`. + +This helper is side-effecting by design; it mutates `hzls`' internal storage. + +# Return value + +By default return a tuple with three values: +- the index `i_k::Int` at which the new evaluation was stored. +- `evaluation_limit_termination`: `true` iff the maximum number of stored evaluations + has been reached. +- `wolfe_termination` is `true` iff the (standard or approximate) Wolfe conditions are + satisfied for the current candidate, according to `hzls.current_mode`. + +# Errors + +Throws an error if called more often than the maximum number of allocated function +evaluations (i.e. if `hzls.triples` would overflow). 
+""" +function _hz_evaluate_next_step( + hzls::HagerZhangLinesearchStepsize, + M::AbstractManifold, + mp::AbstractManoptProblem, + p, + η, + α::Real + ) + triples = hzls.triples + max_evals = length(triples) + if hzls.last_evaluation_index + 1 > max_evals + # this should never happen if the calling code is correct + error("Hager-Zhang linesearch exceeded maximum number of function evaluations $(length(hzls.triples)).") + end + ManifoldsBase.retract_fused!(M, hzls.candidate_point, p, η, α, hzls.retraction_method) + vector_transport_to!( + M, hzls.candidate_direction, p, η, hzls.candidate_point, hzls.vector_transport_method + ) + f, df = get_cost_and_differential(mp, hzls.candidate_point, hzls.candidate_direction; Y = hzls.temporary_tangent) + hzls.last_evaluation_index += 1 + triples[hzls.last_evaluation_index] = UnivariateTriple(α, f, df) + + wolfe_termination = false + evaluation_limit_termination = hzls.last_evaluation_index == max_evals + i_k = hzls.last_evaluation_index + if hzls.current_mode === :standard + # Eq (22) in HagerZhang:2006:2 + # equivalent to the (T1) condition + wolfe_termination = (α * hzls.δ * triples[1].df >= (triples[i_k].f - triples[1].f)) && + (triples[i_k].df >= hzls.σ * triples[1].df) + elseif hzls.current_mode === :approximate + # Eq (23) in HagerZhang:2006:2 + additional criterion in the (T2) condition + wolfe_termination = ((2 * hzls.δ - 1) * triples[1].df >= triples[i_k].df) && + (triples[i_k].df >= hzls.σ * triples[1].df) && triples[i_k].f <= triples[1].f + hzls.ϵₖ + else + error("Unknown Wolfe condition mode $(hzls.current_mode).") + end + + return hzls.last_evaluation_index, evaluation_limit_termination, wolfe_termination +end + +""" + _hz_bracket(hzls::HagerZhangLinesearchStepsize, c::Real, max_alpha::Real) + +Perform the bracketing phase of the Hager-Zhang linesearch starting from an initial +stepsize `c` and not exceeding `max_alpha`. 
+ +Returns a tuple `(i_a, i_b, f_eval, f_wolfe, f_early_maxstep)` where `i_a` and `i_b` are +the indices in the stored function evaluations such that the minimum is bracketed between +`triples[i_a].t` and `triples[i_b].t`. `f_eval` is `true` if the maximum number of function +evaluations has been reached during the bracketing phase. `f_wolfe` is `true` if the Wolfe +conditions have been satisfied. `f_early_maxstep` is `true` if the maximum stepsize was +reached early with negative slope and an improvement over the initial point. +""" +function _hz_bracket( + hzls::HagerZhangLinesearchStepsize, M::AbstractManifold, + mp::AbstractManoptProblem, p, η, c::Real, max_alpha::Real + ) + # B0 + current_step = c + local c_index, f_eval, f_wolfe # COV_EXCL_LINE + ls_early_exit = false + for j in 1:hzls.max_bracket_iterations + c_index, f_eval, f_wolfe = _hz_evaluate_next_step(hzls, M, mp, p, η, current_step) + if f_eval || (f_wolfe && j >= hzls.start_enforcing_wolfe_conditions_at_bracketing_iteration) + break + end + if hzls.triples[c_index].df >= 0 + # B1 -- detecting a positive slope + # handled after the loop + break + else + if hzls.triples[c_index].f > hzls.triples[1].f + hzls.ϵₖ + # B2 -- function value gets sufficiently larger than at 0 + # perform main bracketing loop (we can skip U0-U2 checks here) + (i_a_bar, i_b_bar, f_eval, f_wolfe) = _hz_u3(hzls, M, mp, p, η, 1, c_index) + return (i_a_bar, i_b_bar, f_eval, f_wolfe, false) + else + if current_step == max_alpha + # we've reached maximum alpha so we can't expand anymore + # we handle this case after the loop + ls_early_exit = hzls.allow_early_maxstep_termination + break + end + # B3 -- widen the bracket + current_step *= hzls.ρ + if current_step > max_alpha + current_step = max_alpha + end + end + end + end + # we detected positive slope, ran out of iterations or reached max stepsize + # B1 seems to be the best choice for all three cases + + if ls_early_exit + # additional termination condition: we reached the 
maximum stepsize with negative + # slope and an improvement over the initial point, so we can exit early with this step + return (1, c_index, f_eval, f_wolfe, true) + end + + i_min = 1 + for i in 2:(hzls.last_evaluation_index - 1) + if hzls.triples[i].f <= hzls.triples[1].f + hzls.ϵₖ + i_min = i + end + end + return (i_min, c_index, f_eval, f_wolfe, false) +end + +""" + _hz_update( + hzls::HagerZhangLinesearchStepsize, M::AbstractManifold, + mp::AbstractManoptProblem, p, η, i_a::Int, i_b::Int, c::Real + ) + +Perform an update procedure of the Hager-Zhang linesearch given the current bracketing +indices `i_a` and `i_b` and a candidate stepsize `c`. + +Returns indices and termination information `(i_A, i_B, i_c, f_eval, f_wolfe)` where the +minimum is now bracketed between `triples[i_A].t` and `triples[i_B].t`. Index `i_c` +indicates the position at which evaluation of the candidate `c` was stored. If the +candidate `c` is outside of the current bracket, `i_c` is returned as `-1`. +`f_eval` is `true` if the maximum number of function evaluations has been reached. +`f_wolfe` is `true` if the Wolfe conditions have been satisfied at the candidate `i_c`.
+""" +function _hz_update( + hzls::HagerZhangLinesearchStepsize, M::AbstractManifold, + mp::AbstractManoptProblem, p, η, i_a::Int, i_b::Int, c::Real + ) + # U0 + if c < hzls.triples[i_a].t || c > hzls.triples[i_b].t + return (i_a, i_b, -1, false, false) + end + i_c, f_eval, f_wolfe = _hz_evaluate_next_step(hzls, M, mp, p, η, c) + if hzls.triples[i_c].df >= 0 + # U1 + return (i_a, i_c, i_c, f_eval, f_wolfe) + else + if hzls.triples[i_c].f <= hzls.triples[1].f + hzls.ϵₖ + # U2 + return (i_c, i_b, i_c, f_eval, f_wolfe) + else + if f_eval || f_wolfe + # termination condition met + return (i_a, i_b, i_c, f_eval, f_wolfe) + else + # U3 + i_a_bar, i_b_bar, f_eval, f_wolfe = _hz_u3(hzls, M, mp, p, η, i_a, i_c) + return (i_a_bar, i_b_bar, i_c, f_eval, f_wolfe) + end + end + end +end + +function _hz_u3( + hzls::HagerZhangLinesearchStepsize, M::AbstractManifold, + mp::AbstractManoptProblem, p, η, i_a::Int, i_b::Int + ) + i_a_bar = i_a + i_b_bar = i_b + # the loop should typically terminate before exceeding the number of evaluations + f_eval = false + f_wolfe = false + while hzls.last_evaluation_index < length(hzls.triples) + # U3 (a) + d = (1 - hzls.θ) * hzls.triples[i_a_bar].t + hzls.θ * hzls.triples[i_b_bar].t + i_d, f_eval, f_wolfe = _hz_evaluate_next_step(hzls, M, mp, p, η, d) + if hzls.triples[i_d].df >= 0 || f_eval || f_wolfe + return (i_a_bar, i_d, f_eval, f_wolfe) + else + if hzls.triples[i_d].f <= hzls.triples[1].f + hzls.ϵₖ + # U3 (b) + i_a_bar = i_d + else + # U3 (c) + i_b_bar = i_d + end + end + end + return (i_a_bar, i_b_bar, f_eval, f_wolfe) +end + +""" + _hz_secant2( + hzls::HagerZhangLinesearchStepsize, M::AbstractManifold, + mp::AbstractManoptProblem, p, η, i_a::Int, i_b::Int + ) + +Perform the secant-based update in the Hager-Zhang linesearch. + +Computes a trial step using a secant interpolation of the bracketing +endpoints. If the trial step is too close to an endpoint, falls back to a +bisection step. 
Returns the updated bracketing indices and termination flags +from the internal update routine. + +# Arguments +- `hzls`: linesearch state and storage. +- `M`: manifold for retractions and transports. +- `mp`: optimization problem providing cost and differential. +- `p`: current iterate. +- `η`: search direction in the tangent space at `p`. +- `i_a`, `i_b`: indices of the current bracketing interval in `hzls.triples`. + +# Return value +Returns `(i_A, i_B, i_c, f_eval, f_wolfe)` where +- `i_A`, `i_B`: indices bracketing the minimum after the update, +- `i_c`: index of the most recent evaluation (or `-1` if the candidate was out of range), +- `f_eval`: `true` iff the evaluation limit has been reached, +- `f_wolfe`: `true` iff the Wolfe conditions are satisfied. + +# Steps (S1-S4) +- S1: compute a secant trial `c` from the current bracket and accept it unless too close to + an endpoint (otherwise use a bisection step). +- S2/S3: if the trial becomes a new endpoint, perform an update from that side. +- S4: return the updated bracket and termination flags. 
+""" +function _hz_secant2( + hzls::HagerZhangLinesearchStepsize, M::AbstractManifold, + mp::AbstractManoptProblem, p, η, i_a::Int, i_b::Int + ) + # S1 + c = secant(hzls.triples[i_a], hzls.triples[i_b]) + width = hzls.triples[i_b].t - hzls.triples[i_a].t + if abs(c - hzls.triples[i_a].t) < hzls.secant_acceptance_ratio * width || + abs(c - hzls.triples[i_b].t) < hzls.secant_acceptance_ratio * width + # secant too close to an endpoint, use bisection instead + # this case is not present in the original algorithm, but the following steps don't make much sense in this case + c = (hzls.triples[i_a].t + hzls.triples[i_b].t) / 2 + return _hz_update(hzls, M, mp, p, η, i_a, i_b, c) + end + (i_A, i_B, i_c, f_eval, f_wolfe) = _hz_update(hzls, M, mp, p, η, i_a, i_b, c) + if f_eval || f_wolfe + # not present in the original algorithm, but this seems to be the right way to handle this case + return (i_A, i_B, i_c, f_eval, f_wolfe) + end + if i_c == i_B + # S2 + c_bar = secant(hzls.triples[i_b], hzls.triples[i_B]) + # S4, part 1 + return _hz_update(hzls, M, mp, p, η, i_A, i_B, c_bar) + elseif i_c == i_A + # S3 + c_bar = secant(hzls.triples[i_a], hzls.triples[i_A]) + # S4, part 1 + return _hz_update(hzls, M, mp, p, η, i_A, i_B, c_bar) + else + # S4, part 2 + return (i_A, i_B, i_c, f_eval, f_wolfe) + end +end + +function (hzls::HagerZhangLinesearchStepsize)( + mp::AbstractManoptProblem, + s::AbstractManoptSolverState, + k::Int, + η = (-get_gradient(mp, get_iterate(s))); + fp = get_cost(mp, get_iterate(s)), + gradient = nothing, + kwargs..., + ) + M = get_manifold(mp) + p = get_iterate(s) + + dphi_0 = if !isnothing(gradient) + real(inner(M, p, η, gradient)) + else + get_differential(mp, p, η; Y = hzls.temporary_tangent) + end + hzls.triples[1] = UnivariateTriple(0.0, fp, dphi_0) + hzls.last_evaluation_index = 1 + + # update Qₖ, Cₖ + hzls.Qₖ = 1 + hzls.Qₖ * hzls.Δ + hzls.Cₖ += (abs(fp) - hzls.Cₖ) / hzls.Qₖ + + if hzls.wolfe_condition_mode == :adaptive + # Checking the V3 condition + 
if abs(hzls.last_cost - fp) <= hzls.ω * hzls.Cₖ + hzls.current_mode = :approximate + end + end + + # L0, initialization + + # handle stepsize limit + max_alpha = hzls.stepsize_limit + if :stop_when_stepsize_exceeds in keys(kwargs) + max_alpha = min( + kwargs[:stop_when_stepsize_exceeds], + max_alpha, + ) + end + # guess initial alpha + α0 = hzls.initial_guess(mp, s, k, hzls.last_stepsize, η; lf0 = fp, Dlf0 = dphi_0, stop_when_stepsize_exceeds = max_alpha) + + # in case initial_guess does not take into account the stepsize limit, we enforce it here + α0 = min(α0, max_alpha) + + # L0, bracket(c) + local i_a_j, i_b_j, f_eval, f_wolfe # COV_EXCL_LINE + (i_a_j, i_b_j, f_eval, f_wolfe, f_early_maxstep) = _hz_bracket(hzls, M, mp, p, η, α0, max_alpha) + !f_early_maxstep && while !(f_eval || f_wolfe) + # L1 + finite_at_b = isfinite(hzls.triples[i_b_j].f) + if finite_at_b + # _hz_secant2 only makes sense if we have finite function values at both ends + # but _hz_update may still work + (i_a, i_b, _i_c, f_eval, f_wolfe) = _hz_secant2(hzls, M, mp, p, η, i_a_j, i_b_j) + else + (i_a, i_b) = (i_a_j, i_b_j) + end + # L2 + # we additionally check that we can continue narrowing the bracket + if !(f_eval || f_wolfe) && + (!finite_at_b || (hzls.triples[i_b].t - hzls.triples[i_a].t) > hzls.γ * (hzls.triples[i_b_j].t - hzls.triples[i_a_j].t)) + # secant2 did not reduce the bracket sufficiently + # we need to do bisection + (i_a, i_b, _i_c, f_eval, f_wolfe) = _hz_update( + hzls, M, mp, p, η, + i_a, i_b, + (hzls.triples[i_a].t + hzls.triples[i_b].t) / 2, + ) + end + # L3 + i_a_j, i_b_j = i_a, i_b + + # loop terminates when we generate a point satisfying T1 or T2, or when we run out + # of objective evaluations + end + + hzls.last_stepsize = hzls.triples[hzls.last_evaluation_index].t + hzls.last_cost = hzls.triples[hzls.last_evaluation_index].f + return hzls.last_stepsize +end + +function Base.show(io::IO, hzls::HagerZhangLinesearchStepsize) + return print( + io, + """ + 
HagerZhangLinesearch(; + initial_guess = $(hzls.initial_guess), + retraction_method = $(hzls.retraction_method), + vector_transport_method = $(hzls.vector_transport_method), + stepsize_limit = $(hzls.stepsize_limit), + max_bracket_iterations = $(hzls.max_bracket_iterations), + start_enforcing_wolfe_conditions_at_bracketing_iteration = $(hzls.start_enforcing_wolfe_conditions_at_bracketing_iteration), + max_function_evaluations = $(length(hzls.triples)), + wolfe_condition_mode = $(hzls.wolfe_condition_mode), + ϵ = $(hzls.ϵ), δ = $(hzls.δ), σ = $(hzls.σ), + ω = $(hzls.ω), + θ = $(hzls.θ), γ = $(hzls.γ), secant_acceptance_ratio = $(hzls.secant_acceptance_ratio), + ρ = $(hzls.ρ), + Δ = $(hzls.Δ), + )""", + ) +end +function status_summary(hzls::HagerZhangLinesearchStepsize) + return "$(hzls)\nand a computed last stepsize of $(hzls.last_stepsize)" +end + +@doc """ + HagerZhangLinesearch(; kwargs...) + HagerZhangLinesearch(M::AbstractManifold; kwargs...) + +A functor representing the curvature minimizing cubic bracketing scheme introduced +in [HagerZhang:2006:2](@cite). + +The following changes were made to the original algorithm from the paper: +1. The algorithm bails out early of a secant update that is too close to one of the end + points and switches to bisection. The original algorithm performs a similar check at a later + stage. This precaution prevents a non-productive evaluation of the objective. +2. Added `start_enforcing_wolfe_conditions_at_bracketing_iteration`, since with a very low + stepsize initialization that satisfies Wolfe conditions we might accept the initial + stepsize and not notice that bracketing could help us reach the minimum earlier. + Setting `start_enforcing_wolfe_conditions_at_bracketing_iteration` to 1 reproduces the + behavior of the original paper. For example, a static initial stepsize equal to 1.0 could + benefit from having this parameter increased. +3. The paper isn't entirely clear on what the final stepsize to return is.
This + implementation returns the last evaluated stepsize. +4. The original algorithm doesn't specify what to do when the maximum stepsize is reached + during the bracketing phase with a negative slope and an improvement over the initial + point. This implementation allows for an early termination in this case, which seems + reasonable since we can't expand the bracket anymore and this point is likely close to + the minimum. By default this early termination is allowed, but it can be turned off via + `allow_early_maxstep_termination`, in which case the algorithm continues with the main + loop. + +## Keyword arguments + +$(_kwargs(:p; name = "candidate_point")) as temporary storage for candidates +$(_kwargs(:retraction_method)) +$(_kwargs(:vector_transport_method)) +* `initial_guess::AbstractInitialLinesearchGuess=HagerZhangInitialGuess()`: initial linesearch guess strategy +* `initial_last_stepsize::Real = NaN`: initial value for the stored last stepsize +* `initial_last_cost::Real = NaN`: initial value for the stored last cost +* `stepsize_limit::Real = Inf`: upper bound for trial stepsizes during bracketing +* `candidate_point = allocate_result(M, rand)`: storage for trial points +* `candidate_direction = zero_vector(M, candidate_point)`: storage for transported directions +* `max_bracket_iterations::Int = 10`: maximum number of bracketing iterations +* `start_enforcing_wolfe_conditions_at_bracketing_iteration::Int = initial_guess isa ConstantStepsize ? 2 : 1`: + bracketing iteration from which the Wolfe conditions are enforced; + setting to 1 may cause no bracketing to occur when the initial guess satisfies the Wolfe + conditions. +* `max_function_evaluations::Int = 20`: maximum number of function evaluations per linesearch +* `allow_early_maxstep_termination::Bool = true`: whether to allow early termination when + the maximum stepsize is reached with negative slope and an improvement over the initial point.
+* `wolfe_condition_mode::Symbol = :adaptive`: one of `:standard`, `:approximate`, or `:adaptive`. + Selects between (T1) and (T2) conditions in [HagerZhang:2006:2](@cite). +* `ϵ::Real = 1.0e-6`: initial allowed increase in function value in termination condition (T2). + Allowed range: `ϵ >= 0`. +* `δ::Real = 0.1`: parameter for approximate Wolfe condition. + Allowed range: `0 < δ < 0.5` and `δ <= σ`. +* `σ::Real = 0.9`: curvature condition parameter. Allowed range: `δ <= σ < 1`. +* `ω::Real = 1.0e-3`: interpolation safeguard parameter. Allowed range: `0 <= ω <= 1`. +* `θ::Real = 0.5`: bisection update parameter. Allowed range: `0 < θ < 1`. +* `γ::Real = 0.66`: determines when a bisection step is performed instead of secant. + Allowed range: `0 < γ < 1`. +* `ρ::Real = 5.0`: bracketing expansion factor. Allowed range: `ρ > 1`. +* `Δ::Real = 0.7`: Parameter controlling the rate of change of Qₖ. + Allowed range: `0 <= Δ <= 1`. +* `secant_acceptance_ratio::Real = 1.0e-8`: minimum relative interval length + for accepting secant step. Allowed range: `secant_acceptance_ratio >= 0`. + In case of rejection, a bisection step is performed instead. + +$(_note(:ManifoldDefaultFactory, "HagerZhangLinesearch")) +""" +function HagerZhangLinesearch(args...; kwargs...) + return ManifoldDefaultsFactory(HagerZhangLinesearchStepsize, args...; kwargs...) +end diff --git a/src/plans/stochastic_gradient_plan.jl b/src/plans/stochastic_gradient_plan.jl index 6cf14fc4ee..ab6ffdc001 100644 --- a/src/plans/stochastic_gradient_plan.jl +++ b/src/plans/stochastic_gradient_plan.jl @@ -348,6 +348,29 @@ function get_gradient!( return X end +function Base.show(io::IO, msgo::ManifoldStochasticGradientObjective{E}) where {E} + print(io, "ManifoldStochasticGradientObjective(") + print(io, msgo.gradient!!) 
+ print(io, "; ") + if !ismissing(msgo.cost) + print(io, "cost = ") + print(io, msgo.cost) + print(io, ", ") + end + print(io, _to_kw(E)) + return print(io, ")") +end +function status_summary(msgo::ManifoldStochasticGradientObjective; context::Symbol = :default) + (context === :short) && return repr(msgo) + cs = ismissing(msgo.cost) ? "" : "including the cost function" + (context === :inline) && return "A stochastic gradient objective $cs." + ics = ismissing(msgo.cost) ? "" : "\n* cost: $(_MANOPT_INDENT)$(msgo.cost)" + return """ + A stochastic gradient objective + + ## Functions$(ics) + * subgradient ∂f:$(_MANOPT_INDENT)$(msgo.gradient!!)""" +end """ AbstractStochasticGradientDescentSolverState <: AbstractManoptSolverState diff --git a/src/plans/stopping_criterion.jl b/src/plans/stopping_criterion.jl index 5ffdd08463..8426dc2f9c 100644 --- a/src/plans/stopping_criterion.jl +++ b/src/plans/stopping_criterion.jl @@ -15,6 +15,12 @@ details when a criterion is met (and that is empty otherwise). """ abstract type StoppingCriterion end +function Base.show(io::IO, ::MIME"text/plain", asc::StoppingCriterion) + multiline = get(io, :multiline, true) + return multiline ? status_summary(io, asc) : show(io, asc) +end + + """ indicates_convergence(c::StoppingCriterion) @@ -125,23 +131,24 @@ function (c::StopAfter)(::AbstractManoptProblem, ::AbstractManoptSolverState, k: end return false end +indicates_convergence(c::StopAfter) = false function get_reason(c::StopAfter) if (c.at_iteration >= 0) return "The algorithm ran for $(floor(c.time, typeof(c.threshold))) (threshold: $(c.threshold)).\n" end return "" end -function status_summary(c::StopAfter) +function status_summary(c::StopAfter; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) - s = has_stopped ? "reached" : "not reached" - return "stopped after $(c.threshold):\t$s" + s = (has_stopped ? "reached" : "not reached") + return (_is_inline(context) ? 
"stopped after $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop after $(c.threshold)\n$(_MANOPT_INDENT)") * "$s" end -indicates_convergence(c::StopAfter) = false -function show(io::IO, c::StopAfter) - return print(io, "StopAfter($(repr(c.threshold)))\n $(status_summary(c))") +function Base.show(io::IO, c::StopAfter) + return print(io, "StopAfter($(repr(c.threshold)))") end -""" +@doc """ set_parameter!(c::StopAfter, :MaxTime, v::Period) Update the time period after which an algorithm shall stop. @@ -186,19 +193,21 @@ function (c::StopAfterIteration)( end return false end +indicates_convergence(c::StopAfterIteration) = false function get_reason(c::StopAfterIteration) if c.at_iteration >= c.max_iterations return "At iteration $(c.at_iteration) the algorithm reached its maximal number of iterations ($(c.max_iterations)).\n" end return "" end -function status_summary(c::StopAfterIteration) +function status_summary(c::StopAfterIteration; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "Max Iteration $(c.max_iterations):\t$s" + return (_is_inline(context) ? "stopped after $(c.max_iterations) iterations:$(_MANOPT_INDENT)" : "A stopping criterion to stop after $(c.max_iterations) iterations\n$(_MANOPT_INDENT)") * "$s" end -function show(io::IO, c::StopAfterIteration) - return print(io, "StopAfterIteration($(c.max_iterations))\n $(status_summary(c))") +function Base.show(io::IO, c::StopAfterIteration) + return print(io, "StopAfterIteration($(c.max_iterations))") end """ @@ -318,18 +327,17 @@ function get_reason(c::StopWhenChangeLess) end return "" end -function status_summary(c::StopWhenChangeLess) +function status_summary(c::StopWhenChangeLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? 
"reached" : "not reached" - return "|Δp| < $(c.threshold): $s" -end -indicates_convergence(c::StopWhenChangeLess) = true -function show(io::IO, c::StopWhenChangeLess) - s = ismissing(c.outer_norm) ? "" : "and outer norm $(c.outer_norm)" - return print( - io, - "StopWhenChangeLess with threshold $(c.threshold)$(s).\n $(status_summary(c))", - ) + return (_is_inline(context) ? "|Δp| < $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the change of the iterate is less than $(c.threshold)\n using the $(repr(c.inverse_retraction_method))\n$(_MANOPT_INDENT)") * "$s" +end +indicates_convergence(c::StopWhenChangeLess) = false +function Base.show(io::IO, c::StopWhenChangeLess) + print(io, "StopWhenChangeLess($(c.threshold); inverse_retraction_method=$(repr(c.inverse_retraction_method))") + !ismissing(c.outer_norm) && print(io, ", outer_norm = ", c.outer_norm) + return print(io, ")") end """ @@ -375,7 +383,7 @@ function (c::StopWhenCostChangeLess)( c.last_change = 2 * c.tolerance end c.last_change = c.last_cost - c.last_cost = get_cost(problem, get_iterate(state)) + c.last_cost = get_cost(problem, state) c.last_change = c.last_change - c.last_cost if abs(c.last_change) < c.tolerance c.at_iteration = iteration @@ -383,33 +391,32 @@ function (c::StopWhenCostChangeLess)( end return false end +indicates_convergence(c::StopWhenCostChangeLess) = false function get_reason(c::StopWhenCostChangeLess) if c.at_iteration >= 0 return "At iteration $(c.at_iteration) the algorithm performed a step with an absolute cost change ($(abs(c.last_change))) less than $(c.tolerance)." end return "" end -function status_summary(c::StopWhenCostChangeLess) +function status_summary(c::StopWhenCostChangeLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "|Δf(p)| = $(abs(c.last_change)) < $(c.tolerance):\t$s" + return (_is_inline(context) ? 
"|Δf(p)| = $(abs(c.last_change)) < $(c.tolerance):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the change of the cost function is less than $(c.tolerance)\n$(_MANOPT_INDENT)") * "$s" end function Base.show(io::IO, c::StopWhenCostChangeLess) - return print( - io, - "StopWhenCostChangeLess with threshold $(c.tolerance).\n $(status_summary(c))", - ) + return print(io, "StopWhenCostChangeLess($(c.tolerance))") end """ StopWhenCostLess <: StoppingCriterion store a threshold when to stop looking at the cost function of the -optimization problem from within a [`AbstractManoptProblem`](@ref), i.e `get_cost(p,get_iterate(o))`. +optimization problem from within a [`AbstractManoptProblem`](@ref), i.e `get_cost(p, s)`. # Constructor - StopWhenCostLess(ε) + StopWhenCostLess(ε::Real) initialize the stopping criterion to a threshold `ε`. """ @@ -427,26 +434,28 @@ function (c::StopWhenCostLess)( if k == 0 # reset on init c.at_iteration = -1 end - c.last_cost = get_cost(p, get_iterate(s)) + c.last_cost = get_cost(p, s) if c.last_cost < c.threshold c.at_iteration = k return true end return false end +indicates_convergence(c::StopWhenCostLess) = false function get_reason(c::StopWhenCostLess) if (c.last_cost < c.threshold) && (c.at_iteration >= 0) return "The algorithm reached a cost function value ($(c.last_cost)) less than the threshold ($(c.threshold)).\n" end return "" end -function status_summary(c::StopWhenCostLess) +function status_summary(c::StopWhenCostLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "f(x) < $(c.threshold):\t$s" + return (_is_inline(context) ? 
"f(x) < $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the cost function is less than $(c.threshold)\n$(_MANOPT_INDENT)") * "$s" end -function show(io::IO, c::StopWhenCostLess) - return print(io, "StopWhenCostLess($(c.threshold))\n $(status_summary(c))") +function Base.show(io::IO, c::StopWhenCostLess) + return print(io, "StopWhenCostLess($(c.threshold))") end """ @@ -459,6 +468,76 @@ function set_parameter!(c::StopWhenCostLess, ::Val{:MinCost}, v) return c end +""" + StopWhenRelativeAPosterioriCostChangeLessOrEqual <: StoppingCriterion + +A stopping criterion to stop when + +````math +\\frac{f_k - f_{k+1}}{\\max(\\lvert f_k \\rvert, \\lvert f_{k+1} \\rvert, 1)} ≤ tol, +```` + +based on Eq. (1) in [ZhuByrdLuNocedal:1997](@cite) + +# Fields +* _`threshold`: the threshold `tol` in the above formula. +$(_fields([:at_iteration, :last_change])) +* `last_cost`: the last cost value + +# Constructor + + StopWhenRelativeAPosterioriCostChangeLessOrEqual(threshold::F) + +Initialize the stopping criterion to a `threshold` for the change of the cost function. + + StopWhenRelativeAPosterioriCostChangeLessOrEqual(; factr::Real=1.0e7) + +Initialize threshold to `factr * eps(factr)`, following the convention in [ZhuByrdLuNocedal:1997](@cite). 
+""" +mutable struct StopWhenRelativeAPosterioriCostChangeLessOrEqual{F <: Real} <: StoppingCriterion + threshold::F + at_iteration::Int + last_cost::F + last_change::F +end +function StopWhenRelativeAPosterioriCostChangeLessOrEqual(tol::F) where {F <: Real} + return StopWhenRelativeAPosterioriCostChangeLessOrEqual{F}(tol, -1, zero(tol), 2 * tol) +end +StopWhenRelativeAPosterioriCostChangeLessOrEqual(; factr::F = 1.0e7) where {F <: Real} = StopWhenRelativeAPosterioriCostChangeLessOrEqual(factr * eps(typeof(factr))) +function (c::StopWhenRelativeAPosterioriCostChangeLessOrEqual)( + problem::AbstractManoptProblem, state::AbstractManoptSolverState, iteration::Int + ) + if iteration <= 0 # reset on init + c.at_iteration = -1 + c.last_cost = Inf + c.last_change = 2 * c.threshold + end + current_cost = get_cost(problem, state) + c.last_change = (c.last_cost - current_cost) / max(abs(c.last_cost), abs(current_cost), 1) + c.last_cost = current_cost + if iteration > 1 && c.last_change <= c.threshold + c.at_iteration = iteration + return true + end + return false +end +indicates_convergence(c::StopWhenRelativeAPosterioriCostChangeLessOrEqual) = false +function get_reason(c::StopWhenRelativeAPosterioriCostChangeLessOrEqual) + if c.at_iteration >= 0 + return "At iteration $(c.at_iteration) the algorithm performed a step with a relative a posteriori cost change ($(abs(c.last_change))) less than or equal to $(c.threshold)." + end + return "" +end +function status_summary(c::StopWhenRelativeAPosterioriCostChangeLessOrEqual; context::Symbol = :default) + (context == :short) && return repr(c) + has_stopped = (c.at_iteration >= 0) + s = has_stopped ? "reached" : "not reached" + return (_is_inline(context) ? 
"(fₖ- fₖ₊₁)/max(|fₖ|, |fₖ₊₁|, 1) = $(abs(c.last_change)) ≤ $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the relative posteriori cost change is less than $(c.threshold)\n$(_MANOPT_INDENT)") * "$s" +end +function Base.show(io::IO, c::StopWhenRelativeAPosterioriCostChangeLessOrEqual) + return print(io, "StopWhenRelativeAPosterioriCostChangeLessOrEqual($(c.threshold))") +end + @doc """ StopWhenEntryChangeLess @@ -523,16 +602,21 @@ function (sc::StopWhenEntryChangeLess)( sc.storage(mp, s, k) return false end +indicates_convergence(sc::StopWhenEntryChangeLess) = false function get_reason(sc::StopWhenEntryChangeLess) if (sc.last_change < sc.threshold) && (sc.at_iteration >= 0) return "At iteration $(sc.at_iteration) the algorithm performed a step with a change ($(sc.last_change)) in $(sc.field) less than $(sc.threshold).\n" end return "" end -function status_summary(sc::StopWhenEntryChangeLess) +function status_summary(sc::StopWhenEntryChangeLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (sc.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "|Δ:$(sc.field)| < $(sc.threshold): $s" + return (_is_inline(context) ? 
"|Δ:$(sc.field)| < $(sc.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the change of $(sc.field) is less than $(sc.threshold)\n$(_MANOPT_INDENT)") * "$s" +end +function Base.show(io::IO, sc::StopWhenEntryChangeLess) + return print(io, "StopWhenEntryChangeLess($(sc.field), $(sc.distance), $(sc.threshold))") end """ @@ -544,9 +628,6 @@ function set_parameter!(c::StopWhenEntryChangeLess, ::Val{:Threshold}, v) c.threshold = v return c end -function show(io::IO, c::StopWhenEntryChangeLess) - return print(io, "StopWhenEntryChangeLess\n $(status_summary(c))") -end @doc """ StopWhenGradientChangeLess <: StoppingCriterion @@ -642,23 +723,21 @@ function (c::StopWhenGradientChangeLess)( c.storage(mp, s, k) return false end +indicates_convergence(c::StopWhenGradientChangeLess) = false function get_reason(c::StopWhenGradientChangeLess) if (c.last_change < c.threshold) && (c.at_iteration >= 0) return "At iteration $(c.at_iteration) the change of the gradient ($(c.last_change)) was less than $(c.threshold).\n" end return "" end -function status_summary(c::StopWhenGradientChangeLess) +function status_summary(c::StopWhenGradientChangeLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "|Δgrad f| < $(c.threshold): $s" + return (_is_inline(context) ? "|Δgrad f| < $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the change of the gradient is less than $(c.threshold)\n$(_MANOPT_INDENT)") * "$s" end -function show(io::IO, c::StopWhenGradientChangeLess) - s = ismissing(c.outer_norm) ? 
"" : "outer_norm=$(c.outer_norm), " - return print( - io, - "StopWhenGradientChangeLess with threshold $(c.threshold); $(s)vector_transport_method=$(c.vector_transport_method))\n $(status_summary(c))", - ) +function Base.show(io::IO, c::StopWhenGradientChangeLess) + return print(io, "StopWhenGradientChangeLess($(c.threshold); vector_transport_method=$(c.vector_transport_method))") end """ @@ -715,7 +794,7 @@ Create a stopping criterion with threshold `ε` for the gradient, that is, this indicates to stop when [`get_gradient`](@ref) returns a gradient vector of norm less than `ε`, where the norm to use can be specified in the `norm=` keyword. """ -mutable struct StopWhenGradientNormLess{F, TF, N <: Union{Missing, Real}} <: StoppingCriterion +mutable struct StopWhenGradientNormLess{F, TF <: Real, N <: Union{Missing, Real}} <: StoppingCriterion norm::F threshold::TF last_change::TF @@ -723,7 +802,7 @@ mutable struct StopWhenGradientNormLess{F, TF, N <: Union{Missing, Real}} <: Sto outer_norm::N function StopWhenGradientNormLess( ε::TF; norm::F = norm, outer_norm::N = missing - ) where {F, TF, N <: Union{Missing, Real}} + ) where {F, TF <: Real, N <: Union{Missing, Real}} return new{F, TF, N}(norm, ε, zero(ε), -1, outer_norm) end end @@ -751,22 +830,95 @@ function get_reason(c::StopWhenGradientNormLess) end return "" end -function status_summary(c::StopWhenGradientNormLess) +indicates_convergence(c::StopWhenGradientNormLess) = true +function status_summary(c::StopWhenGradientNormLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "|grad f| < $(c.threshold): $s" -end -indicates_convergence(c::StopWhenGradientNormLess) = true -function show(io::IO, c::StopWhenGradientNormLess) - return print(io, "StopWhenGradientNormLess($(c.threshold))\n $(status_summary(c))") + return (_is_inline(context) ? 
"|grad f| < $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the gradient norm is less than $(c.threshold)\n$(_MANOPT_INDENT)") * "$s" end +show(io::IO, c::StopWhenGradientNormLess) = print(io, "StopWhenGradientNormLess($(c.threshold))") """ - set_parameter!(c::StopWhenGradientNormLess, :MinGradNorm, v::Float64) + set_parameter!(c::StopWhenGradientNormLess{F,TF}, ::Val{:MinGradNorm}, v::TF) where {F,TF<:Real} Update the minimal gradient norm when an algorithm shall stop """ -function set_parameter!(c::StopWhenGradientNormLess, ::Val{:MinGradNorm}, v::Float64) +function set_parameter!(c::StopWhenGradientNormLess{F, TF}, ::Val{:MinGradNorm}, v::TF) where {F, TF <: Real} + c.threshold = v + return c +end + +""" + StopWhenProjectedNegativeGradientNormLess <: StoppingCriterion + +A stopping criterion similar to [`StopWhenGradientNormLess`](@ref), although it checks the +norm of projected minus gradient. It is primarily useful for optimization involving +[`Hyperrectangle`](@extref Manifolds.Hyperrectangle). + +On manifolds with boundary and manifolds with corners, for a tangent vector ``X``, +``-X`` might not be a valid tangent vector. As an example, consider the objective +``f(x)=x^2`` on the interval ``[1, 2]``. Its gradient at 1 is equal to 2, but because the +point 1 is at the boundary of the interval, the projected negative gradient is equal to 0 +because we can't go in the negative direction. 
+""" +mutable struct StopWhenProjectedNegativeGradientNormLess{F, TF <: Real, N <: Union{Missing, Real}} <: StoppingCriterion + norm::F + threshold::TF + last_change::TF + at_iteration::Int + outer_norm::N + function StopWhenProjectedNegativeGradientNormLess( + ε::TF; norm::F = norm, outer_norm::N = missing + ) where {F, TF <: Real, N <: Union{Missing, Real}} + return new{F, TF, N}(norm, ε, zero(ε), -1, outer_norm) + end +end + +function (sc::StopWhenProjectedNegativeGradientNormLess)( + mp::AbstractManoptProblem, s::AbstractManoptSolverState, k::Int + ) + M = get_manifold(mp) + if k == 0 # reset on init + sc.at_iteration = -1 + end + if (k > 0) + r = (has_components(M) && !ismissing(sc.outer_norm)) ? (sc.outer_norm,) : () + p = get_iterate(s) + mpg = -get_gradient(s) + embed_project!(M, mpg, p, mpg) + sc.last_change = sc.norm(M, p, mpg, r...) + if sc.last_change < sc.threshold + sc.at_iteration = k + return true + end + end + return false +end +function get_reason(c::StopWhenProjectedNegativeGradientNormLess) + if (c.last_change < c.threshold) && (c.at_iteration >= 0) + return "The algorithm reached approximately critical point after $(c.at_iteration) iterations; the gradient norm ($(c.last_change)) is less than $(c.threshold).\n" + end + return "" +end +function status_summary(c::StopWhenProjectedNegativeGradientNormLess; context::Symbol = :default) + (context === :short) && return repr(c) + has_stopped = (c.at_iteration >= 0) + s = has_stopped ? 
"reached" : "not reached" + (context === :inline) && return "|proj (-grad f)| < $(c.threshold): $s" + return "A StoppingCriterion to stop when the negative projected gradient norm is less than a threshold of $(c.threshold):\n$(_MANOPT_INDENT)$s" +end +indicates_convergence(c::StopWhenProjectedNegativeGradientNormLess) = true +function Base.show(io::IO, c::StopWhenProjectedNegativeGradientNormLess) + return print(io, "StopWhenProjectedNegativeGradientNormLess($(c.threshold); norm = $(c.norm))") +end + +""" + set_parameter!(c::StopWhenProjectedNegativeGradientNormLess{F,TF}, ::Val{:MinGradNorm}, v::TF) where {F, TF<:Real} + +Update the minimal gradient norm when an algorithm shall stop. +""" +function set_parameter!(c::StopWhenProjectedNegativeGradientNormLess{F, TF}, ::Val{:MinGradNorm}, v::TF) where {F, TF <: Real} c.threshold = v return c end @@ -804,20 +956,23 @@ function (c::StopWhenStepsizeLess)( end return false end +indicates_convergence(c::StopWhenStepsizeLess) = false function get_reason(c::StopWhenStepsizeLess) if (c.last_stepsize < c.threshold) && (c.at_iteration >= 0) return "The algorithm computed a step size ($(c.last_stepsize)) less than $(c.threshold).\n" end return "" end -function status_summary(c::StopWhenStepsizeLess) +function status_summary(c::StopWhenStepsizeLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "Stepsize s < $(c.threshold):\t$s" + return (_is_inline(context) ? 
"Stepsize s < $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the step size is less than $(c.threshold)\n$(_MANOPT_INDENT)") * "$s" end -function show(io::IO, c::StopWhenStepsizeLess) - return print(io, "StopWhenStepsizeLess($(c.threshold))\n $(status_summary(c))") +function Base.show(io::IO, c::StopWhenStepsizeLess) + return print(io, "StopWhenStepsizeLess($(c.threshold))") end + """ set_parameter!(c::StopWhenStepsizeLess, :MinStepsize, v) @@ -831,13 +986,14 @@ end """ StopWhenCostNaN <: StoppingCriterion -stop looking at the cost function of the optimization problem from within a [`AbstractManoptProblem`](@ref), i.e `get_cost(p,get_iterate(o))`. +Stop the solver when the cost function of the optimization problem +[`AbstractManoptProblem`](@ref) is `NaN`. The value is obtained using `get_cost(p, s)`. # Constructor StopWhenCostNaN() -initialize the stopping criterion to NaN. +initialize the stopping criterion with `at_iteration` equal to -1. """ mutable struct StopWhenCostNaN <: StoppingCriterion at_iteration::Int @@ -850,37 +1006,41 @@ function (c::StopWhenCostNaN)( c.at_iteration = -1 end # but still verify whether it yields NaN - if isnan(get_cost(p, get_iterate(s))) + if isnan(get_cost(p, s)) c.at_iteration = k return true end return false end +indicates_convergence(c::StopWhenCostNaN) = false function get_reason(c::StopWhenCostNaN) if c.at_iteration >= 0 return "The algorithm reached a cost function value of NaN.\n" end return "" end -function status_summary(c::StopWhenCostNaN) +function status_summary(c::StopWhenCostNaN; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "f(x) is NaN:\t$s" + return (_is_inline(context) ? 
"f(x) is NaN:$(_MANOPT_INDENT)" : "A stopping criterion to stop when the cost function is NaN\n$(_MANOPT_INDENT)") * "$s" end -function show(io::IO, c::StopWhenCostNaN) - return print(io, "StopWhenCostNaN()\n $(status_summary(c))") +function Base.show(io::IO, c::StopWhenCostNaN) + return print(io, "StopWhenCostNaN()") end """ StopWhenIterateNaN <: StoppingCriterion -stop looking at the cost function of the optimization problem from within a [`AbstractManoptProblem`](@ref), i.e `get_cost(p,get_iterate(o))`. +Stop the solver when the iterate of the optimization problem from within an +[`AbstractManoptProblem`](@ref) contains `NaN` values. +The value is obtained using [`get_iterate`](@ref)`(s)`. # Constructor StopWhenIterateNaN() -initialize the stopping criterion to NaN. +Initialize the stopping criterion. """ mutable struct StopWhenIterateNaN <: StoppingCriterion at_iteration::Int @@ -893,24 +1053,26 @@ function (c::StopWhenIterateNaN)( c.at_iteration = -1 end if (k >= 0) && any(isnan.(get_iterate(s))) - c.at_iteration = 0 + c.at_iteration = k return true end return false end function get_reason(c::StopWhenIterateNaN) if (c.at_iteration >= 0) - return "The algorithm reached an iterate containing NaNs iterate.\n" + return "The algorithm reached an iterate containing NaNs.\n" end return "" end -function status_summary(c::StopWhenIterateNaN) +indicates_convergence(c::StopWhenIterateNaN) = false +function status_summary(c::StopWhenIterateNaN; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "f(x) is NaN:\t$s" + return (_is_inline(context) ? 
"An entry of x is NaN:$(_MANOPT_INDENT)" : "A stopping criterion to stop when an entry of the iterate is NaN\n$(_MANOPT_INDENT)") * "$s" end -function show(io::IO, c::StopWhenIterateNaN) - return print(io, "StopWhenIterateNaN()\n $(status_summary(c))") +function Base.show(io::IO, c::StopWhenIterateNaN) + return print(io, "StopWhenIterateNaN()") end @doc """ @@ -949,21 +1111,21 @@ function (c::StopWhenSmallerOrEqual)( end return false end +indicates_convergence(c::StopWhenSmallerOrEqual) = false function get_reason(c::StopWhenSmallerOrEqual) if (c.at_iteration >= 0) return "The value of the variable ($(string(c.value))) is smaller than or equal to its threshold ($(c.minValue)).\n" end return "" end -function status_summary(c::StopWhenSmallerOrEqual) +function status_summary(c::StopWhenSmallerOrEqual; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "Field :$(c.value) ≤ $(c.minValue):\t$s" + return (_is_inline(context) ? 
"Field :$(c.value) ≤ $(c.minValue):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the field :$(c.value) is smaller than or equal to $(c.minValue)\n$(_MANOPT_INDENT)") * "$s" end -function show(io::IO, c::StopWhenSmallerOrEqual) - return print( - io, "StopWhenSmallerOrEqual(:$(c.value), $(c.minValue))\n $(status_summary(c))" - ) +function Base.show(io::IO, c::StopWhenSmallerOrEqual) + return print(io, "StopWhenSmallerOrEqual(:$(c.value), $(c.minValue))") end """ @@ -998,23 +1160,23 @@ function (c::StopWhenSubgradientNormLess)( end return false end +indicates_convergence(c::StopWhenSubgradientNormLess) = true function get_reason(c::StopWhenSubgradientNormLess) if (c.value < c.threshold) && (c.at_iteration >= 0) return "The algorithm reached approximately critical point after $(c.at_iteration) iterations; the subgradient norm ($(c.value)) is less than $(c.threshold).\n" end return "" end -function status_summary(c::StopWhenSubgradientNormLess) +function status_summary(c::StopWhenSubgradientNormLess; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "|∂f| < $(c.threshold): $s" + return (_is_inline(context) ? 
"|∂f| < $(c.threshold):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the subgradient norm |∂f| is less than $(c.threshold)\n$(_MANOPT_INDENT)") * "$s" end -indicates_convergence(c::StopWhenSubgradientNormLess) = true -function show(io::IO, c::StopWhenSubgradientNormLess) - return print( - io, "StopWhenSubgradientNormLess($(c.threshold))\n $(status_summary(c))" - ) +function Base.show(io::IO, c::StopWhenSubgradientNormLess) + return print(io, "StopWhenSubgradientNormLess($(c.threshold))") end + """ set_parameter!(c::StopWhenSubgradientNormLess, :MinSubgradNorm, v::Float64) @@ -1061,14 +1223,22 @@ function get_reason(c::StopWhenAll) end return "" end -function status_summary(c::StopWhenAll) +function status_summary(c::StopWhenAll; context::Symbol = :default) + if context == :short + return join( + [ + s isa StoppingCriterionSet ? "($(status_summary(s; context = :short)))" : status_summary(s; context = :short) for s in c.criteria + ], + " & " + ) + end has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - r = "Stop When _all_ of the following are fulfilled:\n" + r = "Stop when _all_ of the following are fulfilled:\n" for cs in c.criteria - r = "$r * $(replace(status_summary(cs), "\n" => "\n "))\n" + r = "$r * $(_in_str(status_summary(cs; context = :inline); indent = 0, headers = 0))\n" end - return "$(r)Overall: $s" + return (_is_inline(context) ? 
"$(r)Overall: $s" : "Stop when _all_ of the following are fulfilled:\n$(r)Overall: $s") end function indicates_convergence(c::StopWhenAll) return any(indicates_convergence(ci) for ci in c.criteria) @@ -1084,9 +1254,18 @@ end function get_count(c::StopWhenAll, v::Val{:Iterations}) return maximum(get_count(ci, v) for ci in c.criteria) end -function show(io::IO, c::StopWhenAll) - s = replace(status_summary(c), "\n" => "\n ") #increase indent - return print(io, "StopWhenAll with the stopping criteria\n $(s)") +function Base.show(io::IO, c::StopWhenAll) + print(io, "StopWhenAll([") + first = true + for cs in c.criteria + if !first + print(io, ", ") + else + first = false + end + show(io, cs) + end + return print(io, "])") end """ @@ -1139,8 +1318,11 @@ end # `_fast_any(f, tup::Tuple)`` is functionally equivalent to `any(f, tup)`` but on Julia 1.10 # this implementation is faster on heterogeneous tuples -@inline _fast_any(f, tup::Tuple{}) = true +# for length zero -> return false +@inline _fast_any(f, tup::Tuple{}) = false +# for one-element tuples, evaluate that one element @inline _fast_any(f, tup::Tuple{T}) where {T} = f(tup[1]) +# for more than that -> finish fast, if the first is true end checks, otherwise continue with tail @inline function _fast_any(f, tup::Tuple) if f(tup[1]) return true @@ -1163,12 +1345,20 @@ function get_reason(c::StopWhenAny) end return "" end -function status_summary(c::StopWhenAny) +function status_summary(c::StopWhenAny; context::Symbol = :default) + if context == :short + return join( + [ + s isa StoppingCriterionSet ? "($(status_summary(s; context = :short)))" : status_summary(s; context = :short) for s in c.criteria + ], + " | " + ) + end has_stopped = (c.at_iteration >= 0) s = has_stopped ? 
"reached" : "not reached" - r = "Stop When _one_ of the following are fulfilled:\n" + r = "Stop when _one_ of the following are fulfilled:\n" for cs in c.criteria - r = "$r * $(replace(status_summary(cs), "\n" => "\n "))\n" + r = "$r * $(_in_str(status_summary(cs; context = :inline); indent = 0, headers = 0))\n" end return "$(r)Overall: $s" end @@ -1184,9 +1374,18 @@ function get_count(c::StopWhenAny, v::Val{:Iterations}) (length(iters) == 0) && (return -1) # None indicated to stop yet, so we also do not return minimum(iters) end -function show(io::IO, c::StopWhenAny) - s = replace(status_summary(c), "\n" => "\n ") #increase indent - return print(io, "StopWhenAny with the Stopping Criteria\n $(s)") +function Base.show(io::IO, c::StopWhenAny) + print(io, "StopWhenAny([") + first = true + for cs in c.criteria + if !first + print(io, ", ") + else + first = false + end + show(io, cs) + end + return print(io, "])") end """ |(s1,s2) @@ -1352,18 +1551,12 @@ function get_reason(sc::StopWhenRepeated) r = """At iteration $(sc.at_iteration), the stopping criterion $(typeof(sc.stopping_criterion)) has indicated to stop $(sc.n) $(c) times: $(sc.count) ≥ $(sc.n): $(s) last inner criterion status: - $(replace(status_summary(sc.stopping_criterion), "\n" => "\n ")) + $(_in_str(status_summary(sc.stopping_criterion); indent = 1, headers = 0)) """ return r end return "" end -function status_summary(sc::StopWhenRepeated) - has_stopped = (sc.at_iteration >= 0) - s = has_stopped ? "reached" : "not reached" - c = sc.consecutive ? 
"consecutive" : "" - return "$(sc.count) ≥ $(sc.n) ($(c)): $(s) (last inner status: $(status_summary(sc.stopping_criterion)))" -end function indicates_convergence(sc::StopWhenRepeated) return indicates_convergence(sc.stopping_criterion) end @@ -1371,12 +1564,15 @@ function has_converged(sc::StopWhenRepeated) # When the inner one indicates convergence, this does as well return has_converged(sc.stopping_criterion) end -function show(io::IO, sc::StopWhenRepeated) - is = replace("$(sc.stopping_criterion)", "\n" => "\n ") #increase indent - return print( - io, - "StopWhenRepeated with the Stopping Criterion:\n $(is)\n$(status_summary(sc))", - ) +function Base.show(io::IO, sc::StopWhenRepeated) + return print(io, "StopWhenRepeated($(typeof(sc.stopping_criterion)), $(sc.n); consecutive=$(sc.consecutive))") +end +function status_summary(sc::StopWhenRepeated; context::Symbol = :default) + (context == :short) && return "StopWhenRepeated($(repr(sc.stopping_criterion)))×$(sc.n)" + has_stopped = (sc.at_iteration >= 0) + s = has_stopped ? "reached" : "not reached" + c = sc.consecutive ? "consecutive" : "" + return (_is_inline(context) ? "$(status_summary(sc.stopping_criterion; cnontext = context)) × $(sc.count) ≥ $(sc.n) ($(c)):" : "A stopping criterion to stop when the inner criterion has indicated to stop $(sc.n) ($(c)) times.\n$(_in_str(status_summary(sc.stopping_criterion; context = context); indent = 1, headers = 0))\n$(_in_str(s; indent = 2, headers = 0))") end @doc """ @@ -1464,12 +1660,6 @@ function get_reason(sc::StopWhenCriterionWithIterationCondition) end return "" end -function status_summary(sc::StopWhenCriterionWithIterationCondition) - has_stopped = (sc.at_iteration >= 0) - s = has_stopped ? 
"reached" : "not reached" - is = replace("$(sc.stopping_criterion)", "\n" => "\n ") #increase indent - return "$(sc.comp) && $(is)\n overall: $(s)" -end function indicates_convergence(sc::StopWhenCriterionWithIterationCondition) return indicates_convergence(sc.stopping_criterion) end @@ -1477,16 +1667,18 @@ function has_converged(sc::StopWhenCriterionWithIterationCondition) # When the inner one indicates convergence, this does as well return has_converged(sc.stopping_criterion) end -function show(io::IO, sc::StopWhenCriterionWithIterationCondition) +function Base.show(io::IO, sc::StopWhenCriterionWithIterationCondition) + return print(io, "StopWhenCriterionWithIterationCondition($(typeof(sc.stopping_criterion)), $(sc.comp))") +end +function status_summary(sc::StopWhenCriterionWithIterationCondition; context::Symbol = :default) + (context == :short) && return repr(sc) has_stopped = (sc.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - is = replace("$(sc.stopping_criterion)", "\n" => "\n ") #increase indent - return print( - io, - "StopWhenCriterionWithIterationCondition with the Stopping Criterion:\n $(is)\nand condition $(sc.comp)\n\toverall: $(s)", - ) + is = replace("$(status_summary(sc.stopping_criterion; context = context))", "\n" => "\n ") #increase indent + return (_is_inline(context) ? 
"$(sc.comp) && $(is):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the inner criterion is met and $(sc.comp)\n$(_MANOPT_INDENT)$(is)\n$(_MANOPT_INDENT)$(_MANOPT_INDENT)") * "$s" end + @doc """ get_reason(s::AbstractManoptSolverState) diff --git a/src/plans/subgradient_plan.jl b/src/plans/subgradient_plan.jl index 29a9ff04dc..9b53306168 100644 --- a/src/plans/subgradient_plan.jl +++ b/src/plans/subgradient_plan.jl @@ -16,8 +16,7 @@ Generate the [`ManifoldSubgradientObjective`](@ref) for a subgradient objective, of a (cost) function `f(M, p)` and a function `∂f(M, p)` that returns a not necessarily deterministic element from the subdifferential at `p` on a manifold `M`. """ -struct ManifoldSubgradientObjective{T <: AbstractEvaluationType, C, S} <: - AbstractManifoldCostObjective{T, C} +struct ManifoldSubgradientObjective{T <: AbstractEvaluationType, C, S} <: AbstractManifoldCostObjective{T, C} cost::C subgradient!!::S function ManifoldSubgradientObjective( @@ -106,3 +105,20 @@ end function get_subgradient_function(admo::AbstractDecoratedManifoldObjective, recursive = false) return get_subgradient_function(get_objective(admo, recursive)) end + +function Base.show(io::IO, mso::ManifoldSubgradientObjective{E}) where {E} + return print(io, "ManifoldSubgradientObjective(", mso.cost, ", ", mso.subgradient!!, "; ", _to_kw(E), ")") +end + +function status_summary(mso::ManifoldSubgradientObjective{E}; context::Symbol = :default) where {E} + (context === :short) && return repr(mso) + s = "A subgradient objective " + (context === :inline) && (return s) + e = (E === AllocatingEvaluation ? 
" (allocating)" : " (in-place)") + return """ + $s + + ## Components + * `f`: $(mso.cost) + * `∂f`: $(mso.subgradient!!)$e""" +end diff --git a/src/plans/trust_regions_plan.jl b/src/plans/trust_regions_plan.jl index 288de05039..6f084307a1 100644 --- a/src/plans/trust_regions_plan.jl +++ b/src/plans/trust_regions_plan.jl @@ -90,3 +90,18 @@ function get_hessian!(TpM::TangentSpace, W, trmo::TrustRegionModelObjective, X, p = TpM.point return get_objective_hessian!(M, W, trmo, p, V) end + +function Base.show(io::IO, trmo::TrustRegionModelObjective) + print(io, "TrustRegionModelObjective(") + print(io, trmo.objective) + return print(io, ")") +end +function status_summary(trmo::TrustRegionModelObjective; context::Symbol = :default) + (context === :short) && return repr(arcmo) + (context === :inline) && return "The (tangent space) model for the trust region solver for the objective $(status_summary(arcmo.objective; context = context))" + return """ + The trust region model for the sub problem in the tangent space + + ## Objective + $(_in_str(status_summary(trmo.objective)))""" +end diff --git a/src/plans/vectorial_plan.jl b/src/plans/vectorial_plan.jl index 02e4bea3b0..eafde16d29 100644 --- a/src/plans/vectorial_plan.jl +++ b/src/plans/vectorial_plan.jl @@ -31,6 +31,9 @@ struct CoordinateVectorialType{B <: AbstractBasis} <: AbstractVectorialType basis::B end CoordinateVectorialType() = CoordinateVectorialType(DefaultOrthonormalBasis()) + +Base.show(io::IO, cvt::CoordinateVectorialType) = print(io, "CoordinateVectorialType($(cvt.basis))") + """ get_basis(::AbstractVectorialType) @@ -80,6 +83,7 @@ when it makes sense, especially for Hessian and gradient functions. 
struct FunctionVectorialType{P <: AbstractPowerRepresentation} <: AbstractVectorialType range::P end +Base.show(io::IO, fvt::FunctionVectorialType) = print(io, "FunctionVectorialType($(fvt.range))") """ get_range(::AbstractVectorialType) @@ -115,7 +119,7 @@ For the [`ComponentVectorialType`](@ref) imagine that ``f`` could also be writte using its component functions, ```math -f(p) = \bigl( f_1(p), f_2(p),…, f_n(p) \bigr)^{$(_tex(:rm, "T"))} +f(p) = $(_tex(:bigl))( f_1(p), f_2(p),…, f_n(p) $(_tex(:bigr)))^{$(_tex(:rm, "T"))} ``` In this representation `f` is given as a vector `[f1(M,p), f2(M,p), ..., fn(M,p)]` @@ -132,8 +136,13 @@ that if this is expensive even to compute a single component, all of `f` has to abstract type AbstractVectorFunction{E <: AbstractEvaluationType, FT <: AbstractVectorialType} <: Function end +function Base.show(io::IO, ::MIME"text/plain", avf::AbstractVectorFunction) + multiline = get(io, :multiline, true) + return multiline ? status_summary(io, avf) : show(io, avf) +end + @doc """ - VectorGradientFunction{E, FT, JT, F, J, I} <: AbstractManifoldObjective{E} + VectorGradientFunction{E, FT, JT, F, J, I} <: AbstractVectorFunction{E} Represent an abstract vectorial function ``f:$(_math(:Manifold))) → ℝ^n`` that provides a (component wise) gradient. 
@@ -200,18 +209,10 @@ struct VectorGradientFunction{ end function VectorGradientFunction( - f::F, - Jf::J, - range_dimension::I; - evaluation::E = AllocatingEvaluation(), - function_type::FT = FunctionVectorialType(), - jacobian_type::JT = FunctionVectorialType(), + f::F, Jf::J, range_dimension::I; + evaluation::E = AllocatingEvaluation(), function_type::FT = FunctionVectorialType(), jacobian_type::JT = FunctionVectorialType(), ) where { - I <: Integer, - F, - J, - E <: AbstractEvaluationType, - FT <: AbstractVectorialType, + I <: Integer, F, J, E <: AbstractEvaluationType, FT <: AbstractVectorialType, JT <: AbstractVectorialType, } return VectorGradientFunction{E, FT, JT, F, J, I}( @@ -219,6 +220,24 @@ function VectorGradientFunction( ) end +function status_summary(vgf::VectorGradientFunction; context::Symbol = :default) + _is_inline(context) && (return "A vectorial function including gradients of length $(length(vgf)) represented as $(vgf.cost_type) and gradients as $(vgf.jacobian_type)") + return """ + A function defined on a manifold that maps into a vector space including gradients of the component functions. 
+ + ## Components + * cost: $(_MANOPT_INDENT)$(vgf.value!!)$(_MANOPT_INDENT)(as $(vgf.cost_type)), + * gradient(s) or Jacobian:$(_MANOPT_INDENT)$(vgf.jacobian!!)$(_MANOPT_INDENT)(as $(vgf.jacobian_type)) + * dimension: $(_MANOPT_INDENT)$(length(vgf))""" +end +function Base.show(io::IO, vgf::VectorGradientFunction{E}) where {E} + print(io, "VectorGradientFunction("); print(io, vgf.value!!); print(io, ", ") + print(io, vgf.jacobian!!); print(io, ", "); print(io, vgf.range_dimension) + print(io, "; "); print(io, _to_kw(E)) + print(io, ", function_type = "); print(io, vgf.cost_type); print(io, ", jacobian_type = "); print(io, vgf.jacobian_type) + return print(io, ")") +end + _doc_vhf = """ VectorHessianFunction{E, FT, JT, HT, F, J, H, I} <: AbstractVectorGradientFunction{E, FT, JT} @@ -258,13 +277,8 @@ inplace-variant, specified by the `evaluation=` keyword. 
@doc "$(_doc_vhf)" struct VectorHessianFunction{ E <: AbstractEvaluationType, - FT <: AbstractVectorialType, - JT <: AbstractVectorialType, - HT <: AbstractVectorialType, - F, - J, - H, - I <: Integer, + FT <: AbstractVectorialType, JT <: AbstractVectorialType, HT <: AbstractVectorialType, + F, J, H, I <: Integer, } <: AbstractVectorGradientFunction{E, FT, JT} value!!::F cost_type::FT @@ -276,23 +290,12 @@ struct VectorHessianFunction{ end function VectorHessianFunction( - f::F, - Jf::J, - Hf::H, - range_dimension::I; - evaluation::E = AllocatingEvaluation(), - function_type::FT = FunctionVectorialType(), - jacobian_type::JT = FunctionVectorialType(), - hessian_type::HT = FunctionVectorialType(), + f::F, Jf::J, Hf::H, range_dimension::I; + evaluation::E = AllocatingEvaluation(), function_type::FT = FunctionVectorialType(), + jacobian_type::JT = FunctionVectorialType(), hessian_type::HT = FunctionVectorialType(), ) where { - I <: Integer, - F, - J, - H, - E <: AbstractEvaluationType, - FT <: AbstractVectorialType, - JT <: AbstractVectorialType, - HT <: AbstractVectorialType, + I <: Integer, F, J, H, E <: AbstractEvaluationType, + FT <: AbstractVectorialType, JT <: AbstractVectorialType, HT <: AbstractVectorialType, } return VectorHessianFunction{E, FT, JT, HT, F, J, H, I}( f, function_type, Jf, jacobian_type, Hf, hessian_type, range_dimension @@ -304,6 +307,27 @@ _vgf_index_to_length(::Colon, n) = n _vgf_index_to_length(i::AbstractArray{<:Integer}, n) = length(i) _vgf_index_to_length(r::UnitRange{<:Integer}, n) = length(r) +function status_summary(vhf::VectorHessianFunction; context::Symbol = :default) + _is_inline(context) && (return "A vectorial function of length $(length(vhf)) including gradients and Hessians represented as $(vhf.cost_type), gradients as $(vhf.jacobian_type), and Hessians as $(vhf.hessian_type).") + return """ + A function defined on a manifold that maps into a vector space including gradients and Hessians of the component functions. 
+ + * cost:$(_MANOPT_INDENT)$(vhf.value!!)$(_MANOPT_INDENT)(represented as $(vhf.cost_type)), + * gradient(s) or Jacobian:$(_MANOPT_INDENT)$(vhf.jacobian!!)$(_MANOPT_INDENT)(represented as $(vhf.jacobian_type)) + * Hessian(s):$(_MANOPT_INDENT)$(vhf.hessians!!)$(_MANOPT_INDENT)(represented as $(vhf.hessian_type)) + * dimension:$(_MANOPT_INDENT)$(length(vhf))""" +end +function Base.show(io::IO, vhf::VectorHessianFunction{E}) where {E} + print(io, "VectorHessianFunction("); print(io, vhf.value!!); print(io, ", ") + print(io, vhf.jacobian!!); print(io, ", "); print(io, vhf.hessians!!); print(io, ", ") + print(io, vhf.range_dimension); print(io, "; "); print(io, _to_kw(E)) + print(io, ", function_type = "); print(io, vhf.cost_type) + print(io, ", jacobian_type = "); print(io, vhf.jacobian_type) + print(io, ", hessian_type = "); print(io, vhf.hessian_type) + return print(io, ")") +end + + # # # ---- Hessian diff --git a/src/solvers/ChambollePock.jl b/src/solvers/ChambollePock.jl index fdb533e00c..41c0efbc30 100644 --- a/src/solvers/ChambollePock.jl +++ b/src/solvers/ChambollePock.jl @@ -154,7 +154,11 @@ function Manopt.ChambollePockState( vector_transport_method_dual, ) end -function show(io::IO, cps::ChambollePockState) +function status_summary(cps::ChambollePockState; context::Symbol = :default) + (context === :short) && return repr(cps) + i = get_count(cps, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(cps.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for Chambolle-Pock algorithm$(conv_inl)" i = get_count(cps, :Iterations) Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(cps.stop) ? 
"Yes" : "No" @@ -175,10 +179,9 @@ function show(io::IO, cps::ChambollePockState) * vector_transport_method_dual: $(cps.vector_transport_method_dual) ## Stopping criterion - - $(status_summary(cps.stop)) + $(_in_str(status_summary(cps.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end get_solver_result(apds::AbstractPrimalDualSolverState) = get_iterate(apds) get_iterate(apds::AbstractPrimalDualSolverState) = apds.p diff --git a/src/solvers/DouglasRachford.jl b/src/solvers/DouglasRachford.jl index 3dd2134203..5aae5099e8 100644 --- a/src/solvers/DouglasRachford.jl +++ b/src/solvers/DouglasRachford.jl @@ -43,14 +43,8 @@ $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(300)")) * `parallel=false`: indicate whether to use a parallel Douglas-Rachford or not. """ mutable struct DouglasRachfordState{ - P, - Tλ, - Tα, - TR, - S, - E <: AbstractEvaluationType, - TM <: AbstractRetractionMethod, - ITM <: AbstractInverseRetractionMethod, + P, Tλ, Tα, TR, S, + E <: AbstractEvaluationType, TM <: AbstractRetractionMethod, ITM <: AbstractInverseRetractionMethod, } <: AbstractManoptSolverState p::P p_tmp::P @@ -65,10 +59,7 @@ mutable struct DouglasRachfordState{ stop::S parallel::Bool function DouglasRachfordState( - M::AbstractManifold; - p::P = rand(M), - λ::Fλ = i -> 1.0, - α::Fα = i -> 0.9, + M::AbstractManifold; p::P = rand(M), λ::Fλ = i -> 1.0, α::Fα = i -> 0.9, reflection_evaluation::E = AllocatingEvaluation(), R::FR = ( if reflection_evaluation isa AllocatingEvaluation @@ -82,47 +73,65 @@ mutable struct DouglasRachfordState{ retraction_method::TM = default_retraction_method(M, typeof(p)), inverse_retraction_method::ITM = default_inverse_retraction_method(M, typeof(p)), ) where { - P, - Fλ, - Fα, - FR, - S <: StoppingCriterion, - E <: AbstractEvaluationType, - TM <: AbstractRetractionMethod, - ITM <: AbstractInverseRetractionMethod, + P, Fλ, Fα, FR, S <: StoppingCriterion, E <: 
AbstractEvaluationType, + TM <: AbstractRetractionMethod, ITM <: AbstractInverseRetractionMethod, + } + return DouglasRachfordState(; + p = p, p_tmp = copy(M, p), s = copy(M, p), s_tmp = copy(M, p), + λ = λ, α = α, R = R, reflection_evaluation = reflection_evaluation, + retraction_method = retraction_method, inverse_retraction_method = inverse_retraction_method, + stopping_criterion = stopping_criterion, parallel = parallel, + ) + end + function DouglasRachfordState(; + p::P, p_tmp::P, s::P, s_tmp::P, λ::Fλ, α::Fα, R::FR, + reflection_evaluation::E, retraction_method::TM, inverse_retraction_method::ITM, + stopping_criterion::S, parallel::Bool + ) where { + P, Fλ, Fα, FR, S <: StoppingCriterion, E <: AbstractEvaluationType, + TM <: AbstractRetractionMethod, ITM <: AbstractInverseRetractionMethod, } return new{P, Fλ, Fα, FR, S, E, TM, ITM}( - p, - copy(M, p), - copy(M, p), - copy(M, p), - λ, - α, - R, - reflection_evaluation, - retraction_method, - inverse_retraction_method, - stopping_criterion, - parallel, + p, p_tmp, s, s_tmp, λ, α, R, reflection_evaluation, + retraction_method, inverse_retraction_method, stopping_criterion, parallel, ) end end -function show(io::IO, drs::DouglasRachfordState) +function Base.show(io::IO, drs::DouglasRachfordState) + print(io, "DouglasRachfordState(; ") + print(io, "p = "); print(io, drs.p); print(io, ", ") + print(io, "p_tmp = "); print(io, drs.p_tmp); print(io, ", ") + print(io, "s = "); print(io, drs.s); print(io, ", ") + print(io, "s_tmp = "); print(io, drs.s_tmp); print(io, ", ") + print(io, "α = "); print(io, drs.α); print(io, ", ") + print(io, "λ = "); print(io, drs.λ); print(io, ", ") + print(io, "R = "); print(io, drs.R); print(io, ", ") + print(io, "reflection_evaluation = "); print(io, drs.reflection_evaluation); print(io, ", ") + print(io, "retraction_method = "); print(io, drs.retraction_method); print(io, ", ") + print(io, "inverse_retraction_method = "); print(io, drs.inverse_retraction_method); print(io, ", ") + 
print(io, "stopping_criterion = "); print(io, drs.stop); print(io, ", ") + print(io, "parallel = "); print(io, drs.parallel) + return print(io, ")") +end +function status_summary(drs::DouglasRachfordState; context::Symbol = :default) + (context === :short) && return repr(drs) i = get_count(drs, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(drs.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the Douglas Rachford solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" refl_e = drs.reflection_evaluation == AllocatingEvaluation() ? "allocating" : "in place" Conv = indicates_convergence(drs.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(drs)) – $(Iter) $(has_converged(drs) ? "(converged)" : "")") P = drs.parallel ? "Parallel " : "" s = """ - # Solver state for `Manopt.jl`s $P Douglas Rachford Algorithm + # Solver state for `Manopt.jl`s $(P)Douglas Rachford Algorithm $Iter using an $(refl_e) reflection. ## Stopping criterion - - $(status_summary(drs.stop)) + $(_in_str(status_summary(drs.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end get_iterate(drs::DouglasRachfordState) = drs.p function set_iterate!(drs::DouglasRachfordState, p) diff --git a/src/solvers/FrankWolfe.jl b/src/solvers/FrankWolfe.jl index 2629bd1200..5ee47797e1 100644 --- a/src/solvers/FrankWolfe.jl +++ b/src/solvers/FrankWolfe.jl @@ -48,14 +48,9 @@ $(_kwargs(:X; add_properties = [:as_Memory])) where the remaining fields from before are keyword arguments. 
""" mutable struct FrankWolfeState{ - P, - T, - Pr, - St <: AbstractManoptSolverState, - TStep <: Stepsize, - TStop <: StoppingCriterion, - TM <: AbstractRetractionMethod, - ITM <: AbstractInverseRetractionMethod, + P, T, Pr, St <: AbstractManoptSolverState, + TStep <: Stepsize, TStop <: StoppingCriterion, + TM <: AbstractRetractionMethod, ITM <: AbstractInverseRetractionMethod, } <: AbstractGradientSolverState p::P X::T @@ -66,43 +61,46 @@ mutable struct FrankWolfeState{ retraction_method::TM inverse_retraction_method::ITM function FrankWolfeState( - M::AbstractManifold, - sub_problem::Pr, - sub_state::St; - p::P = rand(M), - X::T = zero_vector(M, p), + M::AbstractManifold, sub_problem; evaluation::E = AllocatingEvaluation(), kwargs... + ) where {E <: AbstractEvaluationType} + cfs = ClosedFormSubSolverState(; evaluation = evaluation) + return FrankWolfeState(M, sub_problem, cfs; kwargs...) + end + function FrankWolfeState( + M::AbstractManifold, sub_problem::Pr, sub_state::St; + p::P = rand(M), X::T = zero_vector(M, p), stopping_criterion::TStop = StopAfterIteration(200) | StopWhenGradientNormLess(1.0e-6), stepsize::TStep = default_stepsize(M, FrankWolfeState), retraction_method::TM = default_retraction_method(M, typeof(p)), inverse_retraction_method::ITM = default_inverse_retraction_method(M, typeof(p)), ) where { - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - TStop <: StoppingCriterion, - TStep <: Stepsize, - TM <: AbstractRetractionMethod, - ITM <: AbstractInverseRetractionMethod, + P, T, + Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + TStop <: StoppingCriterion, TStep <: Stepsize, + TM <: AbstractRetractionMethod, ITM <: AbstractInverseRetractionMethod, + } + return FrankWolfeState( + sub_problem, sub_state; + p = p, X = X, stopping_criterion = stopping_criterion, stepsize = stepsize, + retraction_method = retraction_method, inverse_retraction_method = 
inverse_retraction_method + ) + end + FrankWolfeState(::AbstractManifold, ::AbstractManoptSolverState; kwargs...) = error("No sub problem provided.") + function FrankWolfeState( + sub_problem::Pr, sub_state::St; + p::P, X::T, stopping_criterion::TStop, stepsize::TStep, + retraction_method::TM, inverse_retraction_method::ITM + ) where { + P, T, Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + TStop <: StoppingCriterion, TStep <: Stepsize, + TM <: AbstractRetractionMethod, ITM <: AbstractInverseRetractionMethod, } return new{P, T, Pr, St, TStep, TStop, TM, ITM}( - p, - X, - sub_problem, - sub_state, - stopping_criterion, - stepsize, - retraction_method, - inverse_retraction_method, + p, X, sub_problem, sub_state, + stopping_criterion, stepsize, retraction_method, inverse_retraction_method, ) end end -function FrankWolfeState( - M::AbstractManifold, sub_problem; evaluation::E = AllocatingEvaluation(), kwargs... - ) where {E <: AbstractEvaluationType} - cfs = ClosedFormSubSolverState(; evaluation = evaluation) - return FrankWolfeState(M, sub_problem, cfs; kwargs...) 
-end function default_stepsize(M::AbstractManifold, ::Type{FrankWolfeState}) return DecreasingStepsize(M; length = 2.0, shift = 2.0) @@ -118,29 +116,36 @@ function set_iterate!(fws::FrankWolfeState, p) fws.p = p return fws end -function show(io::IO, fws::FrankWolfeState) +function Base.show(io::IO, fws::FrankWolfeState) + print(io, "FrankWolfeState(", fws.sub_problem, ", ", fws.sub_state, "; ") + print(io, "inverse_retraction_method = ", fws.inverse_retraction_method) + print(io, ", p = ", fws.p, ", retraction_method = ", fws.retraction_method) + print(io, ", stopping_criterion = ", fws.stop, ", stepsize = ", fws.stepsize) + return print(io, ", X = ", fws.X, ")") +end +function status_summary(fws::FrankWolfeState; context::Symbol = :default) + (context === :short) && return repr(fws) i = get_count(fws, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(fws.stop) ? 
" (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the Frank Wolfe algorithm$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(fws.stop) ? "Yes" : "No" - sub = repr(fws.sub_state) - sub = replace(sub, "\n" => "\n | ") - s = """ + sub = _in_str(status_summary(fws.sub_state; context = context); indent = 1, headers = 1, indent_end = "| ") + return """ # Solver state for `Manopt.jl`s Frank Wolfe Method $Iter ## Parameters * inverse retraction method: $(fws.inverse_retraction_method) * retraction method: $(fws.retraction_method) * sub solver state: - | $(sub) + $(sub) ## Stepsize - $(fws.stepsize) + $(_in_str(status_summary(fws.stepsize; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(fws.stop)) + $(_in_str(status_summary(fws.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) end _doc_FW_problem = """ @@ -322,6 +327,7 @@ calls_with_kwargs(::typeof(Frank_Wolfe_method!)) = (decorate_objective!, decorat function initialize_solver!(amp::AbstractManoptProblem, fws::FrankWolfeState) get_gradient!(amp, fws.X, fws.p) + initialize_stepsize!(fws.stepsize) return fws end function step_solver!(amp::AbstractManoptProblem, fws::FrankWolfeState, k) diff --git a/src/solvers/Lanczos.jl b/src/solvers/Lanczos.jl index 9490596985..bfdd6dd282 100644 --- a/src/solvers/Lanczos.jl +++ b/src/solvers/Lanczos.jl @@ -45,6 +45,15 @@ mutable struct LanczosState{T, R, SC, SCN, B, TM, C} <: AbstractManoptSolverStat Hp::T # `Hess_f`` A temporary vector for evaluations of the Hessian Hp_residual::T # A residual vector S::T # store the tangent vector that solves the minimization problem + function LanczosState(; + X::T, σ::R, stopping_criterion::SC, stopping_criterion_newton::SCN, Lanczos_vectors::B, + tridig_matrix::TM, coefficients::C, Hp::T, Hp_residual::T, S::T + ) where {T, SC <: StoppingCriterion, SCN 
<: StoppingCriterion, R, B, TM, C} + return new{T, R, SC, SCN, B, TM, C}( + X, σ, stopping_criterion, stopping_criterion_newton, Lanczos_vectors, + tridig_matrix, coefficients, Hp, Hp_residual, S + ) + end end function LanczosState( TpM::TangentSpace; @@ -59,17 +68,11 @@ function LanczosState( tridig = spdiagm(maxIterLanczos, maxIterLanczos, [0.0]) coeffs = zeros(maxIterLanczos) Lanczos_vectors = typeof(X)[] - return LanczosState{T, R, SC, SCN, typeof(Lanczos_vectors), typeof(tridig), typeof(coeffs)}( - X, - σ, - stopping_criterion, - stopping_criterion_newton, - Lanczos_vectors, - tridig, - coeffs, - copy(TpM, X), - copy(TpM, X), - copy(TpM, X), + return LanczosState(; + X = X, σ = σ, stopping_criterion = stopping_criterion, + stopping_criterion_newton = stopping_criterion_newton, + Lanczos_vectors = Lanczos_vectors, tridig_matrix = tridig, coefficients = coeffs, + Hp = copy(TpM, X), Hp_residual = copy(TpM, X), S = copy(TpM, X), ) end function get_solver_result(ls::LanczosState) @@ -83,13 +86,21 @@ function set_parameter!(ls::LanczosState, ::Val{:σ}, σ) ls.σ = σ return ls end - -function show(io::IO, ls::LanczosState) +function Base.show(io::IO, ls::LanczosState) + print(io, "LanczosState(; X = ", ls.X, ", σ = ", ls.σ, ", stopping_criterion = ", ls.stop) + print(io, ", stopping_criterion_newton = ", ls.stop_newton, ", ") + print(io, "Lanczos_vectors = ", ls.Lanczos_vectors, ", ", "tridig_matrix = ", ls.tridig_matrix, ", ") + print(io, "coefficients = ", ls.coefficients); print(io, ", Hp = ", ls.Hp, ", ") + print(io, "Hp_residual = ", ls.Hp_residual, ", ", "S = ", ls.S) + return print(io, ")") +end +function status_summary(ls::LanczosState; context::Symbol = :default) + (context === :short) && return repr(ls) i = get_count(ls, :Iterations) Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(ls.stop) ? 
"Yes" : "No" vectors = length(ls.Lanczos_vectors) - s = """ + return """ # Solver state for `Manopt.jl`s Lanczos Iteration $Iter ## Parameters @@ -102,7 +113,6 @@ function show(io::IO, ls::LanczosState) (b) For the Newton sub solver $(status_summary(ls.stop_newton)) This indicates convergence: $Conv""" - return print(io, s) end # @@ -239,7 +249,7 @@ end # _math_sc_firstorder = raw""" ```math -m(X_k) \leq m(0) +m(X_k) ≤ m(0) \quad\text{ and }\quad \lVert \operatorname{grad} m(X_k) \rVert ≤ θ \lVert X_k \rVert^2 ``` @@ -253,7 +263,7 @@ solver indicating that the model function at the current (outer) iterate, $_doc_ARC_model -defined on the tangent space ``$(_math(:TangentSpace))entSpace)))`` fulfills at the current iterate ``X_k`` that +defined on the tangent space ``$(_math(:TangentSpace))`` fulfils at the current iterate ``X_k`` that $_math_sc_firstorder @@ -323,10 +333,12 @@ function (c::StopWhenFirstOrderProgress)( prog && (c.at_iteration = k) return prog end -function status_summary(c::StopWhenFirstOrderProgress) +function status_summary(c::StopWhenFirstOrderProgress; context::Symbol = :default) + (context == :short) && return repr(sc) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "First order progress with θ=$(c.θ):\t$s" + _is_inline(context) && return "First order progress with θ=$(c.θ):$(_MANOPT_INDENT)$s" + return "A stopping criterion to stop when the Lanczos model has fpund a certain first order progress with θ=$(c.θ):$(_MANOPT_INDENT)$s" end indicates_convergence(c::StopWhenFirstOrderProgress) = true function show(io::IO, c::StopWhenFirstOrderProgress) @@ -377,15 +389,13 @@ function get_reason(c::StopWhenAllLanczosVectorsUsed) end return "" end -function status_summary(c::StopWhenAllLanczosVectorsUsed) +function status_summary(c::StopWhenAllLanczosVectorsUsed; context::Symbol = :default) + (context === :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? 
"reached" : "not reached" - return "All Lanczos vectors ($(c.maxLanczosVectors)) used:\t$s" + return (context === :inline ? "All $(c.maxLanczosVectors) Lanczos vectors used:$(_MANOPT_INDENT)" : "Stop when all $(c.maxLanczosVectors) Lanczos vectors are used\n$(_MANOPT_INDENT)") * s end indicates_convergence(c::StopWhenAllLanczosVectorsUsed) = false function show(io::IO, c::StopWhenAllLanczosVectorsUsed) - return print( - io, - "StopWhenAllLanczosVectorsUsed($(repr(c.maxLanczosVectors)))\n $(status_summary(c))", - ) + return print(io, "StopWhenAllLanczosVectorsUsed($(repr(c.maxLanczosVectors)))") end diff --git a/src/solvers/LevenbergMarquardt.jl b/src/solvers/LevenbergMarquardt.jl index 772c1c645f..d265dc2564 100644 --- a/src/solvers/LevenbergMarquardt.jl +++ b/src/solvers/LevenbergMarquardt.jl @@ -35,7 +35,7 @@ $(_args(:p)) for mutating evaluation this value must be explicitly specified. You can also provide the cost and its Jacobian already as a[`VectorGradientFunction`](@ref) `vgf`, -Alternatively, passing a [`NonlinearLeastSquaresObjective`](@ref) `nlso`. +Alternatively, passing a [`ManifoldNonlinearLeastSquaresObjective`](@ref) `nlso`. # Keyword arguments @@ -114,12 +114,12 @@ function LevenbergMarquardt( evaluation::AbstractEvaluationType = AllocatingEvaluation(), kwargs..., ) - nlso = NonlinearLeastSquaresObjective(vgf) + nlso = ManifoldNonlinearLeastSquaresObjective(vgf) return LevenbergMarquardt(M, nlso, p; evaluation = evaluation, kwargs...) end function LevenbergMarquardt( M::AbstractManifold, nlso::O, p; kwargs... - ) where {O <: Union{NonlinearLeastSquaresObjective, AbstractDecoratedManifoldObjective}} + ) where {O <: Union{ManifoldNonlinearLeastSquaresObjective, AbstractDecoratedManifoldObjective}} keywords_accepted(LevenbergMarquardt; kwargs...) q = copy(M, p) return LevenbergMarquardt!(M, nlso, q; kwargs...) 
@@ -151,7 +151,7 @@ function LevenbergMarquardt!( ) end end - nlso = NonlinearLeastSquaresObjective( + nlso = ManifoldNonlinearLeastSquaresObjective( f, jacobian_f, num_components; @@ -180,7 +180,7 @@ function LevenbergMarquardt!( ), (linear_subsolver!) = (default_lm_lin_solve!), kwargs..., #collect rest - ) where {O <: Union{NonlinearLeastSquaresObjective, AbstractDecoratedManifoldObjective}} + ) where {O <: Union{ManifoldNonlinearLeastSquaresObjective, AbstractDecoratedManifoldObjective}} keywords_accepted(LevenbergMarquardt!; kwargs...) dnlso = decorate_objective!(M, nlso; kwargs...) nlsp = DefaultManoptProblem(M, dnlso) @@ -206,7 +206,7 @@ calls_with_kwargs(::typeof(LevenbergMarquardt!)) = (decorate_objective!, decorat # Solver functions # function initialize_solver!( - dmp::DefaultManoptProblem{mT, <:NonlinearLeastSquaresObjective}, + dmp::DefaultManoptProblem{mT, <:ManifoldNonlinearLeastSquaresObjective}, lms::LevenbergMarquardtState, ) where {mT <: AbstractManifold} M = get_manifold(dmp) @@ -241,7 +241,7 @@ function default_lm_lin_solve!(sk, JJ, grad_f_c) end function step_solver!( - dmp::DefaultManoptProblem{mT, <:NonlinearLeastSquaresObjective}, + dmp::DefaultManoptProblem{mT, <:ManifoldNonlinearLeastSquaresObjective}, lms::LevenbergMarquardtState, ::Integer, ) where {mT <: AbstractManifold} diff --git a/src/solvers/NelderMead.jl b/src/solvers/NelderMead.jl index bd1c1f3abf..d4e3272e90 100644 --- a/src/solvers/NelderMead.jl +++ b/src/solvers/NelderMead.jl @@ -27,7 +27,6 @@ the tangent space at point `p`. 
struct NelderMeadSimplex{TP, T <: AbstractVector{TP}} pts::T end - function NelderMeadSimplex(M::AbstractManifold) return NelderMeadSimplex([rand(M) for i in 1:(manifold_dimension(M) + 1)]) end @@ -46,6 +45,7 @@ function NelderMeadSimplex( pts = map(X -> retract(M, p_, X, retraction_method), vecs) return NelderMeadSimplex(pts) end +Base.show(io::IO, nms::NelderMeadSimplex) = print(io, "NelderMeadSimplex(", nms.pts, ")") @doc """ NelderMeadState <: AbstractManoptSolverState @@ -90,67 +90,58 @@ $(_kwargs([:inverse_retraction_method, :retraction_method])) * `p=copy(M, population.pts[1])`: initialise the storage for the best point (iterate)¨ """ mutable struct NelderMeadState{ - T, - S <: StoppingCriterion, - Tα <: Real, - Tγ <: Real, - Tρ <: Real, - Tσ <: Real, - TR <: AbstractRetractionMethod, - TI <: AbstractInverseRetractionMethod, + P, S <: StoppingCriterion, F <: Real, A <: AbstractVector{<:Real}, + TR <: AbstractRetractionMethod, TI <: AbstractInverseRetractionMethod, } <: AbstractManoptSolverState - population::NelderMeadSimplex{T} + population::NelderMeadSimplex{P} stop::S - α::Tα - γ::Tγ - ρ::Tρ - σ::Tσ - p::T - costs::Vector{Float64} + α::F + γ::F + ρ::F + σ::F + p::P + costs::A retraction_method::TR inverse_retraction_method::TI + function NelderMeadState(; + population::NelderMeadSimplex{P}, stopping_criterion::S, α::F, γ::F, ρ::F, σ::F, p::P, costs::A, + retraction_method::TR, inverse_retraction_method::TI + ) where { + P, S <: StoppingCriterion, F <: Real, A <: AbstractVector{<:Real}, + TR <: AbstractRetractionMethod, TI <: AbstractInverseRetractionMethod, + } + return new{P, S, F, A, TR, TI}( + population, stopping_criterion, α, γ, ρ, σ, p, costs, retraction_method, inverse_retraction_method + ) + end function NelderMeadState( M::AbstractManifold; population::NelderMeadSimplex{T} = NelderMeadSimplex(M), - stopping_criterion::StoppingCriterion = StopAfterIteration(2000) | - StopWhenPopulationConcentrated(), - α = 1.0, - γ = 2.0, - ρ = 1 / 2, - σ = 1 / 2, 
- retraction_method::AbstractRetractionMethod = default_retraction_method( - M, eltype(population.pts) - ), - inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method( - M, eltype(population.pts) - ), + stopping_criterion::StoppingCriterion = StopAfterIteration(2000) | StopWhenPopulationConcentrated(), + α::Real = 1.0, γ::Real = 2.0, ρ::Real = 1 / 2, σ::Real = 1 / 2, + retraction_method::AbstractRetractionMethod = default_retraction_method(M, eltype(population.pts)), + inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method(M, eltype(population.pts)), p::T = copy(M, population.pts[1]), ) where {T} - return new{ - T, - typeof(stopping_criterion), - typeof(α), - typeof(γ), - typeof(ρ), - typeof(σ), - typeof(retraction_method), - typeof(inverse_retraction_method), - }( - population, - stopping_criterion, - α, - γ, - ρ, - σ, - p, - [], - retraction_method, - inverse_retraction_method, + R = promote_type(typeof(α), typeof(γ), typeof(ρ), typeof(σ)) + α = convert(R, α); γ = convert(R, γ); ρ = convert(R, ρ); σ = convert(R, σ) + return NelderMeadState(; + population = population, stopping_criterion = stopping_criterion, α = α, γ = γ, ρ = ρ, σ = σ, + p = p, costs = R[], retraction_method = retraction_method, inverse_retraction_method = inverse_retraction_method, ) end end -function show(io::IO, nms::NelderMeadState) +function Base.show(io::IO, nms::NelderMeadState) + print(io, "NelderMeadState(; population = ", nms.population, ", α = ", nms.α, ", γ = ", nms.γ, ", ρ = ", nms.ρ, ", σ = ", nms.σ) + print(io, ", p = ", nms.p, ", costs = ", nms.costs, ", stopping_criterion = ", nms.stop) + print(io, ", retraction_method = ", nms.retraction_method, ", inverse_retraction_method = ", nms.inverse_retraction_method) + return print(io, ")") +end +function status_summary(nms::NelderMeadState; context::Symbol = :default) + (context === :short) && return repr(nms) i = get_count(nms, :Iterations) + conv_inl = (i > 
0) ? 
(indicates_convergence(nms.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the Nelder-Mead solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(nms.stop) ? "Yes" : "No" s = """ @@ -165,10 +156,9 @@ function show(io::IO, nms::NelderMeadState) * retraction method: $(nms.retraction_method) ## Stopping criterion - - $(status_summary(nms.stop)) + $(_in_str(status_summary(nms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end get_iterate(O::NelderMeadState) = O.p function set_iterate!(O::NelderMeadState, ::AbstractManifold, p) @@ -263,36 +253,20 @@ function NelderMead!(M::AbstractManifold, f, population::NelderMeadSimplex; kwar return NelderMead!(M, mco, population; kwargs...) end function NelderMead!( - M::AbstractManifold, - mco::O, - population::NelderMeadSimplex; - stopping_criterion::StoppingCriterion = StopAfterIteration(2000) | - StopWhenPopulationConcentrated(), - α = 1.0, - γ = 2.0, - ρ = 1 / 2, - σ = 1 / 2, - retraction_method::AbstractRetractionMethod = default_retraction_method( - M, eltype(population.pts) - ), - inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method( - M, eltype(population.pts) - ), + M::AbstractManifold, mco::O, population::NelderMeadSimplex; + stopping_criterion::StoppingCriterion = StopAfterIteration(2000) | StopWhenPopulationConcentrated(), + α::Real = 1.0, γ::Real = 2.0, ρ::Real = 1 / 2, σ::Real = 1 / 2, + retraction_method::AbstractRetractionMethod = default_retraction_method(M, eltype(population.pts)), + inverse_retraction_method::AbstractInverseRetractionMethod = default_inverse_retraction_method(M, eltype(population.pts)), kwargs..., #collect rest ) where {O <: Union{AbstractManifoldCostObjective, AbstractDecoratedManifoldObjective}} keywords_accepted(NelderMead; kwargs...) 
dmco = decorate_objective!(M, mco; kwargs...) mp = DefaultManoptProblem(M, dmco) s = NelderMeadState( - M; - population = population, - stopping_criterion = stopping_criterion, - α = α, - γ = γ, - ρ = ρ, - σ = σ, - retraction_method = retraction_method, - inverse_retraction_method = inverse_retraction_method, + M; population = population, + stopping_criterion = stopping_criterion, α = α, γ = γ, ρ = ρ, σ = σ, + retraction_method = retraction_method, inverse_retraction_method = inverse_retraction_method, ) s = decorate_state!(s; kwargs...) solve!(mp, s) @@ -352,14 +326,11 @@ function step_solver!(mp::AbstractManoptProblem, s::NelderMeadState, ::Any) if continue_steps for i in 2:length(ind) ManifoldsBase.retract_fused!( - M, - s.population.pts[i], - s.population.pts[1], + M, s.population.pts[i], s.population.pts[1], inverse_retract( M, s.population.pts[1], s.population.pts[i], s.inverse_retraction_method ), - s.σ, - s.retraction_method, + s.σ, s.retraction_method, ) # update cost s.costs[i] = get_cost(mp, s.population.pts[i]) @@ -420,14 +391,13 @@ function get_reason(c::StopWhenPopulationConcentrated) end return "" end -function status_summary(c::StopWhenPopulationConcentrated) +function status_summary(c::StopWhenPopulationConcentrated; context::Symbol = :default) + (context === :short) && (return repr(c)) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "Population concentration: in f < $(c.tol_f) and in p < $(c.tol_p):\t$s" + head = (!_is_inline(context) ? 
"Stop when the population of a swarm is concentrated in eher function values (tolerance: $(c.tol_f)) or points (tolerance: $(c.tol_p))\n$(_MANOPT_INDENT)" : "") + return head * "Population concentration: in f < $(c.tol_f) and in p < $(c.tol_p):$(_MANOPT_INDENT)$s" end -function show(io::IO, c::StopWhenPopulationConcentrated) - return print( - io, - "StopWhenPopulationConcentrated($(c.tol_f), $(c.tol_p))\n $(status_summary(c))", - ) +function Base.show(io::IO, c::StopWhenPopulationConcentrated) + return print(io, "StopWhenPopulationConcentrated($(c.tol_f), $(c.tol_p))") end diff --git a/src/solvers/adaptive_regularization_with_cubics.jl b/src/solvers/adaptive_regularization_with_cubics.jl index d1a5973d17..1ee11f394a 100644 --- a/src/solvers/adaptive_regularization_with_cubics.jl +++ b/src/solvers/adaptive_regularization_with_cubics.jl @@ -48,13 +48,8 @@ $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(100)")) $(_kwargs(:X)) """ mutable struct AdaptiveRegularizationState{ - P, - T, - Pr, - St <: AbstractManoptSolverState, - TStop <: StoppingCriterion, - R, - TRTM <: AbstractRetractionMethod, + P, T, Pr, St <: AbstractManoptSolverState, + SC <: StoppingCriterion, R, TRTM <: AbstractRetractionMethod, } <: AbstractManoptSolverState p::P X::T @@ -67,58 +62,41 @@ mutable struct AdaptiveRegularizationState{ ρ::R ρ_denominator::R ρ_regularization::R - stop::TStop + stop::SC retraction_method::TRTM σmin::R η1::R η2::R γ1::R γ2::R + function AdaptiveRegularizationState( + sub_problem::Pr, sub_state::St; + p::P, X::T, q::P, H::T, S::T, σ::R, ρ::R, ρ_denominator::R, ρ_regularization::R, + stopping_criterion::SC, retraction_method::TRTM, σmin::R, η1::R, η2::R, γ1::R, γ2::R, + ) where { + P, T, Pr, St <: AbstractManoptSolverState, SC <: StoppingCriterion, R, TRTM <: AbstractRetractionMethod, + } + return new{P, T, Pr, St, SC, R, TRTM}( + p, X, sub_problem, sub_state, q, H, S, σ, ρ, + ρ_denominator, ρ_regularization, stopping_criterion, retraction_method, 
σmin, η1, η2, γ1, γ2 + ) + end end - function AdaptiveRegularizationState( - M::AbstractManifold, - sub_problem::Pr, - sub_state::St; - p::P = rand(M), - X::T = zero_vector(M, p), - σ::R = 100.0 / sqrt(manifold_dimension(M)), # Had this to initial value of 0.01. However try same as in MATLAB: 100/sqrt(dim(M)) - ρ_regularization::R = 1.0e3, - stopping_criterion::SC = StopAfterIteration(100), + M::AbstractManifold, sub_problem::Pr, sub_state::St; + p::P = rand(M), X::T = zero_vector(M, p), σ::R = 100.0 / sqrt(manifold_dimension(M)), + ρ_regularization::R = 1.0e3, stopping_criterion::SC = StopAfterIteration(100), retraction_method::RTM = default_retraction_method(M, typeof(p)), - σmin::R = 1.0e-10, - η1::R = 0.1, - η2::R = 0.9, - γ1::R = 0.1, - γ2::R = 2.0, + σmin::R = 1.0e-10, η1::R = 0.1, η2::R = 0.9, γ1::R = 0.1, γ2::R = 2.0, ) where { - P, - T, - R, - Pr <: Union{<:AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - SC <: StoppingCriterion, - RTM <: AbstractRetractionMethod, + P, T, R, Pr <: Union{<:AbstractManoptProblem, F} where {F}, + St <: AbstractManoptSolverState, SC <: StoppingCriterion, RTM <: AbstractRetractionMethod, } - return AdaptiveRegularizationState{P, T, Pr, St, SC, R, RTM}( - p, - X, - sub_problem, - sub_state, - copy(M, p), - copy(M, p, X), - copy(M, p, X), - σ, - one(σ), - one(σ), - ρ_regularization, - stopping_criterion, - retraction_method, - σmin, - η1, - η2, - γ1, - γ2, + return AdaptiveRegularizationState( + sub_problem, sub_state; + p = p, X = X, q = copy(M, p), H = copy(M, p, X), S = copy(M, p, X), σ, ρ = one(σ), + ρ_denominator = one(σ), ρ_regularization = ρ_regularization, stopping_criterion = stopping_criterion, + retraction_method = retraction_method, σmin = σmin, η1 = η1, η2 = η2, γ1 = γ1, γ2 = γ2 ) end function AdaptiveRegularizationState( @@ -137,13 +115,25 @@ function set_gradient!(s::AdaptiveRegularizationState, X) s.X = X return s end - -function show(io::IO, arcs::AdaptiveRegularizationState) +function 
Base.show(io::IO, arcs::AdaptiveRegularizationState) + print(io, "AdaptiveRegularizationState(", arcs.sub_problem, ", ", arcs.sub_state, "; ") + print(io, "p = ", arcs.p, ", q = ", arcs.q, ", H = ", arcs.H, ", S = ", arcs.S, ", ") + print(io, "retraction_method = ", arcs.retraction_method, ", stopping_criterion = ", arcs.stop, ", ") + print(io, "X = ", arcs.X, ", η1 = ", arcs.η1, ", η2 = ", arcs.η2, ", γ1 = ", arcs.γ1, ", ") + print(io, "γ2 = ", arcs.γ2, ", ρ = ", arcs.ρ, ", ") + print(io, "ρ_denominator = ", arcs.ρ_denominator, ", ρ_regularization = ", arcs.ρ_regularization, ", ") + print(io, "σ = ", arcs.σ, ", σmin = ", arcs.σmin) + return print(io, ")") +end +function status_summary(arcs::AdaptiveRegularizationState; context::Symbol = :default) + (context === :short) && (return repr(arcs)) i = get_count(arcs, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(arcs.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the adaptive regularization with cubics solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(arcs.stop) ?
"Yes" : "No" - sub = repr(arcs.sub_state) - sub = replace(sub, "\n" => "\n | ") + sub = status_summary(arcs.sub_state; context = context) + sub = replace(sub, "\n" => "\n | ", "\n#" => "\n$(_MANOPT_INDENT)##") s = """ # Solver state for `Manopt.jl`s Adaptive Regularization with Cubics (ARC) $Iter @@ -157,10 +147,9 @@ function show(io::IO, arcs::AdaptiveRegularizationState) | $(sub) ## Stopping criterion - - $(status_summary(arcs.stop)) + $(_in_str(status_summary(arcs.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end _doc_ARC_model = """ diff --git a/src/solvers/alternating_gradient_descent.jl b/src/solvers/alternating_gradient_descent.jl index d6b09568fd..da43efcba3 100644 --- a/src/solvers/alternating_gradient_descent.jl +++ b/src/solvers/alternating_gradient_descent.jl @@ -1,3 +1,39 @@ +""" + AlternatingGradientRule <: AbstractGradientGroupDirectionRule + +Create a functor `(problem, state k) -> (s,X)` to evaluate the alternating gradient, +that is alternating between the components of the gradient and has an field for +partial evaluation of the gradient in-place. + +# Fields + +$(_fields(:X)) + +# Constructor + + AlternatingGradientRule(M::AbstractManifold; p=rand(M), X=zero_vector(M, p)) + +Initialize the alternating gradient processor with tangent vector type of `X`, +where both `M` and `p` are just help variables. 
+ +# See also +[`alternating_gradient_descent`](@ref), [`AlternatingGradient`](@ref) +""" +struct AlternatingGradientRule{T} <: AbstractGradientGroupDirectionRule + X::T +end +function AlternatingGradientRule( + M::AbstractManifold; p = rand(M), X::T = zero_vector(M, p) + ) where {T} + return AlternatingGradientRule{T}(X) +end +function Base.show(io::IO, ag::AlternatingGradientRule) + return print(io, "AlternatingGradientRule($(ag.X))") +end +function status_summary(ag::AlternatingGradientRule; context::Symbol = :default) + (context === :short) && return repr(ag) + return "An alternating gradient processor" +end """ AlternatingGradientDescentState <: AbstractGradientDescentSolverState @@ -15,7 +51,7 @@ $(_fields([:retraction_method, :stepsize])) $(_fields(:stopping_criterion; name = "stop")) $(_fields(:p; add_properties = [:as_Iterate])) $(_fields(:X; add_properties = [:as_Gradient])) -* `k`, ì`: internal counters for the outer and inner iterations, respectively. +* `k`, `i`: internal counters for the outer and inner iterations, respectively. # Constructors @@ -31,14 +67,13 @@ $(_kwargs(:stepsize; default = "`[`default_stepsize`](@ref)`(M, AlternatingGradi $(_kwargs(:X)) Generate the options for point `p` and where `inner_iterations`, `order_type`, `order`, -`retraction_method`, `stopping_criterion`, and `stepsize`` are keyword arguments +`retraction_method`, `stopping_criterion`, and `stepsize` are keyword arguments. + +For internal use, there also exists a constructor solely having the fields as keyword arguments, +but then all of them are mandatory.
""" mutable struct AlternatingGradientDescentState{ - P, - T, - D <: DirectionUpdateRule, - TStop <: StoppingCriterion, - TStep <: Stepsize, + P, T, D <: DirectionUpdateRule, TStop <: StoppingCriterion, TStep <: Stepsize, RM <: AbstractRetractionMethod, } <: AbstractGradientSolverState p::P @@ -52,94 +87,72 @@ mutable struct AlternatingGradientDescentState{ k::Int # current iterate i::Int # inner iterate inner_iterations::Int + function AlternatingGradientDescentState( + M::AbstractManifold; + p::P = rand(M), X::T = zero_vector(M, p), + inner_iterations::Int = 5, + order_type::Symbol = :Linear, order::Vector{<:Int} = Int[], + retraction_method::AbstractRetractionMethod = default_retraction_method(M, typeof(p)), + stopping_criterion::StoppingCriterion = StopAfterIteration(1000), + stepsize::Stepsize = default_stepsize(M, AlternatingGradientDescentState), + ) where {P, T} + return AlternatingGradientDescentState(; + p = p, X = X, direction = _produce_type(AlternatingGradient(; p = p, X = X), M), + inner_iterations = inner_iterations, + order_type = order_type, order = order, + retraction_method = retraction_method, stopping_criterion = stopping_criterion, + stepsize = stepsize, + ) + end + function AlternatingGradientDescentState(; + p::P, X::T, direction::D, inner_iterations::Int, order_type::Symbol, order::Vector{<:Int}, + retraction_method::RTM, stopping_criterion::SC, stepsize::S, k::Int = 0, i::Int = 0 + ) where {P, T, RTM <: AbstractRetractionMethod, SC <: StoppingCriterion, S <: Stepsize, D <: AlternatingGradientRule} + return new{P, T, D, SC, S, RTM}( + p, X, direction, stopping_criterion, stepsize, + order_type, order, retraction_method, k, i, inner_iterations, + ) + end end -function AlternatingGradientDescentState( - M::AbstractManifold; - p::P = rand(M), - X::T = zero_vector(M, p), - inner_iterations::Int = 5, - order_type::Symbol = :Linear, - order::Vector{<:Int} = Int[], - retraction_method::AbstractRetractionMethod = default_retraction_method(M, 
typeof(p)), - stopping_criterion::StoppingCriterion = StopAfterIteration(1000), - stepsize::Stepsize = default_stepsize(M, AlternatingGradientDescentState), - ) where {P, T} - return AlternatingGradientDescentState{ - P, - T, - AlternatingGradientRule, - typeof(stopping_criterion), - typeof(stepsize), - typeof(retraction_method), - }( - p, - X, - _produce_type(AlternatingGradient(; p = p, X = X), M), - stopping_criterion, - stepsize, - order_type, - order, - retraction_method, - 0, - 0, - inner_iterations, - ) +function Base.show(io::IO, agds::AlternatingGradientDescentState) + print(io, "AlternatingGradientDescentState(; ") + print(io, "p = $(agds.p), ") + print(io, "X = $(agds.X), ") + print(io, "direction = $(agds.direction), ") + print(io, "inner_iterations = $(agds.inner_iterations), ") + print(io, "order_type = :$(agds.order_type), ") + print(io, "order = $(agds.order), ") + print(io, "retraction_method = $(agds.retraction_method), ") + print(io, "stepsize = $(agds.stepsize), ") + print(io, "stopping_criterion = $(status_summary(agds.stop, context = :short)), ") + return print(io, "i = $(agds.i), k = $(agds.k))") end -function show(io::IO, agds::AlternatingGradientDescentState) +function status_summary(agds::AlternatingGradientDescentState; context::Symbol = :default) + (context === :short) && return repr(agds) Iter = (agds.i > 0) ? "After $(agds.i) iterations\n" : "" Conv = indicates_convergence(agds.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(agds)) – $(Iter) $(has_converged(agds) ? 
"(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Alternating Gradient Descent Solver $Iter ## Parameters * order: :$(agds.order_type) * retraction method: $(agds.retraction_method) - + * direction: $(status_summary(agds.direction; context = :inline)) ## Stepsize $(agds.stepsize) ## Stopping criterion - - $(status_summary(agds.stop)) + $(_in_str(status_summary(agds.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end function get_message(agds::AlternatingGradientDescentState) # for now only step size is quipped with messages return get_message(agds.stepsize) end -""" - AlternatingGradientRule <: AbstractGradientGroupDirectionRule - -Create a functor `(problem, state k) -> (s,X)` to evaluate the alternating gradient, -that is alternating between the components of the gradient and has an field for -partial evaluation of the gradient in-place. - -# Fields - -$(_fields(:X)) - -# Constructor - - AlternatingGradientRule(M::AbstractManifold; p=rand(M), X=zero_vector(M, p)) - -Initialize the alternating gradient processor with tangent vector type of `X`, -where both `M` and `p` are just help variables. 
- -# See also -[`alternating_gradient_descent`](@ref), [`AlternatingGradient`])@ref) -""" -struct AlternatingGradientRule{T} <: AbstractGradientGroupDirectionRule - X::T -end -function AlternatingGradientRule( - M::AbstractManifold; p = rand(M), X::T = zero_vector(M, p) - ) where {T} - return AlternatingGradientRule{T}(X) -end - function (ag::AlternatingGradientRule)( amp::AbstractManoptProblem, agds::AlternatingGradientDescentState, k ) @@ -255,6 +268,7 @@ function initialize_solver!( get_gradient!(amp, agds.X, agds.p) (agds.order_type == :FixedRandom || agds.order_type == :Random) && (shuffle!(agds.order)) + initialize_stepsize!(agds.stepsize) return agds end function step_solver!(amp::AbstractManoptProblem, agds::AlternatingGradientDescentState, k) diff --git a/src/solvers/augmented_Lagrangian_method.jl b/src/solvers/augmented_Lagrangian_method.jl index 47b10f5d96..5b4fa50f30 100644 --- a/src/solvers/augmented_Lagrangian_method.jl +++ b/src/solvers/augmented_Lagrangian_method.jl @@ -166,8 +166,11 @@ function get_message(alms::AugmentedLagrangianMethodState) return get_message(alms.sub_state) end -function show(io::IO, alms::AugmentedLagrangianMethodState) +function status_summary(alms::AugmentedLagrangianMethodState; context::Symbol = :default) + (context === :short) && (return repr(alms)) i = get_count(alms, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(alms.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the augmented Lagrangian method$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(alms.stop) ?
"Yes" : "No" s = """ @@ -182,10 +185,9 @@ function show(io::IO, alms::AugmentedLagrangianMethodState) * current penalty: $(alms.penalty) ## Stopping criterion - - $(status_summary(alms.stop)) + $(_in_str(status_summary(alms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end _doc_alm_λ_update = raw""" diff --git a/src/solvers/cma_es.jl b/src/solvers/cma_es.jl index 9f1b5897c5..c748c9bce0 100644 --- a/src/solvers/cma_es.jl +++ b/src/solvers/cma_es.jl @@ -197,8 +197,11 @@ function CMAESState( ) end -function show(io::IO, s::CMAESState) +function status_summary(s::CMAESState; context::Symbol = :default) + (context === :short) && return repr(s) i = get_count(s, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(s.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the conjugate gradient descent solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(s.stop) ? "Yes" : "No" s = """ @@ -228,10 +231,9 @@ function show(io::IO, s::CMAESState) * σ: $(s.σ) ## Stopping criterion - - $(status_summary(s.stop)) + $(_in_str(status_summary(s.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end # # Access functions @@ -565,20 +567,21 @@ function (c::StopWhenCovarianceIllConditioned)( end return false end -function status_summary(c::StopWhenCovarianceIllConditioned) - has_stopped = c.at_iteration > 0 - s = has_stopped ? 
"reached" : "not reached" - return "cond(s.covariance_matrix) > $(c.threshold):\t$s" -end function get_reason(c::StopWhenCovarianceIllConditioned) if c.at_iteration >= 0 return "At iteration $(c.at_iteration) the condition number of covariance matrix ($(c.last_cond)) exceeded the threshold ($(c.threshold)).\n" end return "" end +function status_summary(c::StopWhenCovarianceIllConditioned; context::Symbol = :default) + (context == :short) && return repr(c) + has_stopped = c.at_iteration > 0 + s = has_stopped ? "reached" : "not reached" + return (_is_inline(context) ? "cond(s.covariance_matrix) > $(c.threshold):\t" : "Stop when the covariance matrix is ill-conditioned, i.e. the last condition number is larger than the threshold of $(c.threshold)\n$(_MANOPT_INDENT)") * s +end function show(io::IO, c::StopWhenCovarianceIllConditioned) return print( - io, "StopWhenCovarianceIllConditioned($(c.threshold))\n $(status_summary(c))" + io, "StopWhenCovarianceIllConditioned($(c.threshold))" ) end @@ -628,7 +631,8 @@ function (c::StopWhenBestCostInGenerationConstant)( end return false end -function status_summary(c::StopWhenBestCostInGenerationConstant) +function status_summary(c::StopWhenBestCostInGenerationConstant; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = is_active_stopping_criterion(c) s = has_stopped ? "reached" : "not reached" return "c.iterations_since_change > $(c.iteration_range):\t$s" @@ -707,12 +711,13 @@ function (c::StopWhenEvolutionStagnates)(::AbstractManoptProblem, s::CMAESState, end return false end -function status_summary(c::StopWhenEvolutionStagnates) +function status_summary(c::StopWhenEvolutionStagnates; context::Symbol = :default) + (context == :short) && return repr(c) has_stopped = is_active_stopping_criterion(c) s = has_stopped ? 
"reached" : "not reached" N = length(c.best_history) if N == 0 - return "best and median fitness not yet filled, stopping criterion:\t$s" + return "best and median fitness not yet filled, stopping criterion:$(_MANOPT_INDENT)$s" end threshold_low = Int(ceil(N * c.fraction)) threshold_high = Int(floor(N * (1 - c.fraction))) @@ -720,7 +725,14 @@ function status_summary(c::StopWhenEvolutionStagnates) median_best_new = median(c.best_history[threshold_high:end]) median_median_old = median(c.median_history[1:threshold_low]) median_median_new = median(c.median_history[threshold_high:end]) - return "generation >= $(c.min_size) && $(median_best_old) <= $(median_best_new) && $(median_median_old) <= $(median_median_new):\t$s" + inline = "generation >= $(c.min_size) && $(median_best_old) <= $(median_best_new) && $(median_median_old) <= $(median_median_new):$(_MANOPT_INDENT)" + _is_inline(context) && return "$(inline)$s" + return """ + A stopping criterion to stop when the evolution stagnates, i.e. + * generation >= $(c.min_size) + * the best mean did not decrease $(median_best_old) <= $(median_best_new)" + * the median did not decrease $(median_median_old) <= $(median_median_new) + overall:$(_MANOPT_INDENT)$s""" end function get_reason(c::StopWhenEvolutionStagnates) if c.at_iteration >= 0 @@ -779,10 +791,11 @@ function (c::StopWhenPopulationStronglyConcentrated)( end return false end -function status_summary(c::StopWhenPopulationStronglyConcentrated) +function status_summary(c::StopWhenPopulationStronglyConcentrated; context::Symbol = :default) + context === :short && return repr(c) has_stopped = is_active_stopping_criterion(c) s = has_stopped ? 
"reached" : "not reached" - return "norm(s.deviations, Inf) < $(c.tol) && norm(s.σ * s.p_c, Inf) < $(c.tol) :\t$s" + return "norm(s.deviations, Inf) < $(c.tol) && norm(s.σ * s.p_c, Inf) < $(c.tol) :$(_MANOPT_INDENT)$s" end function get_reason(c::StopWhenPopulationStronglyConcentrated) if c.at_iteration >= 0 @@ -791,9 +804,7 @@ function get_reason(c::StopWhenPopulationStronglyConcentrated) return "" end function show(io::IO, c::StopWhenPopulationStronglyConcentrated) - return print( - io, "StopWhenPopulationStronglyConcentrated($(c.tol))\n $(status_summary(c))" - ) + return print(io, "StopWhenPopulationStronglyConcentrated($(c.tol))") end """ @@ -828,10 +839,11 @@ function (c::StopWhenPopulationDiverges)(::AbstractManoptProblem, s::CMAESState, end return false end -function status_summary(c::StopWhenPopulationDiverges) +function status_summary(c::StopWhenPopulationDiverges; context::Symbol = :default) + context === :short && return repr(c) has_stopped = is_active_stopping_criterion(c) s = has_stopped ? "reached" : "not reached" - return "cur_σ_times_maxstddev / c.last_σ_times_maxstddev > $(c.tol) :\t$s" + return "cur_σ_times_maxstddev / c.last_σ_times_maxstddev > $(c.tol) :$(_MANOPT_INDENT)$s" end function get_reason(c::StopWhenPopulationDiverges) if c.at_iteration >= 0 @@ -840,7 +852,7 @@ function get_reason(c::StopWhenPopulationDiverges) return "" end function show(io::IO, c::StopWhenPopulationDiverges) - return print(io, "StopWhenPopulationDiverges($(c.tol))\n $(status_summary(c))") + return print(io, "StopWhenPopulationDiverges($(c.tol))") end """ @@ -888,10 +900,11 @@ function (c::StopWhenPopulationCostConcentrated)( end return false end -function status_summary(c::StopWhenPopulationCostConcentrated) +function status_summary(c::StopWhenPopulationCostConcentrated; context::Symbol = :default) + context === :short && return repr(c) has_stopped = is_active_stopping_criterion(c) s = has_stopped ? 
"reached" : "not reached" - return "range of best objective values in the last $(length(c.best_value_history)) generations and all objective values in the current one < $(c.tol) :\t$s" + return "range of best objective values in the last $(length(c.best_value_history)) generations and all objective values in the current one < $(c.tol) :$(_MANOPT_INDENT)$s" end function get_reason(c::StopWhenPopulationCostConcentrated) if c.at_iteration >= 0 @@ -901,6 +914,6 @@ function get_reason(c::StopWhenPopulationCostConcentrated) end function show(io::IO, c::StopWhenPopulationCostConcentrated) return print( - io, "StopWhenPopulationCostConcentrated($(c.tol))\n $(status_summary(c))" + io, "StopWhenPopulationCostConcentrated($(c.tol))" ) end diff --git a/src/solvers/conjugate_gradient_descent.jl b/src/solvers/conjugate_gradient_descent.jl index 600bd4aef2..a16159b686 100644 --- a/src/solvers/conjugate_gradient_descent.jl +++ b/src/solvers/conjugate_gradient_descent.jl @@ -8,27 +8,28 @@ function default_stepsize( M; retraction_method = retraction_method, initial_stepsize = 1.0 ) end -function show(io::IO, cgds::ConjugateGradientDescentState) +function status_summary(cgds::ConjugateGradientDescentState; context::Symbol = :default) + (context === :short) && (return repr(cgs)) i = get_count(cgds, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(cgds.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the conjugate gradient descent solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(cgds.stop) ? 
"Yes" : "No" - s = """ + return """ # Solver state for `Manopt.jl`s Conjugate Gradient Descent Solver $Iter ## Parameters - * conjugate gradient coefficient: $(cgds.coefficient) (last β=$(cgds.β)) - * restart condition: $(cgds.restart_condition) - * retraction method: $(cgds.retraction_method) - * vector transport method: $(cgds.vector_transport_method) + * conjugate gradient coefficient:$(_MANOPT_INDENT)$(cgds.coefficient) (last β=$(cgds.β)) + * restart condition: $(_MANOPT_INDENT)$(cgds.restart_condition) + * retraction method: $(_MANOPT_INDENT)$(cgds.retraction_method) + * vector transport method: $(_MANOPT_INDENT)$(cgds.vector_transport_method) ## Stepsize - $(cgds.stepsize) + $(_in_str(status_summary(cgds.stop; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(cgds.stop)) + $(_in_str(status_summary(cgds.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) end _doc_CG_formula = raw""" @@ -178,6 +179,7 @@ function initialize_solver!(amp::AbstractManoptProblem, cgs::ConjugateGradientDe cgs.δ = -copy(get_manifold(amp), cgs.p, cgs.X) # remember the first gradient in coefficient calculation cgs.coefficient(amp, cgs, 0) + initialize_stepsize!(cgs.stepsize) cgs.β = 0.0 return cgs end diff --git a/src/solvers/conjugate_residual.jl b/src/solvers/conjugate_residual.jl index 992585caef..63c4b3e007 100644 --- a/src/solvers/conjugate_residual.jl +++ b/src/solvers/conjugate_residual.jl @@ -72,9 +72,7 @@ calls_with_kwargs(::typeof(conjugate_residual)) = (conjugate_residual!,) conjugate_residual!(TpM::TangentSpace, args...; kwargs...) 
function conjugate_residual!( - TpM::TangentSpace, - slso::SymmetricLinearSystemObjective, - X; + TpM::TangentSpace, slso::SymmetricLinearSystemObjective, X; stopping_criterion::SC = StopAfterIteration(manifold_dimension(TpM)) | StopWhenRelativeResidualLess( norm(base_manifold(TpM), base_point(TpM), get_b(TpM, slso)), 1.0e-8 @@ -82,9 +80,7 @@ function conjugate_residual!( kwargs..., ) where {SC <: StoppingCriterion} keywords_accepted(conjugate_residual!; kwargs...) - crs = ConjugateResidualState( - TpM, slso; stopping_criterion = stopping_criterion, kwargs... - ) + crs = ConjugateResidualState(TpM, slso; stopping_criterion = stopping_criterion, kwargs...) dslso = decorate_objective!(TpM, slso; kwargs...) dmp = DefaultManoptProblem(TpM, dslso) dcrs = decorate_state!(crs; kwargs...) diff --git a/src/solvers/convex_bundle_method.jl b/src/solvers/convex_bundle_method.jl index b30c7d75b0..87d5db6502 100644 --- a/src/solvers/convex_bundle_method.jl +++ b/src/solvers/convex_bundle_method.jl @@ -144,22 +144,12 @@ $(_kwargs(:X)) $(_kwargs(:vector_transport_method)) """ mutable struct ConvexBundleMethodState{ - P, - T, - Pr <: Union{F, AbstractManoptProblem} where {F}, - St <: AbstractManoptSolverState, - R, - A <: AbstractVector{<:R}, - B <: AbstractVector{Tuple{<:P, <:T}}, - C <: AbstractVector{T}, - D, - I, - IR <: AbstractInverseRetractionMethod, - TR <: AbstractRetractionMethod, - TS <: Stepsize, - TSC <: StoppingCriterion, - VT <: AbstractVectorTransportMethod, - } <: AbstractManoptSolverState where {R <: Real, P, T, I <: Int, Pr} + P, T, Pr <: Union{F, AbstractManoptProblem} where {F}, St <: AbstractManoptSolverState, + R <: Real, A <: AbstractVector{<:R}, B <: AbstractVector{Tuple{<:P, <:T}}, C <: AbstractVector{T}, + D, I <: Int, + IR <: AbstractInverseRetractionMethod, TR <: AbstractRetractionMethod, + TS <: Stepsize, TSC <: StoppingCriterion, VT <: AbstractVectorTransportMethod, + } <: AbstractManoptSolverState atol_λ::R atol_errors::R bundle::B @@ -189,49 +179,40 
@@ mutable struct ConvexBundleMethodState{ sub_state::St ϱ::R function ConvexBundleMethodState( - M::TM, - sub_problem::Pr, - sub_state::St; - p::P = rand(M), - p_estimate = p, - atol_λ::R = eps(), - atol_errors::R = eps(), - bundle_cap::I = 25, - m::R = 1.0e-2, - diameter::R = 50.0, - domain::D = (M, p) -> isfinite(f(M, p)), - k_max = nothing, - k_min = nothing, - k_size = 100, + M::TM, sub_problem::Pr, sub_state::St; + p::P = rand(M), p_estimate = p, atol_λ::Real = eps(), atol_errors::Real = eps(), + bundle_cap::I = 25, m::Real = 1.0e-2, diameter::Real = 50.0, + domain::D = (M, p) -> isfinite(f(M, p)), k_max = nothing, k_min = nothing, k_size = 100, last_stepsize = one(number_eltype(atol_λ)), stepsize::S = default_stepsize(M, ConvexBundleMethodState), inverse_retraction_method::IR = default_inverse_retraction_method(M, typeof(p)), retraction_method::TR = default_retraction_method(M, typeof(p)), - stopping_criterion::SC = StopWhenLagrangeMultiplierLess(1.0e-8) | - StopAfterIteration(5000), + stopping_criterion::SC = StopWhenLagrangeMultiplierLess(1.0e-8) | StopAfterIteration(5000), X::T = zero_vector(M, p), vector_transport_method::VT = default_vector_transport_method(M, typeof(p)), ϱ = nothing, ) where { - D, - IR <: AbstractInverseRetractionMethod, - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - I, - TM <: AbstractManifold, - TR <: AbstractRetractionMethod, - SC <: StoppingCriterion, - S <: Stepsize, - VT <: AbstractVectorTransportMethod, - R <: Real, + D, IR <: AbstractInverseRetractionMethod, P, T, + Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + I, TM <: AbstractManifold, TR <: AbstractRetractionMethod, + SC <: StoppingCriterion, S <: Stepsize, VT <: AbstractVectorTransportMethod, } bundle = Vector{Tuple{P, T}}() g = zero_vector(M, p) + transported_subgradients = Vector{T}() + # “Unify” the real type before calling the internal state constructor + R = 
float(promote_type(typeof.([atol_λ, atol_errors, m, diameter, last_stepsize])...)) + !isnothing(k_max) && (R = promote_type(R, typeof(k_max))) + !isnothing(k_min) && (R = promote_type(R, typeof(k_min))) + !isnothing(ϱ) && (R = promote_type(R, typeof(ϱ))) + atol_λ, atol_errors, m, diameter, last_stepsize = convert.(Ref(R), [atol_λ, atol_errors, m, diameter, last_stepsize]) + !isnothing(k_max) && (k_max = convert(R, k_max)) + !isnothing(k_min) && (k_min = convert(R, k_min)) + !isnothing(ϱ) && (ϱ = convert(R, (ϱ))) + atol_errors = convert(R, atol_errors) + m, diameter, last_stepsize null_stepsize = one(R) linearization_errors = Vector{R}() - transported_subgradients = Vector{T}() ε = zero(R) λ = Vector{R}() ξ = zero(R) @@ -257,53 +238,43 @@ mutable struct ConvexBundleMethodState{ (k_max === nothing) && (k_max = maximum(s)) ϱ = max(ζ_1(k_min, diameter) - one(k_min), one(k_max) - ζ_2(k_max, diameter)) end - return new{ - P, - T, - Pr, - St, - R, - typeof(linearization_errors), - typeof(bundle), - typeof(transported_subgradients), - D, - I, - IR, - TR, - S, - SC, - VT, - }( - atol_λ, - atol_errors, - bundle, - bundle_cap, - diameter, - domain, - g, - inverse_retraction_method, - k_max, - k_min, - last_stepsize, - null_stepsize, - linearization_errors, - m, - p, - copy(M, p), - retraction_method, - stepsize, - stopping_criterion, - transported_subgradients, - vector_transport_method, - X, - ε, - ξ, - λ, - sub_problem, - sub_state, - ϱ, + return ConvexBundleMethodState( + sub_problem, sub_state; + atol_λ = atol_λ, atol_errors = atol_errors, bundle = bundle, bundle_cap = bundle_cap, + diameter = diameter, domain = domain, g = g, inverse_retraction_method = inverse_retraction_method, + k_max = k_max, k_min = k_min, last_stepsize = last_stepsize, null_stepsize = null_stepsize, + linearization_errors = linearization_errors, m = m, p = p, p_last_serious = copy(M, p), + retraction_method = retraction_method, stepsize = stepsize, stopping_criterion = stopping_criterion, + 
transported_subgradients = transported_subgradients, vector_transport_method = vector_transport_method, + X = X, ε = ε, ξ = ξ, λ = λ, ϱ = ϱ + ) + end + # internal constructor + # here we assume / enforce that the type of real is “resolved” to a unified R + function ConvexBundleMethodState( + sub_problem::Pr, sub_state::St; + atol_λ::R, atol_errors::R, bundle::B, bundle_cap::I, diameter::R, domain::D, + g::T, inverse_retraction_method::IR, k_max::R, k_min::R, last_stepsize::R, + null_stepsize::R, linearization_errors::A, m::R, p::P, p_last_serious::P, + retraction_method::TR, stepsize::TS, stopping_criterion::TSC, + transported_subgradients::C, vector_transport_method::VT, + X::T, ε::R, ξ::R, λ::A, ϱ::R + ) where { + P, T, Pr <: (Union{F, AbstractManoptProblem} where {F}), St <: AbstractManoptSolverState, + A <: AbstractVector{<:Real}, B <: AbstractVector{<:Tuple}, C <: AbstractVector, + D, I, R, + IR <: AbstractInverseRetractionMethod, TR <: AbstractRetractionMethod, + TS <: Stepsize, TSC <: StoppingCriterion, VT <: AbstractVectorTransportMethod, + } + return new{P, T, Pr, St, R, A, B, C, D, I, IR, TR, TS, TSC, VT}( + atol_λ, atol_errors, bundle, bundle_cap, diameter, domain, g, + inverse_retraction_method, k_max, k_min, last_stepsize, + null_stepsize, linearization_errors, m, p, p_last_serious, retraction_method, + stepsize, stopping_criterion, transported_subgradients, vector_transport_method, X, ε, ξ, λ, sub_problem, sub_state, ϱ ) end + # resolve an ambiguity + ConvexBundleMethodState(M::AbstractManifold, st::AbstractManoptSolverState; kwargs...) 
= error("Convex Bundle Method state can not be constructed based on $M and the sub state $st, a sub_problem is missing") end function ConvexBundleMethodState( M::AbstractManifold, @@ -325,9 +296,30 @@ function default_stepsize(M::AbstractManifold, ::Type{ConvexBundleMethodState}) return ConstantStepsize(M) end function show(io::IO, cbms::ConvexBundleMethodState) + print(io, "ConvexBundleMethodState(") + print(io, cbms.sub_problem, ", ", cbms.sub_state, "; ") + print(io, "atol_λ = ", cbms.atol_λ, ", atol_errors = ", cbms.atol_errors, ", ") + print(io, "bundle = ", cbms.bundle, ", bundle_cap = ", cbms.bundle_cap, ", ") + print(io, "diameter = ", cbms.diameter, ", domain = ", cbms.domain, ", ") + print(io, "g = ", cbms.g, ", inverse_retraction_method = ", cbms.inverse_retraction_method, ", ") + print(io, "k_max = ", cbms.k_max, ", k_min = ", cbms.k_min, ", ") + print(io, "last_stepsize = ", cbms.last_stepsize, ", linearization_errors = ", cbms.linearization_errors, ", ") + print(io, "null_stepsize = ", cbms.null_stepsize, ", m = ", cbms.m, ", p = ", cbms.p, ", ") + print(io, "p_last_serious = ", cbms.p_last_serious, ", retraction_method = ", cbms.retraction_method, ", ") + print(io, "stepsize = ", cbms.stepsize, ", stopping_criterion = ", cbms.stop, ", ") + print(io, "transported_subgradients = ", cbms.transported_subgradients, ", ") + print(io, "vector_transport_method = ", cbms.vector_transport_method, ", X = ", cbms.X) + print(io, ", ε = ", cbms.ε, ", ξ = ", cbms.ξ, ", ", "λ = ", cbms.λ, ", ϱ = ", cbms.ϱ) + return print(io, ")") +end +function status_summary(cbms::ConvexBundleMethodState; context::Symbol = :default) + (context === :short) && return repr(cbms) i = get_count(cbms, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(cbms.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the Convex Bundle Method$(conv_inl)" Iter = (i > 0) ? 
"After $i iterations\n" : "" Conv = indicates_convergence(cbms.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(cbms)) – $(Iter) $(has_converged(cbms) ? "(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Convex Bundle Method $Iter @@ -345,9 +337,9 @@ function show(io::IO, cbms::ConvexBundleMethodState) * vector transport: $(cbms.vector_transport_method) ## Stopping criterion - $(status_summary(cbms.stop)) + $(_in_str(status_summary(cbms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end function _domain_condition(M, q, p, t, length, domain) @@ -380,20 +372,25 @@ mutable struct DomainBackTrackingStepsize{TRM <: AbstractRetractionMethod, P, F} last_stepsize::F message::String retraction_method::TRM + function DomainBackTrackingStepsize(; + candidate_point::P, contraction_factor::F, initial_stepsize::F, last_stepsize::F, message::String, retraction_method::TRM + ) where {TRM <: AbstractRetractionMethod, P, F} + return new{TRM, P, F}( + candidate_point, contraction_factor, initial_stepsize, last_stepsize, message, retraction_method, + ) + end function DomainBackTrackingStepsize( M::AbstractManifold; candidate_point::P = allocate_result(M, rand), - contraction_factor::F = 0.95, - initial_stepsize::F = 1.0, + contraction_factor::Real = 0.95, + initial_stepsize::Real = 1.0, retraction_method::TRM = default_retraction_method(M), - ) where {TRM, P, F} - return new{TRM, P, F}( - candidate_point, - contraction_factor, - initial_stepsize, - initial_stepsize, - "", # initialize an empty message - retraction_method, + ) where {TRM, P} + F = promote_type(typeof(contraction_factor), typeof(initial_stepsize)) + return DomainBackTrackingStepsize(; + candidate_point = candidate_point, contraction_factor = convert(F, contraction_factor), + initial_stepsize = convert(F, initial_stepsize), last_stepsize = convert(F, initial_stepsize), + message = "", retraction_method = retraction_method, 
) end end @@ -403,45 +400,37 @@ function (dbt::DomainBackTrackingStepsize)( M = get_manifold(amp) dbt.last_stepsize = 1.0 retract!( - M, - dbt.candidate_point, - cbms.p_last_serious, - -dbt.last_stepsize * cbms.g, - dbt.retraction_method, + M, dbt.candidate_point, cbms.p_last_serious, -dbt.last_stepsize * cbms.g, dbt.retraction_method, ) while _domain_condition( - M, - dbt.candidate_point, - cbms.p_last_serious, - dbt.last_stepsize, - norm(M, cbms.p_last_serious, cbms.g), - cbms.domain, + M, dbt.candidate_point, cbms.p_last_serious, dbt.last_stepsize, norm(M, cbms.p_last_serious, cbms.g), cbms.domain, ) dbt.last_stepsize *= dbt.contraction_factor retract!( - M, - dbt.candidate_point, - cbms.p_last_serious, - -dbt.last_stepsize * cbms.g, - dbt.retraction_method, + M, dbt.candidate_point, cbms.p_last_serious, -dbt.last_stepsize * cbms.g, dbt.retraction_method, ) end return dbt.last_stepsize end get_initial_stepsize(dbt::DomainBackTrackingStepsize) = dbt.initial_stepsize -function show(io::IO, dbt::DomainBackTrackingStepsize) - return print( - io, - """ - DomainBackTracking(; - initial_stepsize=$(dbt.initial_stepsize) - retraction_method=$(dbt.retraction_method) - contraction_factor=$(dbt.contraction_factor) - )""", - ) -end -function status_summary(dbt::DomainBackTrackingStepsize) - return "$(dbt)\nand a computed last stepsize of $(dbt.last_stepsize)" +function Base.show(io::IO, dbt::DomainBackTrackingStepsize) + print(io, "DomainBackTrackingStepsize(; candidate_point = ", dbt.candidate_point) + print(io, ", contraction_factor = ", dbt.contraction_factor, ", initial_stepsize = ", dbt.initial_stepsize) + print(io, ", last_stepsize = ", dbt.last_stepsize, ", message = ", dbt.message) + print(io, ", retraction_method = ", dbt.retraction_method) + return print(io, ")") +end +function status_summary(dbt::DomainBackTrackingStepsize; context::Symbol = :default) + (context === :short) && return repr(dbt) + (context === :inline) && return "A domain backtracking step size (last 
step size: $(dbt.last_stepsize))" + return """ + A domain backtracking stepsize + (last step size: $(dbt.last_stepsize)) + + ## Parameters + * contraction factor:$(_MANOPT_INDENT)$(dbt.contraction_factor) + * retraction method: $(_MANOPT_INDENT)$(dbt.retraction_method) + """ end get_message(dbt::DomainBackTrackingStepsize) = dbt.message function get_parameter(dbt::DomainBackTrackingStepsize, s::Val{:Iterate}) @@ -482,22 +471,26 @@ mutable struct NullStepBackTrackingStepsize{TRM <: AbstractRetractionMethod, P, message::String retraction_method::TRM X::T + function NullStepBackTrackingStepsize(; + candidate_point::P, contraction_factor::F, initial_stepsize::F, last_stepsize::F, message::String, retraction_method::TRM, X::T + ) where {TRM <: AbstractRetractionMethod, P, F, T} + return new{TRM, P, F, T}( + candidate_point, contraction_factor, initial_stepsize, last_stepsize, message, retraction_method, X + ) + end function NullStepBackTrackingStepsize( M::AbstractManifold; candidate_point::P = allocate_result(M, rand), - contraction_factor::F = 0.95, - initial_stepsize::F = 1.0, + contraction_factor::Real = 0.95, + initial_stepsize::Real = 1.0, retraction_method::TRM = default_retraction_method(M), X::T = zero_vector(M, candidate_point), - ) where {TRM, P, F, T} - return new{TRM, P, F, T}( - candidate_point, - contraction_factor, - initial_stepsize, - initial_stepsize, - "", # initialize an empty message - retraction_method, - X, + ) where {TRM, P, T} + F = promote_type(typeof(contraction_factor), typeof(initial_stepsize)) + return NullStepBackTrackingStepsize(; + candidate_point = candidate_point, contraction_factor = convert(F, contraction_factor), + initial_stepsize = convert(F, initial_stepsize), last_stepsize = convert(F, initial_stepsize), + message = "", retraction_method = retraction_method, X = X, ) end end @@ -507,34 +500,17 @@ function (nsbt::NullStepBackTrackingStepsize)( M = get_manifold(amp) nsbt.last_stepsize = cbms.last_stepsize retract!( - M, - 
nsbt.candidate_point, - cbms.p_last_serious, - -nsbt.last_stepsize * cbms.g, - nsbt.retraction_method, + M, nsbt.candidate_point, cbms.p_last_serious, -nsbt.last_stepsize * cbms.g, nsbt.retraction_method, ) get_subgradient!(amp, nsbt.X, nsbt.candidate_point) while _null_condition( - amp, - M, - nsbt.candidate_point, - cbms.p_last_serious, - nsbt.X, - cbms.g, - cbms.vector_transport_method, - cbms.inverse_retraction_method, - cbms.m, - nsbt.last_stepsize, - cbms.ξ, - cbms.ϱ, + amp, M, nsbt.candidate_point, cbms.p_last_serious, nsbt.X, cbms.g, + cbms.vector_transport_method, cbms.inverse_retraction_method, + cbms.m, nsbt.last_stepsize, cbms.ξ, cbms.ϱ, ) nsbt.last_stepsize *= nsbt.contraction_factor retract!( - M, - nsbt.candidate_point, - cbms.p_last_serious, - -nsbt.last_stepsize * cbms.g, - nsbt.retraction_method, + M, nsbt.candidate_point, cbms.p_last_serious, -nsbt.last_stepsize * cbms.g, nsbt.retraction_method, ) get_subgradient!(amp, nsbt.X, nsbt.candidate_point) end @@ -548,21 +524,23 @@ function get_parameter(nsbt::NullStepBackTrackingStepsize, s::Val{:Subgradient}) return nsbt.X end function show(io::IO, nsbt::NullStepBackTrackingStepsize) - return print( - io, - """ - NullStepBackTracking(; - initial_stepsize=$(nsbt.initial_stepsize) - retraction_method=$(nsbt.retraction_method) - contraction_factor=$(nsbt.contraction_factor) - candidate_point=$(nsbt.candidate_point) - X=$(nsbt.X) - last_stepsize=$(nsbt.last_stepsize) - )""", - ) -end -function status_summary(nsbt::NullStepBackTrackingStepsize) - return "$(nsbt)\nand a computed last stepsize of $(nsbt.last_stepsize)" + print(io, "NullStepBackTrackingStepsize(; candidate_point = ", nsbt.candidate_point) + print(io, ", contraction_factor = ", nsbt.contraction_factor, ", initial_stepsize = ", nsbt.initial_stepsize) + print(io, ", last_stepsize = ", nsbt.last_stepsize, ", message = ", nsbt.message) + print(io, ", retraction_method = ", nsbt.retraction_method, ", X = ", nsbt.X) + return print(io, ")") +end 
+function status_summary(nsbt::NullStepBackTrackingStepsize; context::Symbol = :default) + (context === :short) && return repr(nsbt) + (context === :inline) && return "A null step backtracking step size (last step size: $(nsbt.last_stepsize))" + return """ + A null step backtracking stepsize + (last step size: $(nsbt.last_stepsize)) + + ## Parameters + * contraction factor:$(_MANOPT_INDENT)$(nsbt.contraction_factor) + * retraction method: $(_MANOPT_INDENT)$(nsbt.retraction_method) + """ end get_message(nsbt::NullStepBackTrackingStepsize) = nsbt.message diff --git a/src/solvers/cyclic_proximal_point.jl b/src/solvers/cyclic_proximal_point.jl index 3d6260782c..163e0d6e05 100644 --- a/src/solvers/cyclic_proximal_point.jl +++ b/src/solvers/cyclic_proximal_point.jl @@ -1,19 +1,3 @@ -function show(io::IO, cpps::CyclicProximalPointState) - i = get_count(cpps, :Iterations) - Iter = (i > 0) ? "After $i iterations\n" : "" - Conv = indicates_convergence(cpps.stop) ? "Yes" : "No" - s = """ - # Solver state for `Manopt.jl`s Cyclic Proximal Point Algorithm - $Iter - ## Parameters - * evaluation order of the proximal maps: :$(cpps.order_type) - - ## Stopping criterion - - $(status_summary(cpps.stop)) - This indicates convergence: $Conv""" - return print(io, s) -end _doc_CPPA = """ cyclic_proximal_point(M, f, proxes_f, p; kwargs...) cyclic_proximal_point(M, mpo, p; kwargs...) 
diff --git a/src/solvers/debug_solver.jl b/src/solvers/debug_solver.jl index ced0849e89..3b5709ce6b 100644 --- a/src/solvers/debug_solver.jl +++ b/src/solvers/debug_solver.jl @@ -8,14 +8,14 @@ triggered (with iteration number `0`) to trigger possible resets function initialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState) initialize_solver!(amp, dss.state) # Call Start - get(dss.debugDictionary, :Start, DebugDivider(""))(amp, get_state(dss), 0) + get(dss.debug_dictionary, :Start, DebugDivider(""))(amp, get_state(dss), 0) # Reset / Init (maybe with print at 0) (before) Iteration for key in [:BeforeIteration, :Iteration] - get(dss.debugDictionary, key, DebugDivider(""))(amp, get_state(dss), 0) + get(dss.debug_dictionary, key, DebugDivider(""))(amp, get_state(dss), 0) end # (just) reset Stop (do not print here) for key in [:Stop] - get(dss.debugDictionary, key, DebugDivider(""))(amp, get_state(dss), -1) + get(dss.debug_dictionary, key, DebugDivider(""))(amp, get_state(dss), -1) end return dss end @@ -26,9 +26,9 @@ Extend the `i`th step of the solver by a hook to run debug prints, that were added to the `:BeforeIteration` and `:Iteration` entries of the debug lists. """ function step_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, k) - get(dss.debugDictionary, :BeforeIteration, DebugDivider(""))(amp, get_state(dss), k) + get(dss.debug_dictionary, :BeforeIteration, DebugDivider(""))(amp, get_state(dss), k) step_solver!(amp, dss.state, k) - get(dss.debugDictionary, :Iteration, DebugDivider(""))(amp, get_state(dss), k) + get(dss.debug_dictionary, :Iteration, DebugDivider(""))(amp, get_state(dss), k) return dss end @@ -41,7 +41,7 @@ that were added to the `:Stop` entry of the debug lists. 
function stop_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, k::Int) stop = stop_solver!(amp, dss.state, k) if stop - get(dss.debugDictionary, :Stop, DebugDivider(""))(amp, get_state(dss), k) + get(dss.debug_dictionary, :Stop, DebugDivider(""))(amp, get_state(dss), k) end return stop end diff --git a/src/solvers/difference_of_convex_algorithm.jl b/src/solvers/difference_of_convex_algorithm.jl index 82bc8f082b..acc4198b63 100644 --- a/src/solvers/difference_of_convex_algorithm.jl +++ b/src/solvers/difference_of_convex_algorithm.jl @@ -48,18 +48,21 @@ mutable struct DifferenceOfConvexState{ sub_state::St stop::SC function DifferenceOfConvexState( - M::AbstractManifold, - sub_problem::Pr, - sub_state::St; - p::P = rand(M), - X::T = zero_vector(M, p), + M::AbstractManifold, sub_problem::Pr, sub_state::St; p::P = rand(M), X::T = zero_vector(M, p), stopping_criterion::SC = StopAfterIteration(300) | StopWhenChangeLess(M, 1.0e-9), ) where { - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - SC <: StoppingCriterion, + P, T, Pr <: Union{AbstractManoptProblem, F} where {F}, + St <: AbstractManoptSolverState, SC <: StoppingCriterion, + } + return DifferenceOfConvexState(sub_problem, sub_state; p = p, X = X, stopping_criterion = stopping_criterion) + end + # resolve an ambiguity + DifferenceOfConvexState(M::AbstractManifold, st::AbstractManoptSolverState; kwargs...) 
= error("Difference of Convex Method state can not be constructed based on $M and the sub state $st, a sub_problem is missing") + function DifferenceOfConvexState( + sub_problem::Pr, sub_state::St; p::P, X::T, stopping_criterion::SC + ) where { + P, T, Pr <: Union{AbstractManoptProblem, F} where {F}, + St <: AbstractManoptSolverState, SC <: StoppingCriterion, } return new{P, T, Pr, St, SC}(p, X, sub_problem, sub_state, stopping_criterion) end @@ -85,13 +88,22 @@ function get_message(dcs::DifferenceOfConvexState) # for now only the sub solver might have messages return get_message(dcs.sub_state) end - -function show(io::IO, dcs::DifferenceOfConvexState) +function Base.show(io::IO, dcs::DifferenceOfConvexState) + print(io, "DifferenceOfConvexState("); print(io, dcs.sub_problem); print(io, ", "); print(io, dcs.sub_state); print(io, "; ") + print(io, "p = "); print(io, dcs.p); print(io, ", ") + print(io, "X = "); print(io, dcs.X); print(io, ", ") + print(io, "stopping_criterion = "); print(io, dcs.stop) + return print(io, ")") +end +function status_summary(dcs::DifferenceOfConvexState; context::Symbol = :default) + (context === :short) && return repr(dcs) i = get_count(dcs, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(dcs.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the difference of convex algorithm$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(dcs.stop) ? 
"Yes" : "No" - sub = repr(dcs.sub_state) - sub = replace(sub, "\n" => "\n | ") + sub = status_summary(dcs.sub_state; context = context) + sub = replace(sub, "\n" => "\n | ", "\n#" => "\n$(_MANOPT_INDENT)##") s = """ # Solver state for `Manopt.jl`s Difference of Convex Algorithm $Iter @@ -100,11 +112,11 @@ function show(io::IO, dcs::DifferenceOfConvexState) | $(sub) ## Stopping criterion - - $(status_summary(dcs.stop)) + $(_in_str(status_summary(dcs.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end + _doc_DoC = """ difference_of_convex_algorithm(M, f, g, ∂h, p=rand(M); kwargs...) difference_of_convex_algorithm(M, mdco, p; kwargs...) @@ -174,15 +186,9 @@ $(_note(:OutputSection)) @doc "$(_doc_DoC)" difference_of_convex_algorithm(M::AbstractManifold, args...; kwargs...) function difference_of_convex_algorithm( - M::AbstractManifold, - f, - g, - ∂h, - p = rand(M); + M::AbstractManifold, f, g, ∂h, p = rand(M); evaluation::AbstractEvaluationType = AllocatingEvaluation(), - grad_g = nothing, - gradient = nothing, - kwargs..., + grad_g = nothing, gradient = nothing, kwargs..., ) p_ = _ensure_mutating_variable(p) f_ = _ensure_mutating_cost(f, p) @@ -194,14 +200,8 @@ function difference_of_convex_algorithm( f_, ∂h_; gradient = gradient_, evaluation = evaluation ) rs = difference_of_convex_algorithm( - M, - mdco, - p_; - g = g_, - evaluation = evaluation, - gradient = gradient_, - grad_g = grad_g_, - kwargs..., + M, mdco, p_; + g = g_, evaluation = evaluation, gradient = gradient_, grad_g = grad_g_, kwargs..., ) return _ensure_matching_output(p, rs) end @@ -217,38 +217,23 @@ calls_with_kwargs(::typeof(difference_of_convex_algorithm)) = (difference_of_con @doc "$(_doc_DoC)" difference_of_convex_algorithm!(M::AbstractManifold, args...; kwargs...) 
function difference_of_convex_algorithm!( - M::AbstractManifold, - f, - g, - ∂h, - p; - evaluation::AbstractEvaluationType = AllocatingEvaluation(), - gradient = nothing, - kwargs..., - ) - mdco = ManifoldDifferenceOfConvexObjective( - f, ∂h; gradient = gradient, evaluation = evaluation - ) - return difference_of_convex_algorithm!( - M, mdco, p; g = g, evaluation = evaluation, kwargs... + M::AbstractManifold, f, g, ∂h, p; + evaluation::AbstractEvaluationType = AllocatingEvaluation(), gradient = nothing, kwargs..., ) + mdco = ManifoldDifferenceOfConvexObjective(f, ∂h; gradient = gradient, evaluation = evaluation) + return difference_of_convex_algorithm!(M, mdco, p; g = g, evaluation = evaluation, kwargs...) end function difference_of_convex_algorithm!( - M::AbstractManifold, - mdco::O, - p; + M::AbstractManifold, mdco::O, p; evaluation::AbstractEvaluationType = AllocatingEvaluation(), - g = nothing, - grad_g = nothing, + g = nothing, grad_g = nothing, gradient = nothing, X = zero_vector(M, p), objective_type = :Riemannian, stopping_criterion = if isnothing(gradient) StopAfterIteration(300) | StopWhenChangeLess(M, 1.0e-9) else - StopAfterIteration(300) | - StopWhenChangeLess(M, 1.0e-9) | - StopWhenGradientNormLess(1.0e-9) + StopAfterIteration(300) | StopWhenChangeLess(M, 1.0e-9) | StopWhenGradientNormLess(1.0e-9) end, # Subsolver Magic Cascade. 
sub_cost = if isnothing(g) @@ -272,9 +257,7 @@ function difference_of_convex_algorithm!( if isnothing(sub_hess) ManifoldGradientObjective(sub_cost, sub_grad; evaluation = evaluation) else - ManifoldHessianObjective( - sub_cost, sub_grad, sub_hess; evaluation = evaluation - ) + ManifoldHessianObjective(sub_cost, sub_grad, sub_hess; evaluation = evaluation) end; objective_type = objective_type, sub_kwargs..., @@ -294,18 +277,12 @@ function difference_of_convex_algorithm!( decorate_state!( if isnothing(sub_hess) GradientDescentState( - M; - p = copy(M, p), - stopping_criterion = sub_stopping_criterion, - sub_kwargs..., + M; p = copy(M, p), stopping_criterion = sub_stopping_criterion, sub_kwargs... ) else TrustRegionsState( - M, - sub_objective; - p = copy(M, p), - stopping_criterion = sub_stopping_criterion, - sub_kwargs..., + M, sub_objective; + p = copy(M, p), stopping_criterion = sub_stopping_criterion, sub_kwargs... ) end; sub_kwargs..., @@ -326,12 +303,8 @@ function difference_of_convex_algorithm!( """, ) dcs = DifferenceOfConvexState( - M, - sub_problem, - maybe_wrap_evaluation_type(sub_state); - p = p, - stopping_criterion = stopping_criterion, - X = X, + M, sub_problem, maybe_wrap_evaluation_type(sub_state); + p = p, stopping_criterion = stopping_criterion, X = X, ) ddcs = decorate_state!(dcs; kwargs...) 
solve!(dmp, ddcs) diff --git a/src/solvers/difference_of_convex_proximal_point.jl b/src/solvers/difference_of_convex_proximal_point.jl index a91fd4a654..b5a8c5d3dd 100644 --- a/src/solvers/difference_of_convex_proximal_point.jl +++ b/src/solvers/difference_of_convex_proximal_point.jl @@ -47,15 +47,8 @@ $(_kwargs(:stopping_criterion; default = "`[StopWhenChangeLess`](@ref)`(1e-8)")) $(_kwargs(:X; add_properties = [:as_Memory])) """ mutable struct DifferenceOfConvexProximalState{ - P, - T, - Pr, - St <: AbstractManoptSolverState, - S <: Stepsize, - SC <: StoppingCriterion, - RTR <: AbstractRetractionMethod, - ITR <: AbstractInverseRetractionMethod, - Tλ, + P, T, Pr, St <: AbstractManoptSolverState, S <: Stepsize, SC <: StoppingCriterion, + RTR <: AbstractRetractionMethod, ITR <: AbstractInverseRetractionMethod, Tλ, } <: AbstractSubProblemSolverState λ::Tλ p::P @@ -69,42 +62,42 @@ mutable struct DifferenceOfConvexProximalState{ stepsize::S stop::SC function DifferenceOfConvexProximalState( - M::AbstractManifold, - sub_problem::Pr, - sub_state::St; - p::P = rand(M), - X::T = zero_vector(M, p), + M::AbstractManifold, sub_problem::Pr, sub_state::St; + p::P = rand(M), X::T = zero_vector(M, p), stepsize::S = ConstantStepsize(M), stopping_criterion::SC = StopWhenChangeLess(M, 1.0e-8), inverse_retraction_method::I = default_inverse_retraction_method(M, typeof(p)), retraction_method::R = default_retraction_method(M, typeof(p)), λ::Fλ = i -> 1, ) where { - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - S <: Stepsize, - St <: AbstractManoptSolverState, - SC <: StoppingCriterion, - I <: AbstractInverseRetractionMethod, - R <: AbstractRetractionMethod, - Fλ, + P, T, Pr <: Union{AbstractManoptProblem, F} where {F}, + S <: Stepsize, St <: AbstractManoptSolverState, SC <: StoppingCriterion, + I <: AbstractInverseRetractionMethod, R <: AbstractRetractionMethod, Fλ, + } + return DifferenceOfConvexProximalState( + sub_problem, sub_state; + λ = λ, p = p, q = copy(M, p), r = 
copy(M, p), X = X, + retraction_method = retraction_method, inverse_retraction_method = inverse_retraction_method, + stepsize = stepsize, stopping_criterion = stopping_criterion, + ) + end + function DifferenceOfConvexProximalState( + sub_problem::Pr, sub_state::St; + λ::Fλ, p::P, q::P, r::P, X::T, + retraction_method::R, inverse_retraction_method::I, stepsize::S, stopping_criterion::SC + ) where { + P, T, Pr <: Union{AbstractManoptProblem, F} where {F}, + S <: Stepsize, St <: AbstractManoptSolverState, SC <: StoppingCriterion, + I <: AbstractInverseRetractionMethod, R <: AbstractRetractionMethod, Fλ, } return new{P, T, Pr, St, S, SC, R, I, Fλ}( - λ, - p, - copy(M, p), - copy(M, p), - sub_problem, - sub_state, - X, - retraction_method, - inverse_retraction_method, - stepsize, - stopping_criterion, + λ, p, q, r, sub_problem, sub_state, X, + retraction_method, inverse_retraction_method, stepsize, stopping_criterion, ) end end +# resolve an ambiguity +DifferenceOfConvexProximalState(M::AbstractManifold, st::AbstractManoptSolverState; kwargs...) = error("Difference of Convex Proximal Method state can not be constructed based on $M and the sub state $st, a sub_problem is missing") function DifferenceOfConvexProximalState( M::AbstractManifold, sub_problem; evaluation::E = AllocatingEvaluation(), kwargs... 
) where {E <: AbstractEvaluationType} @@ -125,12 +118,28 @@ function get_message(dcs::DifferenceOfConvexProximalState) # for now only the sub solver might have messages return get_message(dcs.sub_state) end -function show(io::IO, dcps::DifferenceOfConvexProximalState) +function Base.show(io::IO, dcps::DifferenceOfConvexProximalState) + print(io, "DifferenceOfConvexProximalState("); print(io, dcps.sub_problem); print(io, ", "); print(io, dcps.sub_state); print(io, "; ") + print(io, "p = "); print(io, dcps.p); print(io, ", ") + print(io, "q = "); print(io, dcps.q); print(io, ", ") + print(io, "r = "); print(io, dcps.r); print(io, ", ") + print(io, "λ = "); print(io, dcps.λ); print(io, ", ") + print(io, "retraction_method = "); print(io, dcps.retraction_method); print(io, ", ") + print(io, "inverse_retraction_method = "); print(io, dcps.inverse_retraction_method); print(io, ", ") + print(io, "stopping_criterion = "); print(io, dcps.stop); print(io, ", ") + print(io, "stepsize = "); print(io, dcps.stepsize) + return print(io, ")") +end +function status_summary(dcps::DifferenceOfConvexProximalState; context::Symbol = :default) + (context === :short) && return repr(dcps) i = get_count(dcps, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(dcps.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the difference of convex proximal point algorithm$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(dcps.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(dcps)) – $(Iter) $(has_converged(dcps) ? 
"(converged)" : "")") sub = repr(dcps.sub_state) - sub = replace(sub, "\n" => "\n | ") + sub = replace(sub, "\n" => "\n | ", "\n#" => "\n$(_MANOPT_INDENT)##") s = """ # Solver state for `Manopt.jl`s Difference of Convex Proximal Point Algorithm $Iter @@ -141,13 +150,12 @@ function show(io::IO, dcps::DifferenceOfConvexProximalState) | $(sub) ## Stepsize - $(dcps.stepsize) + $(_in_str(status_summary(dcps.stepsize; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(dcps.stop)) + $(_in_str(status_summary(dcps.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end # # Prox approach @@ -231,15 +239,9 @@ $(_note(:OutputSection)) @doc "$(_doc_DCPPA)" difference_of_convex_proximal_point(M::AbstractManifold, args...; kwargs...) function difference_of_convex_proximal_point( - M::AbstractManifold, - grad_h, - p = rand(M); - cost = nothing, - evaluation::AbstractEvaluationType = AllocatingEvaluation(), - gradient = nothing, - g = nothing, - grad_g = nothing, - prox_g = nothing, + M::AbstractManifold, grad_h, p = rand(M); + cost = nothing, evaluation::AbstractEvaluationType = AllocatingEvaluation(), + gradient = nothing, g = nothing, grad_g = nothing, prox_g = nothing, kwargs..., ) keywords_accepted(difference_of_convex_proximal_point; kwargs...) 
@@ -255,15 +257,9 @@ function difference_of_convex_proximal_point( grad_h_; cost = cost_, gradient = gradient_, evaluation = evaluation ) rs = difference_of_convex_proximal_point( - M, - mdcpo, - p_; - cost = cost_, - evaluation = evaluation, - gradient = gradient_, - g = g_, - grad_g = grad_g_, - prox_g = prox_g_, + M, mdcpo, p_; + cost = cost_, evaluation = evaluation, + gradient = gradient_, g = g_, grad_g = grad_g_, prox_g = prox_g_, kwargs..., ) return _ensure_matching_output(p, rs) @@ -283,13 +279,9 @@ calls_with_kwargs(::typeof(difference_of_convex_proximal_point)) = (difference_o @doc "$(_doc_DCPPA)" difference_of_convex_proximal_point!(M::AbstractManifold, args...; kwargs...) function difference_of_convex_proximal_point!( - M::AbstractManifold, - grad_h, - p; + M::AbstractManifold, grad_h, p; evaluation::AbstractEvaluationType = AllocatingEvaluation(), - cost = nothing, - gradient = nothing, - kwargs..., + cost = nothing, gradient = nothing, kwargs..., ) mdcpo = ManifoldDifferenceOfConvexProximalObjective( grad_h; cost = cost, gradient = gradient, evaluation = evaluation @@ -299,12 +291,8 @@ function difference_of_convex_proximal_point!( ) end function difference_of_convex_proximal_point!( - M::AbstractManifold, - mdcpo::O, - p; - g = nothing, - grad_g = nothing, - prox_g = nothing, + M::AbstractManifold, mdcpo::O, p; + g = nothing, grad_g = nothing, prox_g = nothing, X = zero_vector(M, p), λ = i -> 1 / 2, evaluation::AbstractEvaluationType = AllocatingEvaluation(), @@ -315,9 +303,7 @@ function difference_of_convex_proximal_point!( stopping_criterion = if isnothing(get_gradient_function(mdcpo)) StopAfterIteration(300) | StopWhenChangeLess(M, 1.0e-9) else - StopAfterIteration(300) | - StopWhenChangeLess(M, 1.0e-9) | - StopWhenGradientNormLess(1.0e-9) + StopAfterIteration(300) | StopWhenChangeLess(M, 1.0e-9) | StopWhenGradientNormLess(1.0e-9) end, sub_cost = isnothing(g) ? 
nothing : ProximalDCCost(g, copy(M, p), λ(1)), sub_grad = if isnothing(grad_g) @@ -403,12 +389,8 @@ function difference_of_convex_proximal_point!( dmdcpo = decorate_objective!(M, mdcpo; objective_type = objective_type, kwargs...) dmp = DefaultManoptProblem(M, dmdcpo) dcps = DifferenceOfConvexProximalState( - M, - sub_problem, - maybe_wrap_evaluation_type(sub_state); - p = p, - X = X, - stepsize = _produce_type(stepsize, M, p), + M, sub_problem, maybe_wrap_evaluation_type(sub_state); + p = p, X = X, stepsize = _produce_type(stepsize, M, p), stopping_criterion = stopping_criterion, inverse_retraction_method = inverse_retraction_method, retraction_method = retraction_method, @@ -449,9 +431,7 @@ end =# function step_solver!( amp::AbstractManoptProblem, - dcps::DifferenceOfConvexProximalState{ - P, T, <:Function, ClosedFormSubSolverState{InplaceEvaluation}, - }, + dcps::DifferenceOfConvexProximalState{P, T, <:Function, ClosedFormSubSolverState{InplaceEvaluation}}, k, ) where {P, T} M = get_manifold(amp) @@ -469,9 +449,7 @@ end =# function step_solver!( amp::AbstractManoptProblem, - dcps::DifferenceOfConvexProximalState{ - P, T, <:AbstractManoptProblem, <:AbstractManoptSolverState, - }, + dcps::DifferenceOfConvexProximalState{P, T, <:AbstractManoptProblem, <:AbstractManoptSolverState}, k, ) where {P, T} M = get_manifold(amp) diff --git a/src/solvers/exact_penalty_method.jl b/src/solvers/exact_penalty_method.jl index 58520605de..175795181a 100644 --- a/src/solvers/exact_penalty_method.jl +++ b/src/solvers/exact_penalty_method.jl @@ -49,11 +49,8 @@ $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(300)`$( [`exact_penalty_method`](@ref) """ mutable struct ExactPenaltyMethodState{ - P, - Pr <: Union{F, AbstractManoptProblem} where {F}, - St <: AbstractManoptSolverState, - R <: Real, - TStopping <: StoppingCriterion, + P, Pr <: Union{F, AbstractManoptProblem} where {F}, St <: AbstractManoptSolverState, + R <: Real, TStopping <: StoppingCriterion, } <: 
AbstractSubProblemSolverState p::P sub_problem::Pr @@ -68,45 +65,33 @@ mutable struct ExactPenaltyMethodState{ θ_ϵ::R stop::TStopping function ExactPenaltyMethodState( - M::AbstractManifold, - sub_problem::Pr, - sub_state::St; + M::AbstractManifold, sub_problem::Pr, sub_state::St; p::P = rand(M), - ϵ::R = 1.0e-3, - ϵ_min::R = 1.0e-6, - ϵ_exponent = 1 / 100, - θ_ϵ = (ϵ_min / ϵ)^(ϵ_exponent), - u::R = 1.0e-1, - u_min::R = 1.0e-6, - u_exponent = 1 / 100, - θ_u = (u_min / u)^(u_exponent), - ρ::R = 1.0, - θ_ρ::R = 0.3, - stopping_criterion::SC = StopAfterIteration(300) | ( - StopWhenSmallerOrEqual(:ϵ, ϵ_min) | StopWhenChangeLess(M, 1.0e-10) - ), + ϵ::Real = 1.0e-3, ϵ_min::Real = 1.0e-6, ϵ_exponent::Real = 1 / 100, + θ_ϵ::Real = (ϵ_min / ϵ)^(ϵ_exponent), + u::Real = 1.0e-1, u_min::Real = 1.0e-6, u_exponent::Real = 1 / 100, + θ_u::Real = (u_min / u)^(u_exponent), ρ::Real = 1.0, θ_ρ::Real = 0.3, + stopping_criterion::SC = StopAfterIteration(300) | (StopWhenSmallerOrEqual(:ϵ, ϵ_min) | StopWhenChangeLess(M, 1.0e-10)), ) where { - P, - Pr <: Union{F, AbstractManoptProblem} where {F}, - St <: AbstractManoptSolverState, - R <: Real, - SC <: StoppingCriterion, + P, Pr <: Union{F, AbstractManoptProblem} where {F}, St <: AbstractManoptSolverState, SC <: StoppingCriterion, } - sub_state_storage = maybe_wrap_evaluation_type(sub_state) - epms = new{P, Pr, typeof(sub_state_storage), R, SC}() - epms.p = p - epms.sub_problem = sub_problem - epms.sub_state = sub_state_storage - epms.ϵ = ϵ - epms.ϵ_min = ϵ_min - epms.u = u - epms.u_min = u_min - epms.ρ = ρ - epms.θ_ρ = θ_ρ - epms.θ_u = θ_u - epms.θ_ϵ = θ_ϵ - epms.stop = stopping_criterion - return epms + _sub_state = maybe_wrap_evaluation_type(sub_state) + # unify real type – for the keyword args that are stored in the state + R = float(promote_type(typeof.([ϵ, ϵ_min, u, u_min, ρ, θ_u, θ_ϵ, θ_ρ])...)) + ϵ = convert(R, ϵ); ϵ_min = convert(R, ϵ_min); u = convert(R, u); u_min = convert(R, u_min) + θ_ϵ = convert(R, θ_ϵ); θ_u = convert(R, 
θ_u); θ_ρ = convert(R, θ_ρ); ρ = convert(R, ρ) + return ExactPenaltyMethodState( + sub_problem, _sub_state; p = p, stopping_criterion = stopping_criterion, + ϵ = ϵ, ϵ_min = ϵ_min, u = u, u_min = u_min, ρ = ρ, θ_ρ = θ_ρ, θ_u = θ_u, θ_ϵ = θ_ϵ, + ) + end + function ExactPenaltyMethodState( + sub_problem::Pr, sub_state::St; + p::P, ϵ::R, ϵ_min::R, θ_ϵ::R, u::R, u_min::R, θ_u::R, ρ::R, θ_ρ::R, stopping_criterion::SC + ) where {P, R, Pr <: Union{F, AbstractManoptProblem} where {F}, St <: AbstractManoptSolverState, SC <: StoppingCriterion} + return new{P, Pr, St, R, SC}( + p, sub_problem, sub_state, ϵ, ϵ_min, u, u_min, ρ, θ_ρ, θ_u, θ_ϵ, stopping_criterion + ) end end function ExactPenaltyMethodState( @@ -115,7 +100,8 @@ function ExactPenaltyMethodState( cfs = ClosedFormSubSolverState(; evaluation = evaluation) return ExactPenaltyMethodState(M, sub_problem, cfs; kwargs...) end - +# resolve an ambiguity +ExactPenaltyMethodState(M::AbstractManifold, st::AbstractManoptSolverState; kwargs...) = error("Exact penalty method state can not be constructed based on $M and the sub state $st, a sub_problem is missing") get_iterate(epms::ExactPenaltyMethodState) = epms.p function get_message(epms::ExactPenaltyMethodState) # for now only the sub solver might have messages @@ -125,10 +111,22 @@ function set_iterate!(epms::ExactPenaltyMethodState, M, p) epms.p = p return epms end -function show(io::IO, epms::ExactPenaltyMethodState) +function Base.show(io::IO, epms::ExactPenaltyMethodState) + print(io, "ExactPenaltyMethodState("); print(io, epms.sub_problem); print(io, ", "); print(io, epms.sub_state) + print(io, "; ") + print(io, "p = $(epms.p), ϵ = $(epms.ϵ), ϵ_min = $(epms.ϵ_min), θ_ϵ = $(epms.θ_ϵ), ") + print(io, "u = $(epms.u), u_min = $(epms.u_min), θ_u = $(epms.θ_u), ρ = $(epms.ρ), θ_ρ = $(epms.θ_ρ), ") + print(io, "stopping_criterion = $(epms.stop)") + return print(io, ")") +end +function status_summary(epms::ExactPenaltyMethodState; context::Symbol = :default) + (context === 
:short) && return repr(epms) i = get_count(epms, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(epms.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the exact penalty method$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(epms.stop) ? "Yes" : "No" + (context === :inline) && (return "An exact penalty method state – $(Iter) $(has_converged(epms) ? "(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Exact Penalty Method $Iter @@ -138,10 +136,9 @@ function show(io::IO, epms::ExactPenaltyMethodState) * ρ: $(epms.ρ) (θ_ρ: $(epms.θ_ρ)) ## Stopping criterion - - $(status_summary(epms.stop)) + $(_in_str(status_summary(epms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end _doc_EPM_penalty = raw""" diff --git a/src/solvers/gradient_descent.jl b/src/solvers/gradient_descent.jl index 06d0aa14ed..ddac27e73d 100644 --- a/src/solvers/gradient_descent.jl +++ b/src/solvers/gradient_descent.jl @@ -25,7 +25,7 @@ $(_args(:M)) ## Keyword arguments -* `direction=`[`IdentityUpdateRule`](@ref)`()` +* `direction=`[`IdentityUpdateRule`](@ref)`()` specify a processor to modify the gradient direction $(_kwargs(:p; add_properties = [:as_Initial])) $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(100)")) $(_kwargs(:stepsize; default = "`[`default_stepsize`](@ref)`(M, `[`GradientDescentState`](@ref)`; retraction_method=retraction_method)")) @@ -52,9 +52,8 @@ mutable struct GradientDescentState{ retraction_method::TRTM end function GradientDescentState( - M::AbstractManifold; - p::P = rand(M), - X::T = zero_vector(M, p), + M::AbstractManifold = ManifoldsBase.DefaultManifold(); + p::P = rand(M), X::T = zero_vector(M, p), stopping_criterion::SC = StopAfterIteration(200) | StopWhenGradientNormLess(1.0e-8), retraction_method::RTM = default_retraction_method(M, 
typeof(p)), stepsize::S = _produce_type( @@ -65,12 +64,8 @@ function GradientDescentState( direction::D = IdentityUpdateRule(), kwargs..., # ignore rest ) where { - P, - T, - SC <: StoppingCriterion, - RTM <: AbstractRetractionMethod, - S <: Stepsize, - D <: DirectionUpdateRule, + P, T, SC <: StoppingCriterion, RTM <: AbstractRetractionMethod, + S <: Stepsize, D <: DirectionUpdateRule, } return GradientDescentState{P, T, SC, S, D, RTM}( p, X, direction, stepsize, stopping_criterion, retraction_method @@ -97,8 +92,19 @@ function get_message(gds::GradientDescentState) # for now only step size is quipped with messages return get_message(gds.stepsize) end -function show(io::IO, gds::GradientDescentState) + +function Base.show(io::IO, gds::GradientDescentState) + print(io, "GradientDescentState(; direction = ", gds.direction, ", p = ", gds.p) + print(io, ", stepsize = ", gds.stepsize, ", stopping_criterion = ", status_summary(gds.stop; context = :short)) + print(io, ", retraction_method = ", gds.retraction_method, ", X = ", gds.X) + return print(io, ")") +end + +function status_summary(gds::GradientDescentState; context::Symbol = :default) + (context === :short) && return repr(gds) i = get_count(gds, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(gds.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the gradient descent solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(gds.stop) ? 
"Yes" : "No" s = """ @@ -106,15 +112,15 @@ function show(io::IO, gds::GradientDescentState) $Iter ## Parameters * retraction method: $(gds.retraction_method) + * direction: $(status_summary(gds.direction; context = :inline)) ## Stepsize - $(gds.stepsize) + $(_in_str(status_summary(gds.stepsize; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(gds.stop)) + $(_in_str(status_summary(gds.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end _doc_gd_iterate = raw""" @@ -254,6 +260,7 @@ calls_with_kwargs(::typeof(gradient_descent!)) = (decorate_objective!, decorate_ # function initialize_solver!(mp::AbstractManoptProblem, s::GradientDescentState) get_gradient!(mp, s.X, s.p) + initialize_stepsize!(s.stepsize) return s end function step_solver!(p::AbstractManoptProblem, s::GradientDescentState, k) diff --git a/src/solvers/interior_point_Newton.jl b/src/solvers/interior_point_Newton.jl index 1d39a6d47f..fc4c96c284 100644 --- a/src/solvers/interior_point_Newton.jl +++ b/src/solvers/interior_point_Newton.jl @@ -9,7 +9,7 @@ _doc_IPN = """ interior_point_Newton(M, f, grad_f, Hess_f, p=rand(M); kwargs...) interior_point_Newton(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...) interior_point_Newton!(M, f, grad]_f, Hess_f, p; kwargs...) - interior_point_Newton(M, ConstrainedManifoldObjective, p; kwargs...) + interior_point_Newton!(M, cmo::ConstrainedManifoldObjective, p; kwargs...) perform the interior point Newton method following [LaiYoshise:2024](@cite). 
@@ -69,8 +69,8 @@ $(_kwargs(:retraction_method)) * `s=copy(μ)`: initial value for the slack variables * `σ=`[`calculate_σ`](@ref)`(M, cmo, p, μ, λ, s)`: scaling factor for the barrier parameter `β` in the sub problem, which is updated during the iterations * `step_objective`: a [`ManifoldGradientObjective`](@ref) of the norm of the KKT vector field [`KKTVectorFieldNormSq`](@ref) and its gradient [`KKTVectorFieldNormSqGradient`](@ref) -* `step_problem`: the manifold ``$(_math(:Manifold))nifold))) × ℝ^m × ℝ^n × ℝ^m`` together with the `step_objective` - as the problem the linesearch `stepsize=` employs for determining a step size +* `step_problem`: the manifold ``$(_math(:Manifold)) × ℝ^m × ℝ^n × ℝ^m`` together with the `step_objective` + as the problem the line search `stepsize=` employs for determining a step size * `step_state`: the [`StepsizeState`](@ref) with point and search direction $(_kwargs(:stepsize; default = "`[`ArmijoLinesearch`](@ref)`()")) with the `centrality_condition` keyword as additional criterion to accept a step, if this is provided")) @@ -193,7 +193,7 @@ function interior_point_Newton!( ), step_problem = DefaultManoptProblem(_step_M, step_objective), _step_p = rand(_step_M), - step_state = StepsizeState(_step_p, zero_vector(_step_M, _step_p)), + step_state = StepsizeState(; p = _step_p, X = zero_vector(_step_M, _step_p)), stepsize::Union{Stepsize, ManifoldDefaultsFactory} = ArmijoLinesearch( _step_M; retraction_method = default_retraction_method(_step_M), diff --git a/src/solvers/particle_swarm.jl b/src/solvers/particle_swarm.jl index bc141854c1..ddd704fb8c 100644 --- a/src/solvers/particle_swarm.jl +++ b/src/solvers/particle_swarm.jl @@ -48,76 +48,80 @@ $(_kwargs(:vector_transport_method)) [`particle_swarm`](@ref) """ mutable struct ParticleSwarmState{ - P, - T, - TX <: AbstractVector{P}, - TVelocity <: AbstractVector{T}, - TParams <: Real, - TStopping <: StoppingCriterion, - TRetraction <: AbstractRetractionMethod, - TInvRetraction <: 
AbstractInverseRetractionMethod, - TVTM <: AbstractVectorTransportMethod, + P, T, F <: Real, VP <: AbstractVector{P}, VT <: AbstractVector{T}, + SC <: StoppingCriterion, RM <: AbstractRetractionMethod, + IRM <: AbstractInverseRetractionMethod, VTM <: AbstractVectorTransportMethod, } <: AbstractManoptSolverState - swarm::TX - positional_best::TX + swarm::VP + positional_best::VP p::P - velocity::TVelocity - inertia::TParams - social_weight::TParams - cognitive_weight::TParams + velocity::VT + inertia::F + social_weight::F + cognitive_weight::F q::P social_vector::T cognitive_vector::T - stop::TStopping - retraction_method::TRetraction - inverse_retraction_method::TInvRetraction - vector_transport_method::TVTM - + stop::SC + retraction_method::RM + inverse_retraction_method::IRM + vector_transport_method::VTM + function ParticleSwarmState(; + swarm::VP, positional_best::VP, p::P, velocity::VT, + inertia::F, social_weight::F, cognitive_weight::F, q::P, social_vector::T, + cognitive_vector::T, stopping_criterion::SC, retraction_method::RM, + inverse_retraction_method::IRM, vector_transport_method::VTM + ) where { + P, T, F, VP <: AbstractVector, VT <: AbstractVector, SC <: StoppingCriterion, + RM <: AbstractRetractionMethod, IRM <: AbstractInverseRetractionMethod, VTM <: AbstractVectorTransportMethod, + } + return new{P, T, F, VP, VT, SC, RM, IRM, VTM}( + swarm, positional_best, p, velocity, inertia, social_weight, cognitive_weight, q, + social_vector, cognitive_vector, stopping_criterion, retraction_method, + inverse_retraction_method, vector_transport_method + ) + end function ParticleSwarmState( - M::AbstractManifold, - swarm::VP, - velocity::VT; - inertia = 0.65, - social_weight = 1.4, - cognitive_weight = 1.4, + M::AbstractManifold, swarm::VP, velocity::VT; + inertia::Real = 0.65, social_weight::Real = 1.4, cognitive_weight::Real = 1.4, stopping_criterion::SCT = StopAfterIteration(500) | StopWhenChangeLess(M, 1.0e-4), retraction_method::RTM = 
default_retraction_method(M, eltype(swarm)), inverse_retraction_method::IRM = default_inverse_retraction_method(M, eltype(swarm)), vector_transport_method::VTM = default_vector_transport_method(M, eltype(swarm)), ) where { - P, - T, - VP <: AbstractVector{<:P}, - VT <: AbstractVector{<:T}, - RTM <: AbstractRetractionMethod, - SCT <: StoppingCriterion, - IRM <: AbstractInverseRetractionMethod, - VTM <: AbstractVectorTransportMethod, + P, T, VP <: AbstractVector{<:P}, VT <: AbstractVector{<:T}, + RTM <: AbstractRetractionMethod, SCT <: StoppingCriterion, + IRM <: AbstractInverseRetractionMethod, VTM <: AbstractVectorTransportMethod, } - s = new{ - P, T, VP, VT, typeof(inertia + social_weight + cognitive_weight), SCT, RTM, IRM, VTM, - }() - s.swarm = swarm - s.positional_best = copy.(Ref(M), swarm) - s.q = copy(M, first(swarm)) - s.p = copy(M, first(swarm)) - s.social_vector = zero_vector(M, s.q) - s.cognitive_vector = zero_vector(M, s.q) - s.velocity = velocity - s.inertia = inertia - s.social_weight = social_weight - s.cognitive_weight = cognitive_weight - s.stop = stopping_criterion - s.retraction_method = retraction_method - s.inverse_retraction_method = inverse_retraction_method - s.vector_transport_method = vector_transport_method - return s + R = promote_type(typeof(inertia), typeof(social_weight), typeof(cognitive_weight)) + inertia = convert(R, inertia); social_weight = convert(R, social_weight); cognitive_weight = convert(R, cognitive_weight) + return ParticleSwarmState(; + swarm = swarm, positional_best = copy.(Ref(M), swarm), + q = copy(M, first(swarm)), p = copy(M, first(swarm)), + social_vector = zero_vector(M, first(swarm)), cognitive_vector = zero_vector(M, first(swarm)), + velocity = velocity, inertia = inertia, social_weight = social_weight, cognitive_weight = cognitive_weight, + stopping_criterion = stopping_criterion, retraction_method = retraction_method, + inverse_retraction_method = inverse_retraction_method, vector_transport_method = 
vector_transport_method + ) end end -function show(io::IO, pss::ParticleSwarmState) +function Base.show(io::IO, pss::ParticleSwarmState) + print(io, "ParticleSwarmState(; swarm = ", pss.swarm, ", positional_best = ", pss.positional_best) + print(io, ", p = ", pss.p, ", velocity = ", pss.velocity) + print(io, ", inertia = ", pss.inertia, ", social_weight = ", pss.social_weight, ", cognitive_weight = ", pss.cognitive_weight) + print(io, ", q = ", pss.q, ", social_vector = ", pss.social_vector, ", cognitive_vector = ", pss.cognitive_vector) + print(io, ", stopping_criterion = ", pss.stop, ", retraction_method = ", pss.retraction_method) + print(io, ", inverse_retraction_method = ", pss.inverse_retraction_method, ", vector_transport_method = ", pss.vector_transport_method) + return print(io, ")") +end +function status_summary(pss::ParticleSwarmState; context::Symbol = :default) + (context === :short) && return repr(pss) i = get_count(pss, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(pss.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the particle swarm solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(pss.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(pss)) – $(Iter) $(has_converged(pss) ? "(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Particle Swarm Optimization Algorithm $Iter @@ -130,10 +134,9 @@ function show(io::IO, pss::ParticleSwarmState) * vector transport method: $(pss.vector_transport_method) ## Stopping criterion - - $(status_summary(pss.stop)) + $(_in_str(status_summary(pss.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end # # Access functions @@ -300,16 +303,10 @@ function particle_swarm!( dmco = decorate_objective!(M, mco; kwargs...) 
mp = DefaultManoptProblem(M, dmco) pss = ParticleSwarmState( - M, - swarm, - velocity; - inertia = inertia, - social_weight = social_weight, - cognitive_weight = cognitive_weight, - stopping_criterion = stopping_criterion, - retraction_method = retraction_method, - inverse_retraction_method = inverse_retraction_method, - vector_transport_method = vector_transport_method, + M, swarm, velocity; + inertia = inertia, social_weight = social_weight, cognitive_weight = cognitive_weight, + stopping_criterion = stopping_criterion, retraction_method = retraction_method, + inverse_retraction_method = inverse_retraction_method, vector_transport_method = vector_transport_method, ) dpss = decorate_state!(pss; kwargs...) solve!(mp, dpss) @@ -330,11 +327,7 @@ function step_solver!(mp::AbstractManoptProblem, s::ParticleSwarmState, ::Any) M = get_manifold(mp) for i in 1:length(s.swarm) inverse_retract!( - M, - s.cognitive_vector, - s.swarm[i], - s.positional_best[i], - s.inverse_retraction_method, + M, s.cognitive_vector, s.swarm[i], s.positional_best[i], s.inverse_retraction_method, ) inverse_retract!(M, s.social_vector, s.swarm[i], s.p, s.inverse_retraction_method) s.velocity[i] .= @@ -408,11 +401,11 @@ function get_reason(c::StopWhenSwarmVelocityLess) end return "" end -function status_summary(c::StopWhenSwarmVelocityLess) +function status_summary(c::StopWhenSwarmVelocityLess; context::Symbol = :default) has_stopped = (c.at_iteration >= 0) && (norm(c.velocity_norms) < c.threshold) s = has_stopped ? 
"reached" : "not reached" - return "swarm velocity norm < $(c.threshold):\t$s" + return "swarm velocity norm < $(c.threshold):$(_MANOPT_INDENT)$s" end -function show(io::IO, c::StopWhenSwarmVelocityLess) - return print(io, "StopWhenSwarmVelocityLess($(c.threshold))\n $(status_summary(c))") +function Base.show(io::IO, c::StopWhenSwarmVelocityLess) + return print(io, "StopWhenSwarmVelocityLess($(c.threshold))") end diff --git a/src/solvers/projected_gradient_method.jl b/src/solvers/projected_gradient_method.jl index 8902b10711..f53bf02102 100644 --- a/src/solvers/projected_gradient_method.jl +++ b/src/solvers/projected_gradient_method.jl @@ -69,10 +69,14 @@ end get_iterate(pgms::ProjectedGradientMethodState) = pgms.p get_gradient(pgms::ProjectedGradientMethodState) = pgms.X -function show(io::IO, pgms::ProjectedGradientMethodState) +function status_summary(pgms::ProjectedGradientMethodState; context::Symbol = :default) + (context === :short) && return repr(pgms) i = get_count(pgms, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(pgms.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the projected gradient solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(pgms.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(pgms)) – $(Iter) $(has_converged(pgms) ? 
"(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Projected Gradient Method $Iter @@ -81,16 +85,15 @@ function show(io::IO, pgms::ProjectedGradientMethodState) * retraction method: $(pgms.retraction_method) ## Stepsize for the gradient step - $(pgms.stepsize) + $(_in_str(status_summary(pgms.stepsize; context = context); indent = 0, headers = 1)) ## Stepsize for the complete step - $(pgms.backtrack) + $(_in_str(status_summary(pgms.backtrack; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(pgms.stop)) + $(_in_str(status_summary(pgms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end # @@ -146,13 +149,14 @@ end indicates_convergence(c::StopWhenProjectedGradientStationary) = true function show(io::IO, c::StopWhenProjectedGradientStationary) return print( - io, "StopWhenProjectedGradientStationary($(c.threshold))\n $(status_summary(c))" + io, "StopWhenProjectedGradientStationary($(c.threshold))" ) end -function status_summary(c::StopWhenProjectedGradientStationary) +function status_summary(c::StopWhenProjectedGradientStationary; context::Symbol = :default) + (context === :short) && return repr(c) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "projected gradient stationary (<$(c.threshold)): \t$s" + return (_is_inline(context) ? "projected gradient stationary (<$(c.threshold)):$(_MANOPT_INDENT)" : "A stopping criterion to stop when the projected gradient is stationary, i.e. 
in norm less than $(c.threshold).\n$(_MANOPT_INDENT)") * s end # # @@ -262,6 +266,7 @@ calls_with_kwargs(::typeof(projected_gradient_method!)) = (decorate_objective!, function initialize_solver!(amp::AbstractManoptProblem, pgms::ProjectedGradientMethodState) get_gradient!(amp, pgms.X, pgms.p) + initialize_stepsize!(pgms.stepsize) return pgms end diff --git a/src/solvers/proximal_bundle_method.jl b/src/solvers/proximal_bundle_method.jl index fc56f4a420..7032ab7d5e 100644 --- a/src/solvers/proximal_bundle_method.jl +++ b/src/solvers/proximal_bundle_method.jl @@ -53,116 +53,89 @@ $(_kwargs(:vector_transport_method)) $(_kwargs(:X)) to specify the type of tangent vector to use. """ mutable struct ProximalBundleMethodState{ - P, - T, - Pr, - St <: AbstractManoptSolverState, - R <: Real, - IR <: AbstractInverseRetractionMethod, - TR <: AbstractRetractionMethod, - TSC <: StoppingCriterion, - VT <: AbstractVectorTransportMethod, - } <: AbstractManoptSolverState where {P, T, Pr} - approx_errors::AbstractVector{R} - bundle::AbstractVector{Tuple{P, T}} + P, T, Pr, St <: AbstractManoptSolverState, R <: Real, + IR <: AbstractInverseRetractionMethod, TR <: AbstractRetractionMethod, + TSC <: StoppingCriterion, VTM <: AbstractVectorTransportMethod, + VR <: AbstractVector{<:R}, VPT <: AbstractVector{<:Tuple{<:P, <:T}}, VT <: AbstractVector{<:T}, + } <: AbstractManoptSolverState + approx_errors::VR + bundle::VPT c::R d::T inverse_retraction_method::IR - lin_errors::AbstractVector{R} + lin_errors::VR m::R p::P p_last_serious::P retraction_method::TR bundle_size::Integer stop::TSC - transported_subgradients::AbstractVector{T} - vector_transport_method::VT + transported_subgradients::VT + vector_transport_method::VTM X::T α::R α₀::R ε::R δ::R η::R - λ::AbstractVector{R} + λ::VR μ::R ν::R sub_problem::Pr sub_state::St function ProximalBundleMethodState( - M::TM, - sub_problem::Pr, - sub_state::St; + M::TM, sub_problem::Pr, sub_state::St; p::P = rand(M), - m::R = 0.0125, 
inverse_retraction_method::IR = default_inverse_retraction_method(M, typeof(p)), retraction_method::TR = default_retraction_method(M, typeof(p)), - stopping_criterion::SC = StopWhenLagrangeMultiplierLess(1.0e-8) | - StopAfterIteration(5000), + stopping_criterion::SC = StopWhenLagrangeMultiplierLess(1.0e-8) | StopAfterIteration(5000), bundle_size::Integer = 50, vector_transport_method::VT = default_vector_transport_method(M, typeof(p)), X::T = zero_vector(M, p), - α₀::R = 1.2, - ε::R = 1.0e-2, - δ::R = 1.0, - μ::R = 0.5, + m::Real = 0.0125, α₀::Real = 1.2, ε::Real = 1.0e-2, δ::Real = 1.0, μ::Real = 0.5, ) where { - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - R <: Real, - IR <: AbstractInverseRetractionMethod, - TM <: AbstractManifold, - TR <: AbstractRetractionMethod, - SC <: StoppingCriterion, - VT <: AbstractVectorTransportMethod, + P, T, TM <: AbstractManifold, + Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + IR <: AbstractInverseRetractionMethod, TR <: AbstractRetractionMethod, + SC <: StoppingCriterion, VT <: AbstractVectorTransportMethod, } - sub_state_storage = maybe_wrap_evaluation_type(sub_state) - # Initialize index set, bundle points, linearization errors, and stopping parameter - approx_errors = [zero(R)] - bundle = [(copy(M, p), copy(M, p, X))] - c = zero(R) - d = copy(M, p, X) - lin_errors = [zero(R)] - transported_subgradients = [copy(M, p, X)] - α = zero(R) - λ = [zero(R)] - η = zero(R) - ν = zero(R) - return new{P, T, Pr, St, R, IR, TR, SC, VT}( - approx_errors, - bundle, - c, - d, - inverse_retraction_method, - lin_errors, - m, - p, - copy(M, p), - retraction_method, - bundle_size, - stopping_criterion, - transported_subgradients, - vector_transport_method, - X, - α, - α₀, - ε, - δ, - η, - λ, - μ, - ν, - sub_problem, - sub_state, + R = promote_type(typeof(m), typeof(α₀), typeof(ε), typeof(δ), typeof(μ)) + m = convert(R, m); α₀ = convert(R, α₀); ε = convert(R, 
ε); δ = convert(R, δ); μ = convert(R, μ) + _sub_state = maybe_wrap_evaluation_type(sub_state) + return ProximalBundleMethodState( + sub_problem, _sub_state; + approx_errors = [zero(R)], bundle = [(copy(M, p), copy(M, p, X))], c = zero(R), d = copy(M, p, X), + inverse_retraction_method = inverse_retraction_method, lin_errors = [zero(R)], + m = m, p = p, p_last_serious = copy(M, p), retraction_method = retraction_method, + bundle_size = bundle_size, stopping_criterion = stopping_criterion, transported_subgradients = [copy(M, p, X)], + vector_transport_method = vector_transport_method, X = X, α = zero(R), α₀ = α₀, ε = ε, δ = δ, μ = μ, + λ = [zero(R)], η = zero(R), ν = zero(R) + ) + end + function ProximalBundleMethodState( + sub_problem::Pr, sub_state::St; + approx_errors::VR, bundle::VPT, c::R, d::T, inverse_retraction_method::IR, + lin_errors::VR, m::R, p::P, p_last_serious::P, retraction_method::TR, bundle_size::Integer, + stopping_criterion::TSC, transported_subgradients::VT, vector_transport_method::VTM, + X::T, α::R, α₀::R, ε::R, δ::R, η::R, λ::VR, μ::R, ν::R + ) where { + P, T, Pr, St <: AbstractManoptSolverState, R <: Real, + IR <: AbstractInverseRetractionMethod, TR <: AbstractRetractionMethod, + TSC <: StoppingCriterion, VTM <: AbstractVectorTransportMethod, + VR <: AbstractVector, VPT <: AbstractVector, VT <: AbstractVector, + } + return new{P, T, Pr, St, R, IR, TR, TSC, VTM, VR, VPT, VT}( + approx_errors, bundle, c, d, inverse_retraction_method, + lin_errors, m, p, p_last_serious, retraction_method, bundle_size, + stopping_criterion, transported_subgradients, vector_transport_method, + X, α, α₀, ε, δ, η, λ, μ, ν, sub_problem, sub_state, ) end end +ProximalBundleMethodState(M::AbstractManifold, st::AbstractManoptSolverState; kwargs...) 
= error("Proximal Bundle Method state can not be constructed based on $M and the sub state $st, a sub_problem is missing") function ProximalBundleMethodState( - M::AbstractManifold, - sub_problem = proximal_bundle_method_subsolver; - evaluation::E = AllocatingEvaluation(), - kwargs..., + M::AbstractManifold, sub_problem = proximal_bundle_method_subsolver; + evaluation::E = AllocatingEvaluation(), kwargs..., ) where {E <: AbstractEvaluationType} cfs = ClosedFormSubSolverState(; evaluation = evaluation) return ProximalBundleMethodState(M, sub_problem, cfs; kwargs...) @@ -176,15 +149,27 @@ end get_subgradient(pbms::ProximalBundleMethodState) = pbms.d function show(io::IO, pbms::ProximalBundleMethodState) + print(io, "ProximalBundleMethodState(", pbms.sub_problem, ", ", pbms.sub_state, "; ") + print(io, "approx_errors = ", pbms.approx_errors, ", bundle = ", pbms.bundle, ", c = ", pbms.c) + print(io, ", d = ", pbms.d, ", inverse_retraction_method = ", pbms.inverse_retraction_method) + print(io, ", lin_errors = ", pbms.lin_errors, ", m = ", pbms.m, ", p = ", pbms.p, ", p_last_serious = ", pbms.p_last_serious) + print(io, ", retraction_method = ", pbms.retraction_method, ", bundle_size = ", pbms.bundle_size) + print(io, ", stopping_criterion = ", pbms.stop, ", transported_subgradients = ", pbms.transported_subgradients) + print(io, ", vector_transport_method = ", pbms.vector_transport_method, ", X = ", pbms.X) + print(io, ", α = ", pbms.α, ", α₀ = ", pbms.α₀, ", ε = ", pbms.ε, ", δ = ", pbms.δ, ", η = ", pbms.η, ", λ = ", pbms.λ, ", μ = ", pbms.μ, ", ν = ", pbms.ν) + return print(io, ")") +end +function status_summary(pbms::ProximalBundleMethodState; context::Symbol = :default) + (context === :short) && return repr(pbms) i = get_count(pbms, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(pbms.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the proximal bundle method$(conv_inl)" Iter = (i > 0) ? 
"After $i iterations\n" : "" Conv = indicates_convergence(pbms.stop) ? "Yes" : "No" s = """ # Solver state for `Manopt.jl`s Proximal Bundle Method $Iter - ## Parameters - * bundle size: $(pbms.bundle_size) * inverse retraction: $(pbms.inverse_retraction_method) * descent test parameter: $(pbms.m) @@ -198,9 +183,9 @@ function show(io::IO, pbms::ProximalBundleMethodState) * proximal parameter μ: $(pbms.μ) ## Stopping criterion - $(status_summary(pbms.stop)) + $(_in_str(status_summary(pbms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end _doc_PBM_dk = raw""" diff --git a/src/solvers/proximal_gradient_method.jl b/src/solvers/proximal_gradient_method.jl index 03bb8ea1a9..201a6916c0 100644 --- a/src/solvers/proximal_gradient_method.jl +++ b/src/solvers/proximal_gradient_method.jl @@ -182,6 +182,7 @@ function initialize_solver!(amp::AbstractManoptProblem, pgms::ProximalGradientMe M = get_manifold(amp) zero_vector!(M, pgms.X, pgms.p) copyto!(M, pgms.a, pgms.p) + initialize_stepsize!(pgms.stepsize) return pgms end diff --git a/src/solvers/proximal_point.jl b/src/solvers/proximal_point.jl index c2909f176c..32fd17fa0a 100644 --- a/src/solvers/proximal_point.jl +++ b/src/solvers/proximal_point.jl @@ -30,33 +30,38 @@ $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(100)")) [`proximal_point`](@ref) """ -mutable struct ProximalPointState{P, Tλ, TStop <: StoppingCriterion} <: - AbstractGradientSolverState +mutable struct ProximalPointState{P, Tλ, TStop <: StoppingCriterion} <: AbstractGradientSolverState λ::Tλ p::P stop::TStop + function ProximalPointState( + M::AbstractManifold; + λ::F = k -> 1.0, p::P = rand(M), stopping_criterion::SC = StopAfterIteration(200), + ) where {P, F, SC <: StoppingCriterion} + return ProximalPointState(; λ = λ, p = p, stopping_criterion = stopping_criterion) + end + function ProximalPointState(; λ::F, p::P, stopping_criterion::SC) where {P, F, SC <: 
StoppingCriterion} + return new{P, F, SC}(λ, p, stopping_criterion) + end end -function ProximalPointState( - M::AbstractManifold; - λ::F = k -> 1.0, - p::P = rand(M), - stopping_criterion::SC = StopAfterIteration(200), - ) where {P, F, SC <: StoppingCriterion} - return ProximalPointState{P, F, SC}(λ, p, stopping_criterion) +function Base.show(io::IO, pps::ProximalPointState) + print(io, "ProximalPointState(; ") + return print(io, "λ = $(pps.λ), p = $(pps.p), stopping_criterion = $(pps.stop))") end -function show(io::IO, gds::ProximalPointState) - i = get_count(gds, :Iterations) +function status_summary(pps::ProximalPointState; context::Symbol = :default) + (context === :short) && return repr(pps) + i = get_count(pps, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(pps.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the proximal point algorithm$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" - Conv = indicates_convergence(gds.stop) ? "Yes" : "No" + Conv = indicates_convergence(pps.stop) ? 
"Yes" : "No" s = """ # Solver state for `Manopt.jl`s Proximal Point Method $Iter - ## Stopping criterion - - $(status_summary(gds.stop)) + $(_in_str(status_summary(pps.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end # # diff --git a/src/solvers/quasi_Newton.jl b/src/solvers/quasi_Newton.jl index a3c00c1b0a..388f0e03f8 100644 --- a/src/solvers/quasi_Newton.jl +++ b/src/solvers/quasi_Newton.jl @@ -44,15 +44,10 @@ $(_kwargs(:X; add_properties = [:as_Memory])) [`quasi_Newton`](@ref) """ mutable struct QuasiNewtonState{ - P, - T, - D <: AbstractQuasiNewtonDirectionUpdate, - SC <: StoppingCriterion, - S <: Stepsize, - RTR <: AbstractRetractionMethod, - VT <: AbstractVectorTransportMethod, - R, - TPrecon <: QuasiNewtonPreconditioner, + P, T, D <: AbstractQuasiNewtonDirectionUpdate, + SC <: StoppingCriterion, S <: Stepsize, + RTR <: AbstractRetractionMethod, VT <: AbstractVectorTransportMethod, + R, TPrecon <: QuasiNewtonPreconditioner, } <: AbstractGradientSolverState p::P p_old::P @@ -69,65 +64,55 @@ mutable struct QuasiNewtonState{ vector_transport_method::VT nondescent_direction_behavior::Symbol nondescent_direction_value::R -end -function QuasiNewtonState( - M::AbstractManifold; - p::P = rand(M), - initial_vector::T = zero_vector(M, p), # deprecated - X::T = initial_vector, - vector_transport_method::VTM = default_vector_transport_method(M, typeof(p)), - preconditioner::Union{QuasiNewtonPreconditioner, Nothing} = nothing, - initial_scale::Union{<:Real, Nothing} = isnothing(preconditioner) ? 
1.0 : nothing, - memory_size::Int = 20, - direction_update::D = QuasiNewtonLimitedMemoryDirectionUpdate( - M, - p, - InverseBFGS(), - memory_size; - vector_transport_method = vector_transport_method, - initial_scale = initial_scale, - ), - stopping_criterion::SC = StopAfterIteration(1000) | StopWhenGradientNormLess(1.0e-6), - retraction_method::RM = default_retraction_method(M, typeof(p)), - stepsize::S = default_stepsize( - M, - QuasiNewtonState; - retraction_method = retraction_method, - vector_transport_method = vector_transport_method, - ), - nondescent_direction_behavior::Symbol = :reinitialize_direction_update, - kwargs..., # collect but ignore rest to be more tolerant - ) where { - P, - T, - D <: AbstractQuasiNewtonDirectionUpdate, - SC <: StoppingCriterion, - S <: Stepsize, - RM <: AbstractRetractionMethod, - VTM <: AbstractVectorTransportMethod, - } - precon = if isnothing(preconditioner) - QuasiNewtonPreconditioner((M, p, X) -> X) - else - preconditioner + function QuasiNewtonState( + M::AbstractManifold; + p::P = rand(M), + initial_vector::T = zero_vector(M, p), # deprecated + X::T = initial_vector, + vector_transport_method::VTM = default_vector_transport_method(M, typeof(p)), + preconditioner::Union{QuasiNewtonPreconditioner, Nothing} = nothing, + initial_scale::Union{<:Real, Nothing} = isnothing(preconditioner) ? 
1.0 : nothing, + memory_size::Int = 20, + direction_update::D = QuasiNewtonLimitedMemoryDirectionUpdate( + M, p, InverseBFGS(), memory_size; + vector_transport_method = vector_transport_method, initial_scale = initial_scale, + ), + stopping_criterion::SC = StopAfterIteration(1000) | StopWhenGradientNormLess(1.0e-6), + retraction_method::RM = default_retraction_method(M, typeof(p)), + stepsize::S = default_stepsize( + M, QuasiNewtonState; + retraction_method = retraction_method, vector_transport_method = vector_transport_method, + ), + nondescent_direction_behavior::Symbol = :reinitialize_direction_update, + kwargs..., + ) where { + P, T, D <: AbstractQuasiNewtonDirectionUpdate, SC <: StoppingCriterion, + S <: Stepsize, RM <: AbstractRetractionMethod, VTM <: AbstractVectorTransportMethod, + } + precon = isnothing(preconditioner) ? QuasiNewtonPreconditioner((M, p, X) -> X) : preconditioner + return QuasiNewtonState(; + p = p, p_old = copy(M, p), η = copy(M, p, X), X = X, sk = copy(M, p, X), yk = copy(M, p, X), + direction_update = direction_update, preconditioner = precon, + retraction_method = retraction_method, stepsize = stepsize, stopping_criterion = stopping_criterion, + X_old = copy(M, p, X), vector_transport_method = vector_transport_method, + nondescent_direction_behavior = nondescent_direction_behavior, nondescent_direction_value = 1.0 + ) + end + function QuasiNewtonState(; + p::P, p_old::P, η::T, X::T, sk::T, yk::T, direction_update::D, preconditioner::TP, + retraction_method::RTR, stepsize::S, stopping_criterion::SC, X_old::T, vector_transport_method::VTM, + nondescent_direction_behavior::Symbol, nondescent_direction_value::R + ) where { + P, T, D <: AbstractQuasiNewtonDirectionUpdate, + SC <: StoppingCriterion, S <: Stepsize, + RTR <: AbstractRetractionMethod, VTM <: AbstractVectorTransportMethod, R, TP <: QuasiNewtonPreconditioner, + } + return new{P, T, D, SC, S, RTR, VTM, R, TP}( + p, p_old, η, X, sk, yk, + direction_update, preconditioner, 
retraction_method, stepsize, stopping_criterion, + X_old, vector_transport_method, nondescent_direction_behavior, nondescent_direction_value, + ) end - return QuasiNewtonState{P, T, D, SC, S, RM, VTM, Float64, typeof(precon)}( - p, - copy(M, p), - copy(M, p, X), - X, - copy(M, p, X), - copy(M, p, X), - direction_update, - precon, - retraction_method, - stepsize, - stopping_criterion, - copy(M, p, X), - vector_transport_method, - nondescent_direction_behavior, - 1.0, - ) end function get_message(qns::QuasiNewtonState) # collect messages from @@ -148,8 +133,20 @@ function get_message(qns::QuasiNewtonState) d = "$(length(d) > 0 ? "\n" : "")$(msg3)" return d end -function show(io::IO, qns::QuasiNewtonState) +function Base.show(io::IO, qns::QuasiNewtonState) + print(io, "QuasiNewtonState(; ") + print(io, "direction_update = ", qns.direction_update, ", p = ", qns.p, ", p_old = ", qns.p_old) + print(io, ", η = ", qns.η, ", X = ", qns.X, ", sk = ", qns.sk, ", yk = ", qns.yk, ", ") + print(io, "nondescent_direction_behavior = ", qns.nondescent_direction_behavior, ", nondescent_direction_value = ", qns.nondescent_direction_value, ", ") + print(io, "preconditioner = ", qns.preconditioner, ", retraction_method = ", qns.retraction_method, ", stepsize = ", qns.stepsize, ", ") + print(io, "stopping_criterion = ", qns.stop, ", X_old = ", qns.X_old, ", vector_transport_method = ", qns.vector_transport_method) + return print(io, ")") +end +function status_summary(qns::QuasiNewtonState; context::Symbol = :default) + (context === :short) && return repr(qns) i = get_count(qns, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(qns.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the quasi Newton solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(qns.stop) ?
"Yes" : "No" s = """ @@ -161,13 +158,12 @@ function show(io::IO, qns::QuasiNewtonState) * vector transport method: $(qns.vector_transport_method) ## Stepsize - $(qns.stepsize) + $(_in_str(status_summary(qns.stepsize; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(qns.stop)) + $(_in_str(status_summary(qns.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end get_iterate(qns::QuasiNewtonState) = qns.p function set_iterate!(qns::QuasiNewtonState, M, p) @@ -206,10 +202,10 @@ $(_args([:M, :f, :grad_f, :p])) # Keyword arguments -* `basis=`[`DefaultOrthonormalBasis`](@extref ManifoldsBase.DefaultOrthonormalBasis)`()`: +* `basis::AbstractBasis=`[`DefaultOrthonormalBasis`](@extref ManifoldsBase.DefaultOrthonormalBasis)`()`: basis to use within each of the the tangent spaces to represent the Hessian (inverse) for the cases where it is stored in full (matrix) form. -* `cautious_update=false`: +* `cautious_update::Bool=false`: whether or not to use the [`QuasiNewtonCautiousDirectionUpdate`](@ref) which wraps the `direction_update`. * `cautious_function=(x) -> x * 1e-4`: @@ -225,7 +221,7 @@ $(_kwargs(:evaluation; add_properties = [:GradientExample])) See also `initial_scale`. * `initial_scale=1.0`: scale initial `s` to use in with $(_doc_QN_init_scaling) in the computation of the limited memory approach. see also `initial_operator` -* `memory_size=20`: limited memory, number of ``s_k, y_k`` to store. +* `memory_size::Int=min(manifold_dimension(M), 20)`: limited memory, number of ``s_k, y_k`` to store. Set to a negative value to use a full memory (matrix) representation * `nondescent_direction_behavior=:reinitialize_direction_update`: specify how non-descent direction is handled. 
This can be @@ -327,12 +323,15 @@ function quasi_Newton!( ), stopping_criterion::StoppingCriterion = StopAfterIteration(max(1000, memory_size)) | StopWhenGradientNormLess(1.0e-6), + nonpositive_curvature_behavior::Symbol = :ignore, + sy_tol::Real = 1.0e-8, kwargs..., ) where { E <: AbstractEvaluationType, O <: Union{AbstractManifoldFirstOrderObjective{E}, AbstractDecoratedManifoldObjective{E}}, } keywords_accepted(quasi_Newton!; kwargs...) + local local_dir_upd # COV_EXCL_LINE if memory_size >= 0 local_dir_upd = QuasiNewtonLimitedMemoryDirectionUpdate( M, @@ -342,7 +341,12 @@ function quasi_Newton!( initial_scale = initial_scale, (project!) = (project!), vector_transport_method = vector_transport_method, + nonpositive_curvature_behavior = nonpositive_curvature_behavior, + sy_tol = sy_tol, ) + if has_anisotropic_max_stepsize(M) + local_dir_upd = QuasiNewtonLimitedMemoryBoxDirectionUpdate(local_dir_upd) + end else local_dir_upd = QuasiNewtonMatrixDirectionUpdate( M, @@ -381,21 +385,32 @@ function quasi_Newton!( end calls_with_kwargs(::typeof(quasi_Newton!)) = (decorate_objective!, decorate_state!) 
+function _get_max_stepsize(M::AbstractManifold, qns::QuasiNewtonState) + current_max_stepsize = get_parameter(qns.direction_update, Val(:max_stepsize)) + if !isnothing(current_max_stepsize) && !isfinite(current_max_stepsize) + return max_stepsize(M, qns.p) / norm(qns.η) + else + return current_max_stepsize + end +end + function initialize_solver!(amp::AbstractManoptProblem, qns::QuasiNewtonState) M = get_manifold(amp) get_gradient!(amp, qns.X, qns.p) copyto!(M, qns.sk, qns.p, qns.X) copyto!(M, qns.yk, qns.p, qns.X) initialize_update!(qns.direction_update) + initialize_stepsize!(qns.stepsize) return qns end function step_solver!(mp::AbstractManoptProblem, qns::QuasiNewtonState, k) M = get_manifold(mp) - get_gradient!(mp, qns.X, qns.p) + # qns.X should be the correct gradient at qns.p from initialization or the previous step qns.direction_update(qns.η, mp, qns) + current_max_stepsize = _get_max_stepsize(M, qns) if !(qns.nondescent_direction_behavior === :ignore) qns.nondescent_direction_value = real(inner(M, qns.p, qns.η, qns.X)) - if qns.nondescent_direction_value > 0 + if qns.nondescent_direction_value >= 0 if qns.nondescent_direction_behavior === :step_towards_negative_gradient || qns.nondescent_direction_behavior === :reinitialize_direction_update copyto!(M, qns.η, qns.X) @@ -404,9 +419,20 @@ function step_solver!(mp::AbstractManoptProblem, qns::QuasiNewtonState, k) if qns.nondescent_direction_behavior === :reinitialize_direction_update initialize_update!(qns.direction_update) end + # update direction after reinitialization to get a valid one + if qns.nondescent_direction_behavior === :step_towards_negative_gradient || + qns.nondescent_direction_behavior === :reinitialize_direction_update + qns.direction_update(qns.η, mp, qns) + current_max_stepsize = _get_max_stepsize(M, qns) + end end end - α = qns.stepsize(mp, qns, k, qns.η; gradient = qns.X) + local α # COV_EXCL_LINE + if isnothing(current_max_stepsize) + α = qns.stepsize(mp, qns, k, qns.η; gradient = qns.X) 
+ else + α = qns.stepsize(mp, qns, k, qns.η; gradient = qns.X, stop_when_stepsize_exceeds = current_max_stepsize) + end copyto!(M, qns.p_old, get_iterate(qns)) ManifoldsBase.retract_fused!(M, qns.p, qns.p, qns.η, α, qns.retraction_method) qns.η .*= α @@ -707,6 +733,52 @@ function update_hessian!( return d end +function fill_rho_i!(M::AbstractManifold, p, d::QuasiNewtonLimitedMemoryDirectionUpdate, i::Int) + v = inner(M, p, d.memory_s[i], d.memory_y[i]) + if d.nonpositive_curvature_behavior === :ignore && iszero(v) + d.ρ[i] = zero(eltype(d.ρ)) + if length(d.message) > 0 + d.message = replace(d.message, " i=" => " i=$i,") + d.message = replace(d.message, "summand in" => "summands in") + else + d.message = "The inner products ⟨s_i,y_i⟩ ≈ 0, i=$i, ignoring summand in approximation." + end + elseif d.nonpositive_curvature_behavior === :byrd && v <= d.sy_tol * norm(M, p, d.memory_y[i]) + d.ρ[i] = zero(eltype(d.ρ)) + if length(d.message) > 0 + d.message = replace(d.message, " i=" => " i=$i,") + d.message = replace(d.message, "summand in" => "summands in") + else + d.message = "The inner products ⟨s_i,y_i⟩ <= $(d.sy_tol * norm(M, p, d.memory_y[i])), i=$i, removing summand from approximation." + end + else + d.ρ[i] = 1 / v + end + return d +end + +function _drop_zero_rho_vectors!(d::QuasiNewtonLimitedMemoryDirectionUpdate{U}) where {U <: InverseBFGS} + T = eltype(d.memory_s) + memory_size = capacity(d.memory_s) + new_scb = CircularBuffer{T}(memory_size) + new_ycb = CircularBuffer{T}(memory_size) + new_ρ = similar(d.ρ) + fill!(new_ρ, 0) + j = 1 + for i in 1:length(d.memory_s) + if !iszero(d.ρ[i]) + push!(new_scb, d.memory_s[i]) + push!(new_ycb, d.memory_y[i]) + new_ρ[j] = d.ρ[i] + j += 1 + end + end + d.memory_s = new_scb + d.memory_y = new_ycb + d.ρ = new_ρ + return d +end + # Limited-memory update function update_hessian!( d::QuasiNewtonLimitedMemoryDirectionUpdate{U}, @@ -720,6 +792,7 @@ function update_hessian!( start = length(d.memory_s) == capacity(d.memory_s) ? 
2 : 1 M = get_manifold(mp) p = get_iterate(st) + reforming_required = false for i in start:length(d.memory_s) # transport all stored tangent vectors in the tangent space of the next iterate vector_transport_to!( @@ -728,6 +801,28 @@ function update_hessian!( vector_transport_to!( M, d.memory_y[i], p_old, d.memory_y[i], p, d.vector_transport_method ) + + # what if division by zero happened here, setting to zero ignores this in the next step + # pre-compute in case inner is expensive + fill_rho_i!(M, p, d, i) + if d.nonpositive_curvature_behavior === :byrd && iszero(d.ρ[i]) + reforming_required = true + end + end + + if reforming_required + # we need to move first vectors in memory too because they most likely won't be + # overwritten by new pairs + if start == 2 + vector_transport_to!( + M, d.memory_s[1], p_old, d.memory_s[1], p, d.vector_transport_method + ) + vector_transport_to!( + M, d.memory_y[1], p_old, d.memory_y[1], p, d.vector_transport_method + ) + fill_rho_i!(M, p, d, 1) + end + _drop_zero_rho_vectors!(d) end # add newest @@ -736,6 +831,7 @@ function update_hessian!( old_sk = popfirst!(d.memory_s) copyto!(M, old_sk, st.sk) push!(d.memory_s, old_sk) + circshift!(d.ρ, -1) else push!(d.memory_s, copy(M, st.sk)) end @@ -746,6 +842,9 @@ function update_hessian!( else push!(d.memory_y, copy(M, st.yk)) end + + fill_rho_i!(M, p, d, length(d.memory_s)) + return d end @@ -774,7 +873,27 @@ function update_hessian!( vector_transport_to!( M, d.update.memory_y[i], p_old, d.update.memory_y[i], p, d.update.vector_transport_method, ) + fill_rho_i!(M, p, d.update, i) end end return d end + +function get_cost( + mp::AbstractManoptProblem, s::QuasiNewtonState{ + P, T, + <:AbstractQuasiNewtonDirectionUpdate, + <:StoppingCriterion, + <:HagerZhangLinesearchStepsize, + } + ) where {P, T} + + hzls = s.stepsize + if hzls.last_evaluation_index === 0 + # if no evaluation was performed, we need to compute the cost + return get_cost(mp, s.p) + else + # we can reuse the stored function 
value from the linesearch + return hzls.triples[hzls.last_evaluation_index].f + end +end diff --git a/src/solvers/solver.jl b/src/solvers/solver.jl index bcaab743aa..ec64db0a65 100644 --- a/src/solvers/solver.jl +++ b/src/solvers/solver.jl @@ -124,7 +124,7 @@ function decorate_objective!( # 3) _then_ cache, # count should not be affected by 1) but cache should be on manifold not embedding # => only count _after_ cache misses - # and always last wrapper: `ReturnObjective`. + # and always last wrapper: `ReturnManifoldObjective`. deco_o = o if objective_type ∈ [:Embedding, :Euclidean] deco_o = EmbeddedManifoldObjective(o, _embedded_p, _embedded_X) @@ -189,3 +189,20 @@ end function stop_solver!(p::AbstractManoptProblem, s::ReturnSolverState, k) return stop_solver!(p, s.state, k) end + +""" + get_cost(p::AbstractManoptProblem, s::AbstractManoptSolverState) + +Get cost at the current iterate of the solver state `s` for the problem `p`. +The method may be implemented by particular solvers if they store the cost at the current +iterate in the state, but by default it is obtained by calling `get_cost(p, get_iterate(s))`. +""" +function get_cost(p::AbstractManoptProblem, s::AbstractManoptSolverState) + return _get_cost(p, s, dispatch_state_decorator(s)) +end +function _get_cost(p::AbstractManoptProblem, s::AbstractManoptSolverState, ::Val{false}) + return get_cost(p, get_iterate(s)) +end +function _get_cost(p::AbstractManoptProblem, s::AbstractManoptSolverState, ::Val{true}) + return get_cost(p, s.state) +end diff --git a/src/solvers/stochastic_gradient_descent.jl b/src/solvers/stochastic_gradient_descent.jl index 6f92708dd6..9448b3d4a4 100644 --- a/src/solvers/stochastic_gradient_descent.jl +++ b/src/solvers/stochastic_gradient_descent.jl @@ -23,7 +23,7 @@ Create a `StochasticGradientDescentState` with start point `p`. 
# Keyword arguments -* `direction=`[`StochasticGradientRule`](@ref)`(M, $(_link(:zero_vector))) +* `direction=`[`StochasticGradientRule`](@ref)`(M, `$(_link(:zero_vector))`)` * `order_type=:RandomOrder`` * `order=Int[]`: specify how to store the order of indices for the next epoche $(_kwargs(:retraction_method)) @@ -34,22 +34,27 @@ $(_kwargs(:X; add_properties = [:as_Memory])) """ mutable struct StochasticGradientDescentState{ - TX, - TV, - D <: DirectionUpdateRule, - TStop <: StoppingCriterion, - TStep <: Stepsize, - RM <: AbstractRetractionMethod, + P, T, D <: DirectionUpdateRule, SC <: StoppingCriterion, S <: Stepsize, RM <: AbstractRetractionMethod, V <: Vector{<:Int}, } <: AbstractGradientSolverState - p::TX - X::TV + p::P + X::T direction::D - stop::TStop - stepsize::TStep + stop::SC + stepsize::S order_type::Symbol - order::Vector{<:Int} + order::V retraction_method::RM k::Int # current iterate + function StochasticGradientDescentState(; + direction::D, p::P, X::T, stopping_criterion::SC, stepsize::S, + order_type::Symbol, order::V, retraction_method::RM, k = 0 + ) where { + P, T, D <: DirectionUpdateRule, SC <: StoppingCriterion, S <: Stepsize, RM <: AbstractRetractionMethod, V <: Vector{<:Int}, + } + return new{P, T, D, SC, S, RM, V}( + p, X, direction, stopping_criterion, stepsize, order_type, order, retraction_method, k + ) + end end function StochasticGradientDescentState( @@ -63,44 +68,47 @@ function StochasticGradientDescentState( stopping_criterion::SC = StopAfterIteration(1000), stepsize::S = default_stepsize(M, StochasticGradientDescentState), ) where { - P, - T, - D <: DirectionUpdateRule, - RM <: AbstractRetractionMethod, - SC <: StoppingCriterion, - S <: Stepsize, + P, T, D <: DirectionUpdateRule, RM <: AbstractRetractionMethod, SC <: StoppingCriterion, S <: Stepsize, } - return StochasticGradientDescentState{P, T, D, SC, S, RM}( - p, - X, - direction, - stopping_criterion, - stepsize, - order_type, - order, - retraction_method, - 0, + return 
StochasticGradientDescentState(; + p = p, X = X, direction = direction, stopping_criterion = stopping_criterion, + stepsize = stepsize, order_type = order_type, order = order, retraction_method = retraction_method, k = 0, ) end -function show(io::IO, sgds::StochasticGradientDescentState) +function Base.show(io::IO, sgds::StochasticGradientDescentState) + print(io, "StochasticGradientDescentState(; ") + print(io, "direction = "); print(io, sgds.direction); print(io, ", ") + print(io, "order = "); print(io, sgds.order); print(io, ", ") + print(io, "order_type = :$(sgds.order_type), ") + print(io, "p = $(sgds.p), ") + print(io, "retraction_method = "); print(io, sgds.retraction_method); print(io, ", ") + print(io, "stepsize = "); print(io, sgds.stepsize); print(io, ", ") + print(io, "stopping_criterion = "); print(io, status_summary(sgds.stop; context = :short)); print(io, ", ") + print(io, "X = "); print(io, sgds.X) + return print(io, ")") +end +function status_summary(sgds::StochasticGradientDescentState; context::Symbol = :default) + (context === :short) && return repr(sgds) i = get_count(sgds, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(sgds.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the stochastic gradient descent algorithm$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(sgds.stop) ?
"Yes" : "No" s = """ # Solver state for `Manopt.jl`s Stochastic Gradient Descent $Iter ## Parameters + * direction: $(status_summary(sgds.direction; context = :inline)) * order: $(sgds.order_type) * retraction method: $(sgds.retraction_method) ## Stepsize - $(sgds.stepsize) + $(_in_str(status_summary(sgds.stepsize; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(sgds.stop)) + $(_in_str(status_summary(sgds.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end """ StochasticGradientRule<: AbstractGradientGroupDirectionRule @@ -133,7 +141,6 @@ function StochasticGradientRule( ) where {T} return StochasticGradientRule{T}(X) end - function (sg::StochasticGradientRule)( apm::AbstractManoptProblem, sgds::StochasticGradientDescentState, k ) @@ -143,7 +150,13 @@ function (sg::StochasticGradientRule)( j = sgds.order_type == :Random ? rand(1:length(sgds.order)) : sgds.order[sgds.k] return sgds.stepsize(apm, sgds, k), get_gradient!(apm, sg.X, sgds.p, j) end - +function Base.show(io::IO, sg::StochasticGradientRule) + return print(io, "StochasticGradientRule($(sg.X)") +end +function status_summary(sg::StochasticGradientRule; context::Symbol = :default) + (context === :short) && return repr(sg) + return "A stochastic gradient processor" +end @doc """ StochasticGradient(; kwargs...) StochasticGradient(M::AbstractManifold; kwargs...) @@ -190,7 +203,8 @@ then using the `cost=` keyword does not have any effect since if so, the cost is # Keyword arguments * `cost=missing`: you can provide a cost function for example to track the function value -* `direction=`[`StochasticGradient`](@ref)`($(_link(:zero_vector))) +* `direction=`[`StochasticGradient`](@ref)`(`$(_link(:zero_vector))`)` add a post-processor to + the direction obtained from evaluating the sub-gradient. 
$(_kwargs(:evaluation)) * `evaluation_order=:Random`: specify whether to use a randomly permuted sequence (`:FixedRandom`:, a per cycle permuted sequence (`:Linear`) or the default `:Random` one. @@ -212,12 +226,8 @@ function stochastic_gradient_descent(M::AbstractManifold, grad_f; kwargs...) return stochastic_gradient_descent(M, grad_f, rand(M); kwargs...) end function stochastic_gradient_descent( - M::AbstractManifold, - grad_f, - p; - cost = Missing(), - evaluation::AbstractEvaluationType = AllocatingEvaluation(), - kwargs..., + M::AbstractManifold, grad_f, p; + cost = Missing(), evaluation::AbstractEvaluationType = AllocatingEvaluation(), kwargs..., ) p_ = _ensure_mutating_variable(p) cost_ = _ensure_mutating_cost(cost, p) @@ -247,25 +257,18 @@ calls_with_kwargs(::typeof(stochastic_gradient_descent)) = (stochastic_gradient_ @doc "$(_doc_SGD)" stochastic_gradient_descent!(::AbstractManifold, args...; kwargs...) function stochastic_gradient_descent!( - M::AbstractManifold, - grad_f, - p; - cost = Missing(), - evaluation::AbstractEvaluationType = AllocatingEvaluation(), - kwargs..., + M::AbstractManifold, grad_f, p; + cost = Missing(), evaluation::AbstractEvaluationType = AllocatingEvaluation(), kwargs..., ) msgo = ManifoldStochasticGradientObjective(grad_f; cost = cost, evaluation = evaluation) return stochastic_gradient_descent!(M, msgo, p; evaluation = evaluation, kwargs...) 
end function stochastic_gradient_descent!( - M::AbstractManifold, - msgo::O, - p; + M::AbstractManifold, msgo::O, p; direction::Union{<:DirectionUpdateRule, ManifoldDefaultsFactory} = StochasticGradient(; p = p ), - stopping_criterion::StoppingCriterion = StopAfterIteration(10000) | - StopWhenGradientNormLess(1.0e-9), + stopping_criterion::StoppingCriterion = StopAfterIteration(10000) | StopWhenGradientNormLess(1.0e-9), stepsize::Union{Stepsize, ManifoldDefaultsFactory} = default_stepsize( M, StochasticGradientDescentState ), @@ -278,15 +281,10 @@ function stochastic_gradient_descent!( dmsgo = decorate_objective!(M, msgo; kwargs...) mp = DefaultManoptProblem(M, dmsgo) sgds = StochasticGradientDescentState( - M; - p = p, - X = zero_vector(M, p), - direction = _produce_type(direction, M, p), - stopping_criterion = stopping_criterion, - stepsize = _produce_type(stepsize, M, p), - order_type = order_type, - order = order, - retraction_method = retraction_method, + M; p = p, X = zero_vector(M, p), + direction = _produce_type(direction, M, p), stepsize = _produce_type(stepsize, M, p), + order_type = order_type, order = order, + stopping_criterion = stopping_criterion, retraction_method = retraction_method, ) dsgds = decorate_state!(sgds; kwargs...) solve!(mp, dsgds) @@ -297,6 +295,7 @@ calls_with_kwargs(::typeof(stochastic_gradient_descent!)) = (decorate_objective! 
function initialize_solver!(::AbstractManoptProblem, s::StochasticGradientDescentState) s.k = 1 (s.order_type == :FixedRandom) && (shuffle!(s.order)) + initialize_stepsize!(s.stepsize) return s end function step_solver!(mp::AbstractManoptProblem, s::StochasticGradientDescentState, iter) diff --git a/src/solvers/subgradient.jl b/src/solvers/subgradient.jl index 3ba78c939e..5e4198daea 100644 --- a/src/solvers/subgradient.jl +++ b/src/solvers/subgradient.jl @@ -26,36 +26,45 @@ $(_kwargs(:stopping_criterion; default = "`[`StopAfterIteration`](@ref)`(5000)") $(_kwargs(:X; add_properties = [:as_Memory])) """ mutable struct SubGradientMethodState{ - TR <: AbstractRetractionMethod, TS <: Stepsize, TSC <: StoppingCriterion, P, T, + P, T, RM <: AbstractRetractionMethod, S <: Stepsize, SC <: StoppingCriterion, } <: AbstractManoptSolverState where {P, T} p::P p_star::P - retraction_method::TR - stepsize::TS - stop::TSC + retraction_method::RM + stepsize::S + stop::SC X::T + function SubGradientMethodState(; + p::P, p_star::P, retraction_method::RM, stepsize::S, stopping_criterion::SC, X::T + ) where {P, T, RM <: AbstractRetractionMethod, S <: Stepsize, SC <: StoppingCriterion} + return new{P, T, RM, S, SC}(p, p_star, retraction_method, stepsize, stopping_criterion, X) + end function SubGradientMethodState( - M::TM; - p::P = rand(M), + M::TM; p::P = rand(M), stopping_criterion::SC = StopAfterIteration(5000), stepsize::S = default_stepsize(M, SubGradientMethodState), X::T = zero_vector(M, p), retraction_method::TR = default_retraction_method(M, typeof(p)), ) where { - TM <: AbstractManifold, - P, - T, - SC <: StoppingCriterion, - S <: Stepsize, - TR <: AbstractRetractionMethod, + TM <: AbstractManifold, P, T, SC <: StoppingCriterion, S <: Stepsize, TR <: AbstractRetractionMethod, } - return new{TR, S, SC, P, T}( - p, copy(M, p), retraction_method, stepsize, stopping_criterion, X + return SubGradientMethodState(; + p = p, p_star = copy(M, p), retraction_method = 
retraction_method, stepsize = stepsize, + stopping_criterion = stopping_criterion, X = X ) end end -function show(io::IO, sgms::SubGradientMethodState) +function Base.show(io::IO, sgms::SubGradientMethodState) + print(io, "SubGradientMethodState(; p = ", sgms.p, ", p_star = ", sgms.p_star) + print(io, ", retraction_method = ", sgms.retraction_method) + print(io, ", stepsize = ", sgms.stepsize, ", stopping_criterion = ", sgms.stop, ", X = ", sgms.X) + return print(io, ")") +end +function status_summary(sgms::SubGradientMethodState; context::Symbol = :default) + (context === :short) && return repr(sgms) i = get_count(sgms, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(sgms.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the subgradient method$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(sgms.stop) ? "Yes" : "No" s = """ @@ -65,13 +74,12 @@ function show(io::IO, sgms::SubGradientMethodState) * retraction method: $(sgms.retraction_method) ## Stepsize - $(sgms.stepsize) + $(_in_str(status_summary(sgms.stepsize; context = context); indent = 0, headers = 1)) ## Stopping criterion - - $(status_summary(sgms.stop)) + $(_in_str(status_summary(sgms.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end get_iterate(sgs::SubGradientMethodState) = sgs.p get_subgradient(sgs::SubGradientMethodState) = sgs.X @@ -193,6 +201,7 @@ function initialize_solver!(mp::AbstractManoptProblem, sgs::SubGradientMethodSta M = get_manifold(mp) copyto!(M, sgs.p_star, sgs.p) sgs.X = zero_vector(M, sgs.p) + initialize_stepsize!(sgs.stepsize) return sgs end function step_solver!(mp::AbstractManoptProblem, sgs::SubGradientMethodState, k) diff --git a/src/solvers/truncated_conjugate_gradient_descent.jl b/src/solvers/truncated_conjugate_gradient_descent.jl index 7a6bfcff13..3a2bd409b8 100644 ---
a/src/solvers/truncated_conjugate_gradient_descent.jl +++ b/src/solvers/truncated_conjugate_gradient_descent.jl @@ -91,7 +91,15 @@ mutable struct TruncatedConjugateGradientState{T, R <: Real, SC <: StoppingCrite StopWhenModelIncreased(), kwargs..., ) where {T, R <: Real, F} - tcgs = new{T, R, typeof(stopping_criterion), F}() + return TruncatedConjugateGradientState(; + X = X, trust_region_radius = trust_region_radius, randomize = randomize, + (project!) = project!, stopping_criterion = stopping_criterion, + ) + end + function TruncatedConjugateGradientState(; + X::T, trust_region_radius::R, randomize::Bool, project!::F, stopping_criterion::SC, + ) where {T, R <: Real, F, SC <: StoppingCriterion} + tcgs = new{T, R, SC, F}() tcgs.stop = stopping_criterion tcgs.Y = X tcgs.trust_region_radius = trust_region_radius @@ -101,11 +109,22 @@ mutable struct TruncatedConjugateGradientState{T, R <: Real, SC <: StoppingCrite return tcgs end end -function show(io::IO, tcgs::TruncatedConjugateGradientState) +function Base.show(io::IO, tcgs::TruncatedConjugateGradientState) + print(io, "TruncatedConjugateGradientState(; ") + print(io, "(project!) = $(tcgs.project!), ") + print(io, "randomize = $(tcgs.randomize), ") + print(io, "stopping_criterion = $(tcgs.stop), ") + print(io, "trust_region_radius = $(tcgs.trust_region_radius), ") + return print(io, "X = $(tcgs.Y))") +end +function status_summary(tcgs::TruncatedConjugateGradientState; context::Symbol = :default) + (context === :short) && return repr(tcgs) i = get_count(tcgs, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(tcgs.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the truncated conjugate gradient descent$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(tcgs.stop) ?
"Yes" : "No" - s = """ + return """ # Solver state for `Manopt.jl`s Truncated Conjugate Gradient Descent $Iter ## Parameters @@ -113,11 +132,10 @@ function show(io::IO, tcgs::TruncatedConjugateGradientState) * trust region radius: $(tcgs.trust_region_radius) ## Stopping criterion - - $(status_summary(tcgs.stop)) + $(_in_str(status_summary(tcgs.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) end + function set_parameter!(tcgs::TruncatedConjugateGradientState, ::Val{:Iterate}, Y) return tcgs.Y = Y end @@ -190,16 +208,15 @@ function get_reason(c::StopWhenResidualIsReducedByFactorOrPower) end return "" end -function status_summary(c::StopWhenResidualIsReducedByFactorOrPower) +function status_summary(c::StopWhenResidualIsReducedByFactorOrPower; context::Symbol = :default) + (context === :short) && (return repr(c)) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "Residual reduced by factor $(c.κ) or power $(c.θ):\t$s" + (context === :inline) && (return "Residual reduced by factor $(c.κ) or power $(c.θ):$(_MANOPT_INDENT)$s") + return "A stopping criterion used within tCG to check whether the residual is reduced by factor $(c.κ) or power 1+$(c.θ)\n$(_MANOPT_INDENT)$s" end -function show(io::IO, c::StopWhenResidualIsReducedByFactorOrPower) - return print( - io, - "StopWhenResidualIsReducedByFactorOrPower($(c.κ), $(c.θ))\n $(status_summary(c))", - ) +function Base.show(io::IO, c::StopWhenResidualIsReducedByFactorOrPower) + return print(io, "StopWhenResidualIsReducedByFactorOrPower($(c.κ), $(c.θ))") end @doc """ @@ -277,13 +294,15 @@ function get_reason(c::StopWhenTrustRegionIsExceeded) end return "" end -function status_summary(c::StopWhenTrustRegionIsExceeded) +function status_summary(c::StopWhenTrustRegionIsExceeded; context::Symbol = :default) + (context === :short) && (return repr(c)) has_stopped = (c.at_iteration >= 0) s = has_stopped ? 
"reached" : "not reached" - return "Trust region exceeded:\t$s" + (context === :inline) && (return "Trust region exceeded:$(_MANOPT_INDENT)$s") + return "A stopping criterion to stop when the trust region radius (0.0) is exceeded.\n$(_MANOPT_INDENT)$s" end -function show(io::IO, c::StopWhenTrustRegionIsExceeded) - return print(io, "StopWhenTrustRegionIsExceeded()\n $(status_summary(c))") +function Base.show(io::IO, ::StopWhenTrustRegionIsExceeded) + return print(io, "StopWhenTrustRegionIsExceeded()") end @doc """ @@ -334,13 +353,15 @@ function get_reason(c::StopWhenCurvatureIsNegative) end return "" end -function status_summary(c::StopWhenCurvatureIsNegative) +function status_summary(c::StopWhenCurvatureIsNegative; context::Symbol = :default) + (context === :short) && (return repr(c)) has_stopped = (c.at_iteration >= 0) s = has_stopped ? "reached" : "not reached" - return "Curvature is negative:\t$s" + (context === :inline) && (return "Curvature is negative:$(_MANOPT_INDENT)$s") + return "A stopping criterion to stop when the curvature is negative\n$(_MANOPT_INDENT)$s" end -function show(io::IO, c::StopWhenCurvatureIsNegative) - return print(io, "StopWhenCurvatureIsNegative()\n $(status_summary(c))") +function Base.show(io::IO, ::StopWhenCurvatureIsNegative) + return print(io, "StopWhenCurvatureIsNegative()") end @doc """ @@ -351,7 +372,7 @@ A functor for testing if the curvature of the model value increased. # Fields $(_fields(:at_iteration)) -* `model_value`stre the last model value +* `model_value` store the last model value * `inc_model_value` store the model value that increased # Constructor @@ -390,13 +411,15 @@ function get_reason(c::StopWhenModelIncreased) end return "" end -function status_summary(c::StopWhenModelIncreased) +function status_summary(c::StopWhenModelIncreased; context::Symbol = :default) + (context === :short) && (return repr(c)) has_stopped = (c.at_iteration >= 0) s = has_stopped ?
"reached" : "not reached" - return "Model Increased:\t$s" + (context === :inline) && (return "Model Increased:$(_MANOPT_INDENT)$s") + return "A stopping criterion to indicate when the model increased.\n$(_MANOPT_INDENT)$s" end -function show(io::IO, c::StopWhenModelIncreased) - return print(io, "StopWhenModelIncreased()\n $(status_summary(c))") +function Base.show(io::IO, c::StopWhenModelIncreased) + return print(io, "StopWhenModelIncreased()") end _doc_TCG_subproblem = raw""" @@ -423,8 +446,8 @@ solve the trust-region subproblem $(_doc_TCG_subproblem) -on a manifold ``$(_math(:Manifold))nifold)))`` by using the Steihaug-Toint truncated conjugate-gradient (tCG) method. -This can be done inplace of `X`. +on a manifold ``$(_math(:Manifold))`` by using the Steihaug-Toint truncated conjugate-gradient (tCG) method. +This can be done in-place of `X`. For a description of the algorithm and theorems offering convergence guarantees, see [AbsilBakerGallivan:2006, ConnGouldToint:2000](@cite). diff --git a/src/solvers/trust_regions.jl b/src/solvers/trust_regions.jl index 8f308e252a..645955fcd6 100644 --- a/src/solvers/trust_regions.jl +++ b/src/solvers/trust_regions.jl @@ -58,14 +58,8 @@ $(_kwargs(:X; add_properties = [:as_Memory])) [`trust_regions`](@ref) """ mutable struct TrustRegionsState{ - P, - T, - Pr, - St <: AbstractManoptSolverState, - SC <: StoppingCriterion, - RTR <: AbstractRetractionMethod, - R <: Real, - Proj, + P, T, Pr, St <: AbstractManoptSolverState, + SC <: StoppingCriterion, RTR <: AbstractRetractionMethod, R <: Real, Proj, } <: AbstractSubProblemSolverState p::P X::T @@ -81,6 +75,11 @@ mutable struct TrustRegionsState{ sub_state::St p_proposal::P f_proposal::R + σ::R + reduction_threshold::R + reduction_factor::R + augmentation_threshold::R + augmentation_factor::R # Only required for Random mode Random HX::T Y::T @@ -88,38 +87,24 @@ mutable struct TrustRegionsState{ Z::T HZ::T τ::R - σ::R - reduction_threshold::R - reduction_factor::R - 
augmentation_threshold::R - augmentation_factor::R - function TrustRegionsState{P, T, Pr, St, SC, RTR, R, Proj}( - p::P, - X::T, - trust_region_radius::R, - max_trust_region_radius::R, - acceptance_rate::R, - ρ_regularization::R, - randomize::Bool, - stopping_criterion::SC, - retraction_method::RTR, - reduction_threshold::R, - augmentation_threshold::R, - sub_problem::Pr, - sub_state::St, - project!::Proj = (copyto!), - reduction_factor = 0.25, - augmentation_factor = 2.0, - σ::R = random ? 1.0e-6 : 0.0, + function TrustRegionsState( + sub_problem::Pr, sub_state::St; + p::P, X::T, + trust_region_radius::R, max_trust_region_radius::R, acceptance_rate::R, + ρ_regularization::R, randomize::Bool, + stopping_criterion::SC, retraction_method::RTR, reduction_threshold::R, + augmentation_threshold::R, project!::Proj = (copyto!), + reduction_factor::R, augmentation_factor::R, σ::R, + #random mode ones can stay uninitialized if not provided + HX::Union{T, Nothing} = nothing, + Y::Union{T, Nothing} = nothing, + HY::Union{T, Nothing} = nothing, + Z::Union{T, Nothing} = nothing, + HZ::Union{T, Nothing} = nothing, + τ::Union{R, Nothing} = nothing, ) where { - P, - T, - Pr, - St <: AbstractManoptSolverState, - SC <: StoppingCriterion, - RTR <: AbstractRetractionMethod, - R <: Real, - Proj, + P, T, Pr, St <: AbstractManoptSolverState, + SC <: StoppingCriterion, RTR <: AbstractRetractionMethod, R <: Real, Proj, } trs = new{P, T, Pr, St, SC, RTR, R, Proj}() trs.p = p @@ -139,57 +124,52 @@ mutable struct TrustRegionsState{ trs.augmentation_factor = augmentation_factor trs.project! = project! trs.σ = σ + !isnothing(HX) && (trs.HX = HX) + !isnothing(Y) && (trs.Y = Y) + !isnothing(HY) && (trs.HY = HY) + !isnothing(Z) && (trs.Z = Z) + !isnothing(HZ) && (trs.HZ = HZ) + !isnothing(τ) && (trs.τ = τ) return trs end end - +TrustRegionsState(M::AbstractManifold, st::AbstractManoptSolverState; kwargs...)
= error("Trust region method state can not be constructed based on $M and the sub state $st, a sub_problem is missing") function TrustRegionsState( - M::AbstractManifold, - sub_problem::Pr, - sub_state::St; - p::P = rand(M), - X::T = zero_vector(M, p), - acceptance_rate = 0.1, - ρ_regularization::R = 1000.0, + M::AbstractManifold, sub_problem::Pr, sub_state::St; + p::P = rand(M), X::T = zero_vector(M, p), + acceptance_rate::Real = 0.1, ρ_regularization::Real = 1000.0, randomize::Bool = false, stopping_criterion::SC = StopAfterIteration(1000) | StopWhenGradientNormLess(1.0e-6), - max_trust_region_radius::R = sqrt(manifold_dimension(M)), - trust_region_radius::R = max_trust_region_radius / 8, + max_trust_region_radius::Real = sqrt(manifold_dimension(M)), + trust_region_radius::Real = max_trust_region_radius / 8, retraction_method::RTR = default_retraction_method(M, typeof(p)), - reduction_threshold::R = 0.1, - reduction_factor = 0.25, - augmentation_threshold::R = 0.75, - augmentation_factor = 2.0, - project!::Proj = (copyto!), - σ = randomize ? 1.0e-4 : 0.0, + reduction_threshold::Real = 0.1, reduction_factor = 0.25, + augmentation_threshold::Real = 0.75, augmentation_factor::Real = 2.0, + project!::Proj = (copyto!), σ::Real = randomize ? 
1.0e-4 : 0.0, ) where { - P, - T, - Pr <: Union{AbstractManoptProblem, F} where {F}, - St <: AbstractManoptSolverState, - R <: Real, - SC <: StoppingCriterion, - RTR <: AbstractRetractionMethod, - Proj, + P, T, Pr <: Union{AbstractManoptProblem, F} where {F}, St <: AbstractManoptSolverState, + SC <: StoppingCriterion, RTR <: AbstractRetractionMethod, Proj, } - return TrustRegionsState{P, T, Pr, St, SC, RTR, R, Proj}( - p, - X, - trust_region_radius, - max_trust_region_radius, - acceptance_rate, - ρ_regularization, - randomize, - stopping_criterion, - retraction_method, - reduction_threshold, - augmentation_threshold, - sub_problem, - sub_state, - project!, - reduction_factor, - augmentation_factor, - σ, + R = promote_type( + typeof(acceptance_rate), typeof(ρ_regularization), typeof(max_trust_region_radius), + typeof(trust_region_radius), typeof(reduction_threshold), typeof(reduction_factor), + typeof(augmentation_factor), typeof(augmentation_threshold), typeof(σ) + ) + acceptance_rate = convert(R, acceptance_rate); ρ_regularization = convert(R, ρ_regularization) + max_trust_region_radius = convert(R, max_trust_region_radius); trust_region_radius = convert(R, trust_region_radius) + reduction_threshold = convert(R, reduction_threshold); reduction_factor = convert(R, reduction_factor) + augmentation_factor = convert(R, augmentation_factor); augmentation_threshold = convert(R, augmentation_threshold) + σ = convert(R, σ) + + return TrustRegionsState( + sub_problem, sub_state; + p = p, X = X, + trust_region_radius = trust_region_radius, max_trust_region_radius = max_trust_region_radius, + acceptance_rate = acceptance_rate, ρ_regularization = ρ_regularization, + (project!) 
= project!, randomize = randomize, σ = σ, + stopping_criterion = stopping_criterion, retraction_method = retraction_method, + reduction_threshold = reduction_threshold, augmentation_threshold = augmentation_threshold, + reduction_factor = reduction_factor, augmentation_factor = augmentation_factor, ) end function TrustRegionsState( @@ -221,12 +201,32 @@ function get_message(dcs::TrustRegionsState) # for now only the sub solver might have messages return get_message(dcs.sub_state) end -function show(io::IO, trs::TrustRegionsState) +function Base.show(io::IO, trs::TrustRegionsState) + print(io, "TrustRegionsState("); print(io, trs.sub_problem); print(io, ", "); print(io, trs.sub_state) + print(io, "; ") + print(io, "p = $(trs.p), X = $(trs.X), ") + print(io, "trust_region_radius = $(trs.trust_region_radius), max_trust_region_radius = $(trs.max_trust_region_radius), ") + print(io, "acceptance_rate = $(trs.acceptance_rate), ρ_regularization = $(trs.ρ_regularization), randomize = $(trs.randomize), ") + print(io, "reduction_threshold = $(trs.reduction_threshold), augmentation_threshold = $(trs.augmentation_threshold), ") + print(io, "(project!) = $(trs.project!), reduction_factor = $(trs.reduction_factor), augmentation_factor = $(trs.augmentation_factor), σ = $(trs.σ), ") + isdefined(trs, :HX) && print(io, "HX = $(trs.HX), ") + isdefined(trs, :Y) && print(io, "Y = $(trs.Y), ") + isdefined(trs, :HY) && print(io, "HY = $(trs.HY), ") + isdefined(trs, :Z) && print(io, "Z = $(trs.Z), ") + isdefined(trs, :HZ) && print(io, "HZ = $(trs.HZ), ") + isdefined(trs, :τ) && print(io, "τ = $(trs.τ), ") + print(io, "stopping_criterion = $(trs.stop), retraction_method = $(trs.retraction_method)") + return print(io, ")") +end +function status_summary(trs::TrustRegionsState; context::Symbol = :default) + (context === :short) && return repr(trs) i = get_count(trs, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(trs.stop) ? 
" (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the trust region solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(trs.stop) ? "Yes" : "No" - sub = repr(trs.sub_state) - sub = replace(sub, "\n" => "\n | ") + (context === :inline) && (return "A trust regions method state – $(Iter) $(has_converged(trs) ? "(converged)" : "")") + sub = _in_str(status_summary(trs.sub_state; context = context); indent = 1, headers = 1, indent_end = "| ") s = """ # Solver state for `Manopt.jl`s Trust Region Method $Iter @@ -238,14 +238,13 @@ function show(io::IO, trs::TrustRegionsState) * retraction method: $(trs.retraction_method) * ρ_regularization: $(trs.ρ_regularization) * trust region radius: $(trs.trust_region_radius) (max: $(trs.max_trust_region_radius)) - * sub solver state : - | $(sub) + * sub solver state: + $(sub) ## Stopping criterion - - $(status_summary(trs.stop)) + $(_in_str(status_summary(trs.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end _doc_TR = """ @@ -321,11 +320,7 @@ function trust_regions( end # Hessian (Function) and point function trust_regions( - M::AbstractManifold, - f, - grad_f, - Hess_f::TH, - p; + M::AbstractManifold, f, grad_f, Hess_f::TH, p; evaluation::AbstractEvaluationType = AllocatingEvaluation(), preconditioner = if evaluation isa InplaceEvaluation (M, Y, p, X) -> (Y .= X) @@ -363,14 +358,8 @@ function trust_regions( M, copy(M, p), grad_f; evaluation = evaluation, retraction_method = retraction_method ) return trust_regions( - M, - f, - grad_f, - hess_f, - p; - evaluation = evaluation, - retraction_method = retraction_method, - kwargs..., + M, f, grad_f, hess_f, p; + evaluation = evaluation, retraction_method = retraction_method, kwargs..., ) end # Objective @@ -400,11 +389,7 @@ function trust_regions!( M, copy(M, p), grad_f; evaluation = evaluation, 
retraction_method = retraction_method ) return trust_regions!( - M, - f, - grad_f, - hess_f, - p; + M, f, grad_f, hess_f, p; evaluation = evaluation, retraction_method = retraction_method, kwargs..., diff --git a/src/solvers/vectorbundle_newton.jl b/src/solvers/vectorbundle_newton.jl index a7b09e6978..914c97db24 100644 --- a/src/solvers/vectorbundle_newton.jl +++ b/src/solvers/vectorbundle_newton.jl @@ -46,6 +46,12 @@ mutable struct VectorBundleNewtonState{ stop::TStop stepsize::TStep retraction_method::TRTM + function VectorBundleNewtonState( + sub_problem::Pr, sub_state::St; + p::P, p_trial::P, X::T, stopping_criterion::TStop, stepsize::TStep, retraction_method::TRTM + ) where {P, T, Pr, St, TStop <: StoppingCriterion, TStep <: Stepsize, TRTM <: AbstractRetractionMethod} + return new{P, T, Pr, St, TStop, TStep, TRTM}(p, p_trial, X, sub_problem, sub_state, stopping_criterion, stepsize, retraction_method) + end end function VectorBundleNewtonState( @@ -57,12 +63,19 @@ function VectorBundleNewtonState( ) where { P, T, Pr, Op, RM <: AbstractRetractionMethod, SC <: StoppingCriterion, S <: Stepsize, } - return VectorBundleNewtonState{P, T, Pr, Op, SC, S, RM}( - p, copy(M, p), X, - sub_problem, sub_state, stopping_criterion, stepsize, retraction_method + return VectorBundleNewtonState( + sub_problem, sub_state; p = p, p_trial = copy(M, p), X = X, + stopping_criterion = stopping_criterion, stepsize = stepsize, retraction_method = retraction_method ) end +function Base.show(io::IO, vbns::VectorBundleNewtonState) + print(io, "VectorBundleNewtonState(", vbns.sub_problem, ", ", vbns.sub_state, "; p = ", vbns.p) + print(io, ", retraction_method = ", vbns.retraction_method, ", stopping_criterion = $(status_summary(vbns.stop; context = :short)), ") + print(io, "stepsize = ", vbns.stepsize, ", X = ", vbns.X) + return print(io, ")") +end + @doc """ AffineCovariantStepsize <: Stepsize @@ -108,8 +121,8 @@ Initializes all fields, where none of them is mandatory.
The length is set to `` Since the computation of the convergence monitor ``θ`` requires simplified Newton directions a method for computing them has to be provided. This should be implemented as a method of the `newton_equation(M, VB, p, p_trial)` as parameters and returning a representation of the (transported) ``F(p_{$(_tex(:rm, "trial"))})``. """ -mutable struct AffineCovariantStepsize{T, R <: Real, N <: Union{Real, Missing}} <: Stepsize - α::T +mutable struct AffineCovariantStepsize{R <: Real, N <: Union{Real, Missing}} <: Stepsize + α::R θ::R θ_des::R θ_acc::R @@ -117,12 +130,35 @@ mutable struct AffineCovariantStepsize{T, R <: Real, N <: Union{Real, Missing}} outer_norm::N end function AffineCovariantStepsize( - M::AbstractManifold = DefaultManifold(2); - α = 1.0, θ = 1.3, θ_des = 0.5, θ_acc = 1.1 * θ_des, outer_norm::N = missing + ::AbstractManifold = DefaultManifold(2); + α::Real = 1.0, θ::Real = 1.3, θ_des::Real = 0.5, θ_acc::Real = 1.1 * θ_des, outer_norm::N = missing ) where {N <: Union{Real, Missing}} - return AffineCovariantStepsize{typeof(α), typeof(θ), N}(α, θ, θ_des, θ_acc, 1.0, outer_norm) + R = promote_type(typeof(α), typeof(θ), typeof(θ_des), typeof(θ_acc)) + return AffineCovariantStepsize{R, N}( + convert(R, α), convert(R, θ), convert(R, θ_des), convert(R, θ_acc), convert(R, 1.0), outer_norm + ) +end +function Base.show(io::IO, acs::AffineCovariantStepsize) + print(io, "AffineCovariantStepsize(; α = ", acs.α, ", θ = ", acs.θ, ", θ_des = ", acs.θ_des) + print(io, ", θ_acc = ", acs.θ_acc) + !(ismissing(acs.outer_norm)) && print(io, ", outer_norm = ", acs.outer_norm) + return print(io, ")") end +function status_summary(acs::AffineCovariantStepsize; context = :default) + (context === :short) && return repr(acs) + (context === :inline) && return "An affine covariant step size (last step size: $(acs.last_stepsize))" + on = ismissing(acs.outer_norm) ?
"" : "\n* outer norm: $(_MANOPT_INDENT)$(acs.outer_norm)" + return """ + An affine covariant step size + (last step size: $(acs.last_stepsize)) + ## Parameters + * damping factor α: $(_MANOPT_INDENT)$(acs.α) + * θ: $(_MANOPT_INDENT)$(acs.θ) + * desired θ: $(_MANOPT_INDENT)$(acs.θ_des) + * acceptable θ: $(_MANOPT_INDENT)$(acs.θ_acc)$(on) + """ +end function (acs::AffineCovariantStepsize)( amp::AbstractManoptProblem, ams::VectorBundleNewtonState, ::Any, args...; kwargs... ) @@ -159,22 +195,27 @@ end default_stepsize(M::AbstractManifold, ::Type{VectorBundleNewtonState}) = ConstantStepsize(M) -function show(io::IO, vbns::VectorBundleNewtonState) +function status_summary(vbns::VectorBundleNewtonState; context::Symbol = :default) + (context === :short) && return repr(vbns) i = get_count(vbns, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(vbns.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the vector bundle Newton solver$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(vbns.stop) ? "Yes" : "No" + _is_inline(context) && (return "$(repr(vbns)) – $(Iter) $(has_converged(vbns) ? 
"(converged)" : "")") s = """ # Solver state for `Manopt.jl`s Vector bundle Newton method $Iter ## Parameters * retraction method: $(vbns.retraction_method) - * step size: $(vbns.stepsize) - ## Stopping criterion + ## Stepsize + $(_in_str(status_summary(vbns.stepsize; context = context); indent = 0, headers = 1)) - $(status_summary(vbns.stop)) + ## Stopping criterion + $(_in_str(status_summary(vbns.stop; context = context); indent = 0, headers = 1)) This indicates convergence: $Conv""" - return print(io, s) + return s end @@ -192,6 +233,28 @@ struct VectorBundleManoptProblem{ newton_equation::O end +function Base.show(io::IO, vbmp::VectorBundleManoptProblem) + print(io, "VectorBundleManoptProblem(", vbmp.manifold, ", ", vbmp.vectorbundle, ", ") + return print(io, vbmp.newton_equation, ")") +end + +function status_summary(vbmp::VectorBundleManoptProblem; context::Symbol = :default) + (context === :short) && return repr(vbmp) + (context === :inline) && return "A vector bundle problem defined on $(vbmp.manifold) with range $(vbmp.vectorbundle) and newton equation $(vbmp.newton_equation)" + return """ + A vector bundle problem representing a vector bundle newton equation objective + + ## Manifold + $(_in_str(repr(vbmp.manifold); indent = 1)) + + ## Range + $(_in_str(repr(vbmp.vectorbundle); indent = 1)) + + ## Vector bundle newton equation + $(_in_str(repr(vbmp.newton_equation); indent = 1)) + """ +end + @doc """ get_vectorbundle(vbp::VectorBundleManoptProblem) diff --git a/test/MOI_wrapper.jl b/test/MOI_wrapper.jl deleted file mode 100644 index 467ba0d34f..0000000000 --- a/test/MOI_wrapper.jl +++ /dev/null @@ -1,136 +0,0 @@ -using Test -using LinearAlgebra -using Manifolds -using Manopt -using JuMP - -function _test_allocs(problem::Manopt.AbstractManoptProblem, x, g) - Manopt.get_cost(problem, x) # Compilation - @test 0 == @allocated Manopt.get_cost(problem, x) - Manopt.get_gradient!(problem, g, x) # Compilation - @test 0 == @allocated Manopt.get_gradient!(problem, g, 
x) - return nothing -end - -_test_allocs(optimizer, x, g) = _test_allocs(optimizer.problem, x, g) - -function _test_sphere_sum(model, obj_sign) - @test MOI.get(unsafe_backend(model), MOI.ResultCount()) == 0 - optimize!(model) - @test MOI.get(unsafe_backend(model), MOI.NumberOfVariables()) == 3 - @test termination_status(model) == MOI.LOCALLY_SOLVED - @test primal_status(model) == MOI.FEASIBLE_POINT - @test primal_status(model) == MOI.FEASIBLE_POINT - @test dual_status(model) == MOI.NO_SOLUTION - @test objective_value(model) ≈ obj_sign * √3 - @test value.(model[:x]) ≈ obj_sign * inv(√3) * ones(3) rtol = 1.0e-2 - @test raw_status(model) isa String - @test raw_status(model)[end] != '\n' - _test_allocs(unsafe_backend(model), zeros(3), zeros(3)) - return nothing -end - -function test_sphere(descent_state_type; kws...) - model = Model(Manopt.JuMP_Optimizer; kws...) - @test MOI.supports(JuMP.backend(model), MOI.RawOptimizerAttribute("descent_state_type")) - set_attribute(model, "descent_state_type", descent_state_type) - start = normalize(1:3) - @variable(model, x[i = 1:3] in Sphere(2), start = start[i]) - - objective = let - # We create `grad_f` here to avoid having an allocation in `eval_grad_sum_cb` - # so that we can easily test that the rest is allocation-free by testing that - # `@allocated` returns 0 the whole call to `get_gradient!`. 
- # To avoid creating a closure capturing the `grad_f` object, - # we use the `let` block trick detailed in: - # https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-captured - grad_f = ones(3) - - function eval_sum_cb(M, x) - return sum(x) - end - - function eval_grad_sum_cb(M, g, X) - return Manopt.riemannian_gradient!(M, g, X, grad_f) - end - - Manopt.ManifoldGradientObjective( - eval_sum_cb, eval_grad_sum_cb; evaluation = Manopt.InplaceEvaluation() - ) - end - - @testset "$obj_sense" for (obj_sense, obj_sign) in - [(MOI.MIN_SENSE, -1), (MOI.MAX_SENSE, 1)] - @testset "JuMP objective" begin - @objective(model, obj_sense, sum(x)) - _test_sphere_sum(model, obj_sign) - end - - @testset "Manopt objective" begin - @objective(model, obj_sense, objective) - _test_sphere_sum(model, obj_sign) - end - end - @test contains( - sprint(show, model), - "Vector{VariableRef} in ManoptJuMPExt.ManifoldSet{Sphere{ℝ, ManifoldsBase.TypeParameter{Tuple{2}}}}: 1", - ) - @test contains(sprint(print, model), "[x[1], x[2], x[3]] in $(Sphere(2))") - @test contains( - JuMP.model_string(MIME("text/latex"), model), - "[x_{1}, x_{2}, x_{3}] \\in $(Sphere(2))", - ) - - @objective(model, Min, sum(xi^4 for xi in x)) - set_start_value.(x, start) - optimize!(model) - @test objective_value(model) ≈ 1 / 3 rtol = 1.0e-4 - @test value.(x) ≈ inv(√3) * ones(3) rtol = 1.0e-2 - @test raw_status(model) isa String - @test raw_status(model)[end] != '\n' - - set_objective_sense(model, MOI.FEASIBILITY_SENSE) - optimize!(model) - @test sum(value.(x) .^ 2) ≈ 1 - - set_start_value(x[3], nothing) - err = ErrorException("No starting value specified for `3`th variable.") - @test_throws err optimize!(model) - set_start_value(x[3], 1.0) - - @variable(model, [1:2, 1:2] in Stiefel(2, 2)) - @test_throws MOI.AddConstraintNotAllowed optimize!(model) - return -end - -@testset "JuMP tests" begin - test_sphere(GradientDescentState; add_bridges = true) - test_sphere(GradientDescentState; add_bridges = false) 
-end - -function test_runtests() - optimizer = Manopt.JuMP_Optimizer() - config = MOI.Test.Config(; exclude = Any[MOI.ListOfModelAttributesSet]) - # Test MOI getter - @test MOI.get(optimizer, MOI.RawOptimizerAttribute("descent_state_type")) == GradientDescentState - return MOI.Test.runtests( - optimizer, - config; - exclude = String[ - # See https://github.com/jump-dev/MathOptInterface.jl/pull/2195 - "test_model_copy_to_UnsupportedConstraint", - "test_model_copy_to_UnsupportedAttribute", - "test_model_ScalarFunctionConstantNotZero", - # See https://github.com/jump-dev/MathOptInterface.jl/pull/2196/ - "test_objective_ScalarQuadraticFunction_in_ListOfModelAttributesSet", - "test_objective_ScalarAffineFunction_in_ListOfModelAttributesSet", - "test_objective_VariableIndex_in_ListOfModelAttributesSet", - "test_objective_set_via_modify", - "test_objective_ObjectiveSense_in_ListOfModelAttributesSet", - ], - ) -end - -@testset "MOI tests" begin - test_runtests() -end diff --git a/test/helpers/test_linesearches.jl b/test/helpers/test_linesearches.jl index 69e0526df7..1e10f37d6c 100644 --- a/test/helpers/test_linesearches.jl +++ b/test/helpers/test_linesearches.jl @@ -26,12 +26,8 @@ using Test x0 = vcat(zeros(n_dims - 1), 1.0) ls_hz = Manopt.LineSearchesStepsize(M, LineSearches.HagerZhang()) x_opt = quasi_Newton( - M, - rosenbrock, - rosenbrock_grad!, - x0; - stepsize = ls_hz, - debug = [], + M, rosenbrock, rosenbrock_grad!, x0; + stepsize = ls_hz, debug = [], evaluation = InplaceEvaluation(), stopping_criterion = StopAfterIteration(1000) | StopWhenGradientNormLess(1.0e-6), return_state = true, @@ -39,6 +35,7 @@ using Test @test rosenbrock(M, get_iterate(x_opt)) < 1.503084 @test startswith(sprint(show, ls_hz), "LineSearchesStepsize(HagerZhang") + @test startswith(Manopt.status_summary(ls_hz), "A step size wrapper for LineSearches.jl") # make sure get_last_stepsize works mgo = ManifoldGradientObjective( @@ -62,4 +59,20 @@ using Test initialize_solver!(mp, st_qn) ls_mt = 
Manopt.LineSearchesStepsize(M, LineSearches.MoreThuente()) @test_throws ErrorException ls_mt(mp_throw, st_qn, 1; fp = rosenbrock(M, x0)) + + # test max stepsize limit enforcement + @test ls_hz(mp, st_qn, 1, [1.0, 2.0, 3.0, 4.0, 0.0]; stop_when_stepsize_exceeds = 0.1) == 0.1 + + @testset "max stepsize limit setting" begin + lss = [ + LineSearches.MoreThuente(), + LineSearches.HagerZhang(), + ] + for ls in lss + nls = Manopt.linesearches_set_max_alpha(ls, 0.5) + @test Manopt.linesearches_get_max_alpha(nls) == 0.5 + nls2 = Manopt.linesearches_set_max_alpha(ls, Inf) + @test Manopt.linesearches_get_max_alpha(nls2) == Inf + end + end end diff --git a/test/helpers/test_manifold_extra_functions.jl b/test/helpers/test_manifold_extra_functions.jl index f7a7540429..c775cb1cc9 100644 --- a/test/helpers/test_manifold_extra_functions.jl +++ b/test/helpers/test_manifold_extra_functions.jl @@ -69,6 +69,7 @@ Random.seed!(42) R3 = Euclidean(3) TR3 = TangentBundle(R3) + p = [0.0, 1.0, 0.0] X = [0.0, 0.0, 0.0] @@ -83,6 +84,14 @@ Random.seed!(42) @test Manopt.max_stepsize(R3, p) == Inf @test Manopt.max_stepsize(TR3, ArrayPartition(p, X)) == Inf + S_R3 = ProductManifold(M, R3) + @test Manopt.max_stepsize(S_R3) ≈ π + @test Manopt.max_stepsize(S_R3, ArrayPartition(p, [0.0, 0.0, 0.0])) ≈ π + + S_pow = PowerManifold(M, NestedPowerRepresentation(), 3) + @test Manopt.max_stepsize(S_pow) ≈ π + @test Manopt.max_stepsize(S_pow, [p, p, p]) ≈ π + Mfr = FixedRankMatrices(5, 4, 2) pfr = SVDMPoint( [ @@ -103,6 +112,9 @@ Random.seed!(42) M = Hyperrectangle([-3, -1.5], [3, 1.5]) @test Manopt.max_stepsize(M) ≈ 6.0 @test Manopt.max_stepsize(M, [-1, 0.5]) ≈ 4.0 + + M = ProbabilitySimplex(3) + @test Manopt.max_stepsize(M) == 1.0 end @testset "Vector space default" begin @test Manopt.Rn(Val(:Manopt), 3) isa ManifoldsBase.DefaultManifold diff --git a/test/plans/test_cache.jl b/test/plans/test_cache.jl index dc9a0d2a0a..43d32f6d70 100644 --- a/test/plans/test_cache.jl +++ b/test/plans/test_cache.jl @@ -67,19 
+67,15 @@ end mgoa = ManifoldGradientObjective(TestCostCount(0), TestGradCount(0)) # Init to copy of p - init cache sco1 = Manopt.SimpleManifoldCachedObjective(M, mgoa; p = copy(M, p)) - @test repr(sco1) == "SimpleManifoldCachedObjective{AllocatingEvaluation,$(mgoa)}" - @test startswith( - repr((sco1, 1.0)), - """## Cache - A `SimpleManifoldCachedObjective`""", - ) - @test startswith( - repr((sco1, Manopt.Test.DummyState())), - """Manopt.Test.DummyState(Float64[]) - - ## Cache - A `SimpleManifoldCachedObjective`""", - ) + sco1r = repr(sco1) + @test startswith(sco1r, "SimpleManifoldCachedObjective") + @test contains(sco1r, "initialized = ") + sco1s = Manopt.status_summary(sco1) + @test contains(sco1s, "## Cache") + # together with a state -> append cache information after state + sco1t = repr((sco1, GradientDescentState(M))) + @test contains(sco1t, "# Solver state for `Manopt.jl`s Gradient Descent") + @test contains(sco1t, "## Cache") # evaluated on init -> 1 @test sco1.objective.functions[:cost].i == 1 @test sco1.objective.functions[:gradient].i == 1 @@ -224,10 +220,9 @@ end o = ManifoldGradientObjective(f, grad_f) co = ManifoldCountObjective(M, o, [:Cost, :Gradient, :Differential]) lco = objective_cache_factory(M, co, (:LRU, [:Cost, :Gradient, :Differential])) - @test startswith(repr(lco), "## Cache\n * ") - @test startswith( - repr((lco, Manopt.Test.DummyState())), - "Manopt.Test.DummyState(Float64[])\n\n## Cache\n * ", + @test contains(repr(lco), "## Cache\n * ") + @test contains( + repr((lco, Manopt.Test.DummyState())), "Manopt.Test.DummyState(Float64[])", ) ro = Manopt.Test.DummyDecoratedObjective(o) #undecorated works as well diff --git a/test/plans/test_conjugate_residual_plan.jl b/test/plans/test_conjugate_residual_plan.jl index 26e408c8d3..dd9eb66a28 100644 --- a/test/plans/test_conjugate_residual_plan.jl +++ b/test/plans/test_conjugate_residual_plan.jl @@ -51,7 +51,8 @@ using Manifolds, Manopt, Test @test set_gradient!(crs, TpM, X0) == crs # setters 
return state @test get_gradient(crs) == X0 @test startswith( - repr(crs), "# Solver state for `Manopt.jl`s Conjugate Residual Method" + Manopt.status_summary(crs; context = :default), + "# Solver state for `Manopt.jl`s Conjugate Residual Method\n" ) crs2 = ConjugateResidualState(TpM, slso2) @test set_iterate!(crs2, TpM, X0) == crs2 # setters return state @@ -59,7 +60,8 @@ using Manifolds, Manopt, Test @test set_gradient!(crs2, TpM, X0) == crs2 # setters return state @test get_gradient(crs2) == X0 @test startswith( - repr(crs2), "# Solver state for `Manopt.jl`s Conjugate Residual Method" + Manopt.status_summary(crs2; context = :default), + "# Solver state for `Manopt.jl`s Conjugate Residual Method\n" ) end @testset "StopWhenRelativeResidualLess" begin diff --git a/test/plans/test_constrained_plan.jl b/test/plans/test_constrained_plan.jl index 28c0a07edf..bc5b0c9663 100644 --- a/test/plans/test_constrained_plan.jl +++ b/test/plans/test_constrained_plan.jl @@ -71,51 +71,31 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools f, grad_f, g, grad_g, h, grad_h; inequality_constraints = 2, equality_constraints = 1 ) cofaA = ConstrainedManifoldObjective( # Array representation tangent vector - f, - grad_f, - g, - grad_gA, - h, - grad_hA; - inequality_constraints = 2, - equality_constraints = 1, + f, grad_f, g, grad_gA, h, grad_hA; + inequality_constraints = 2, equality_constraints = 1, ) cofm = ConstrainedManifoldObjective( - f, - grad_f!, - g!, - grad_g!, - h!, - grad_h!; - evaluation = InplaceEvaluation(), - inequality_constraints = 2, - equality_constraints = 1, + f, grad_f!, g!, grad_g!, h!, grad_h!; + evaluation = InplaceEvaluation(), inequality_constraints = 2, equality_constraints = 1, ) cova = ConstrainedManifoldObjective( - f, - grad_f, - [g1, g2], - [grad_g1, grad_g2], - [h1], - [grad_h1]; - inequality_constraints = 2, - equality_constraints = 1, + f, grad_f, [g1, g2], [grad_g1, grad_g2], [h1], [grad_h1]; + inequality_constraints = 2, 
equality_constraints = 1, ) covm = ConstrainedManifoldObjective( - f, - grad_f!, - [g1, g2], - [grad_g1!, grad_g2!], - [h1], - [grad_h1!]; - evaluation = InplaceEvaluation(), - inequality_constraints = 2, - equality_constraints = 1, + f, grad_f!, [g1, g2], [grad_g1!, grad_g2!], [h1], [grad_h1!]; + evaluation = InplaceEvaluation(), inequality_constraints = 2, equality_constraints = 1, ) - @test repr(cofa) === "ConstrainedManifoldObjective{AllocatingEvaluation}" - @test repr(cofm) === "ConstrainedManifoldObjective{InplaceEvaluation}" - @test repr(cova) === "ConstrainedManifoldObjective{AllocatingEvaluation}" - @test repr(covm) === "ConstrainedManifoldObjective{InplaceEvaluation}" + rcofa = repr(cofa); rcofm = repr(cofm); rcova = repr(cova); rcovm = repr(covm) + for r in [rcofa, rcofm, rcova, rcovm] + @test startswith(r, "ConstrainedManifoldObjective(ManifoldFirstOrderObjective(; cost = f") + end + for r in [rcofa, rcova] + @test contains(r, "evaluation = AllocatingEvaluation()") + end + for r in [rcofm, rcovm] + @test contains(r, "evaluation = InplaceEvaluation()") + end # Test cost/grad pass through @test Manopt.get_cost_function(cofa)(M, p) == f(M, p) @test Manopt.get_gradient_function(cofa)(M, p) == grad_f(M, p) @@ -135,69 +115,34 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools @test Manopt.get_unconstrained_objective(cofa) isa ManifoldFirstOrderObjective cofha = ConstrainedManifoldObjective( - f, - grad_f, - g, - grad_g, - h, - grad_h; - hess_f = hess_f, - hess_g = hess_g, - hess_h = hess_h, - inequality_constraints = 2, - equality_constraints = 1, + f, grad_f, g, grad_g, h, grad_h; + hess_f = hess_f, hess_g = hess_g, hess_h = hess_h, + inequality_constraints = 2, equality_constraints = 1, ) cofhm = ConstrainedManifoldObjective( - f, - grad_f!, - g!, - grad_g!, - h!, - grad_h!; - hess_f = (hess_f!), - hess_g = (hess_g!), - hess_h = (hess_h!), - evaluation = InplaceEvaluation(), - inequality_constraints = 2, - equality_constraints = 
1, + f, grad_f!, g!, grad_g!, h!, grad_h!; + hess_f = (hess_f!), hess_g = (hess_g!), hess_h = (hess_h!), + evaluation = InplaceEvaluation(), inequality_constraints = 2, equality_constraints = 1, ) covha = ConstrainedManifoldObjective( - f, - grad_f, - [g1, g2], - [grad_g1, grad_g2], - [h1], - [grad_h1]; - hess_f = hess_f, - hess_g = [hess_g1, hess_g2], - hess_h = [hess_h1], - inequality_constraints = 2, - equality_constraints = 1, + f, grad_f, [g1, g2], [grad_g1, grad_g2], [h1], [grad_h1]; + hess_f = hess_f, hess_g = [hess_g1, hess_g2], hess_h = [hess_h1], + inequality_constraints = 2, equality_constraints = 1, ) covhm = ConstrainedManifoldObjective( - f, - grad_f!, - [g1, g2], - [grad_g1!, grad_g2!], - [h1], - [grad_h1!]; - hess_f = (hess_f!), - hess_g = [hess_g1!, hess_g2!], - hess_h = [hess_h1!], - evaluation = InplaceEvaluation(), - inequality_constraints = 2, - equality_constraints = 1, + f, grad_f!, [g1, g2], [grad_g1!, grad_g2!], [h1], [grad_h1!]; + hess_f = (hess_f!), hess_g = [hess_g1!, hess_g2!], hess_h = [hess_h1!], + evaluation = InplaceEvaluation(), inequality_constraints = 2, equality_constraints = 1, ) mp = DefaultManoptProblem(M, cofha) cop = ConstrainedManoptProblem(M, cofha) cop2 = ConstrainedManoptProblem( - M, - cofaA; - gradient_equality_range = ArrayPowerRepresentation(), - gradient_inequality_range = ArrayPowerRepresentation(), + M, cofaA; + gradient_equality_range = ArrayPowerRepresentation(), gradient_inequality_range = ArrayPowerRepresentation(), ) - + @test startswith(Manopt.status_summary(cop), "A constrained optimization problem for Manopt.jl") + @test startswith(repr(cop), "ConstrainedManoptProblem(") @testset "ConstrainedManoptProblem special cases" begin Y = zero_vector(M, p) for mcp in [mp, cop] @@ -357,28 +302,22 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools end @testset "is_feasible & DebugFeasibility" begin coh = ConstrainedManifoldObjective( - f, - grad_f; - hess_f = hess_f, - g = g, - grad_g = 
grad_g, - hess_g = hess_g, - h = h, - grad_h = grad_h, - hess_h = hess_h, - M = M, + f, grad_f; M = M, hess_f = hess_f, + g = g, grad_g = grad_g, hess_g = hess_g, + h = h, grad_h = grad_h, hess_h = hess_h, ) @test is_feasible(M, coh, [-2.0, 3.0, 0.5]; error = :info) @test_throws ErrorException is_feasible(M, coh, p; error = :error) @test_logs (:info,) !is_feasible(M, coh, p; error = :info) @test_logs (:warn,) !is_feasible(M, coh, p; error = :warn) - st = Manopt.StepsizeState(p, X) + st = Manopt.StepsizeState(; p = p, X = X) mp = DefaultManoptProblem(M, coh) io = IOBuffer() df = DebugFeasibility(; io = io) @test repr(df) === "DebugFeasibility([\"feasible: \", :Feasible], at_init=true)" + @test startswith(Manopt.status_summary(df), "A DebugAction printing Feasibility information of the current iterate") # short form: - @test Manopt.status_summary(df) === "(:Feasibility, [\"feasible: \", :Feasible])" + @test Manopt.status_summary(df; context = :short) === "(:Feasibility, [\"feasible: \", :Feasible])" df(mp, st, 1) @test String(take!(io)) == "feasible: No" end @@ -394,17 +333,10 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools q[N, 3] = λ q[N, 4] = s coh = ConstrainedManifoldObjective( - f, - grad_f; - hess_f = hess_f, - g = g, - grad_g = grad_g, - hess_g = hess_g, - h = h, - grad_h = grad_h, - hess_h = hess_h, - M = M, + f, grad_f; hess_f = hess_f, + g = g, grad_g = grad_g, hess_g = hess_g, h = h, grad_h = grad_h, hess_h = hess_h, M = M, ) + @test startswith(Manopt.status_summary(coh), "A constrained objective with") @testset "Lagrangian Cost, Grad and Hessian" begin Lc = LagrangianCost(coh, μ, λ) @test startswith(repr(Lc), "LagrangianCost") @@ -439,7 +371,8 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools @testset "Full KKT and its norm" begin # Full KKT Vector field KKTvf = KKTVectorField(coh) - @test startswith(repr(KKTvf), "KKTVectorField\n") + @test startswith(repr(KKTvf), "KKTVectorField(") + @test 
startswith(Manopt.status_summary(KKTvf), "The KKT vector field for the constrained objective") Xp = LagrangianGradient(coh, μ, λ)(M, p) #Xμ = g + s; Xλ = h, Xs = μ .* s Y = KKTvf(N, q) @test Y[N, 1] == Xp @@ -447,7 +380,8 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools @test Y[N, 3] == c[2] @test Y[N, 4] == μ .* s KKTvfJ = KKTVectorFieldJacobian(coh) - @test startswith(repr(KKTvfJ), "KKTVectorFieldJacobian\n") + @test startswith(repr(KKTvfJ), "KKTVectorFieldJacobian(") + @test startswith(Manopt.status_summary(KKTvfJ), "The Jacobian of the KKT vector field for the constrained objective") # Xp = LagrangianHessian(coh, μ, λ)(M, p, Y[N, 1]) + @@ -462,11 +396,10 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools @test Z[N, 4] == μ .* Y[N, 4] + s .* Y[N, 2] KKTvfAdJ = KKTVectorFieldAdjointJacobian(coh) - @test startswith(repr(KKTvfAdJ), "KKTVectorFieldAdjointJacobian\n") - Xp2 = - LagrangianHessian(coh, μ, λ)(M, p, Y[N, 1]) + - sum(Y[N, 2] .* gg) + - sum(Y[N, 3] .* gh) + @test startswith(repr(KKTvfAdJ), "KKTVectorFieldAdjointJacobian(") + @test startswith(Manopt.status_summary(KKTvfAdJ), "The adjoint Jacobian of the KKT vector field for the constrained objective") + + Xp2 = LagrangianHessian(coh, μ, λ)(M, p, Y[N, 1]) + sum(Y[N, 2] .* gg) + sum(Y[N, 3] .* gh) Xμ2 = [inner(M, p, gg[i], Y[N, 1]) + s[i] * Y[N, 4][i] for i in 1:length(gg)] Xλ2 = [inner(M, p, gh[j], Y[N, 1]) for j in 1:length(gh)] Z2 = KKTvfAdJ(N, q, Y) @@ -477,11 +410,13 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools # Full KKT Vector field norm – the Merit function KKTvfN = KKTVectorFieldNormSq(coh) - @test startswith(repr(KKTvfN), "KKTVectorFieldNormSq\n") + @test startswith(repr(KKTvfN), "KKTVectorFieldNormSq(") + @test startswith(Manopt.status_summary(KKTvfN), "The KKT vector field in normed squared for the constrained objective") vfn = KKTvfN(N, q) @test vfn == norm(N, q, Y)^2 KKTvfNG = 
KKTVectorFieldNormSqGradient(coh) - @test startswith(repr(KKTvfNG), "KKTVectorFieldNormSqGradient\n") + @test startswith(repr(KKTvfNG), "KKTVectorFieldNormSqGradient(") + @test startswith(Manopt.status_summary(KKTvfNG), "The gradient of the KKT vector field in normed squared for the constrained objective") Zg1 = KKTvf(N, q) Zg2 = 2.0 * KKTvfAdJ(N, q, Zg1) W = KKTvfNG(N, q) @@ -489,7 +424,7 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools end @testset "Condensed KKT, Jacobian" begin CKKTvf = CondensedKKTVectorField(coh, μ, s, β) - @test startswith(repr(CKKTvf), "CondensedKKTVectorField\n") + @test startswith(repr(CKKTvf), "CondensedKKTVectorField(") b1 = gf + sum(λ .* gh) + @@ -507,7 +442,7 @@ using LRUCache, Manifolds, ManifoldsBase, Manopt, Test, RecursiveArrayTools CKKTvf(Nc, V2, qc) @test V2 == V CKKTVfJ = CondensedKKTVectorFieldJacobian(coh, μ, s, β) - @test startswith(repr(CKKTVfJ), "CondensedKKTVectorFieldJacobian\n") + @test startswith(repr(CKKTVfJ), "CondensedKKTVectorFieldJacobian(") Yc = zero_vector(Nc, qc) Yc[Nc, 1] = [1.0, 3.0, 5.0] Yc[Nc, 2] = [7.0] diff --git a/test/plans/test_counts.jl b/test/plans/test_counts.jl index 07a4944a51..ddb4970b71 100644 --- a/test/plans/test_counts.jl +++ b/test/plans/test_counts.jl @@ -52,12 +52,13 @@ using LinearAlgebra: Symmetric ) @test get_count(c_obj3, :Gradient) == 2 @test get_count(c_obj3, :Cost) == -1 # nonexistent - @test startswith(repr(c_obj), "## Statistics") - @test startswith(Manopt.status_summary(c_obj), "## Statistics") - # also for the `repr` call - @test startswith(repr((c_obj, p)), "## Statistics") - # but this also includes the hint, how to access the result - @test endswith(repr((c_obj, p)), "on this variable.") + @test startswith(repr(c_obj), "ManifoldCountObjective(ManifoldFirstOrderObjective") + status_obj = Manopt.status_summary(c_obj) + @test contains(status_obj, "## Statistics") + io = IOBuffer() + Manopt.status_summary(io, c_obj) + @test String(take!(io)) == 
status_obj + @test contains(Manopt.status_summary(c_obj; context = :inline), "(statistics:") rc_obj = Manopt.Test.DummyDecoratedObjective(c_obj) @test get_count(rc_obj, :Gradient) == 4 #still works if count is encapsulated @test_throws ErrorException get_count(obj, :Gradient) # no count objective diff --git a/test/plans/test_debug.jl b/test/plans/test_debug.jl index f1686cb0b8..886e64fd09 100644 --- a/test/plans/test_debug.jl +++ b/test/plans/test_debug.jl @@ -8,6 +8,8 @@ function ManifoldsBase.default_inverse_retraction_method(::TestPolarManifold) end struct TestDebugAction <: DebugAction end +Base.show(io::IO, ::TestDebugAction) = print(io, "TestDebugAction()") + struct TestMessageState <: AbstractManoptSolverState end Manopt.get_message(::TestMessageState) = "DebugTest" @@ -27,25 +29,33 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value M = ManifoldsBase.DefaultManifold(2) p = [4.0, 2.0] st = GradientDescentState( - M; - p = p, - stopping_criterion = StopAfterIteration(10), - stepsize = Manopt.ConstantStepsize(M), + M; p = p, + stopping_criterion = StopAfterIteration(10), stepsize = Manopt.ConstantStepsize(M), ) f(M, q) = distance(M, q, p) .^ 2 grad_f(M, q) = -2 * log(M, q, p) - # summary fallback to show + tda = TestDebugAction() + # summary fallback to show - inherited from AbstractStateAction(s) @test Manopt.status_summary(TestDebugAction()) === "TestDebugAction()" + show(io, tda) + @test String(take!(io)) === "TestDebugAction()" mp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) a1 = DebugDivider("|"; io = io) - @test Manopt.dispatch_state_decorator(DebugSolverState(st, a1)) === Val{true}() + dst = DebugSolverState(st, a1) + dst_empty = DebugSolverState(st, []) + @test Manopt.dispatch_state_decorator(dst) === Val{true}() # constructors - @test DebugSolverState(st, a1).debugDictionary[:Iteration] == a1 - @test DebugSolverState(st, [a1]).debugDictionary[:Iteration].group[1] == a1 - @test DebugSolverState(st, Dict(:A => 
a1)).debugDictionary[:A] == a1 - @test DebugSolverState(st, ["|"]).debugDictionary[:Iteration].divider == a1.divider - @test endswith(repr(DebugSolverState(st, a1)), "\"|\"") - @test repr(DebugSolverState(st, Dict{Symbol, DebugAction}())) == repr(st) + @test DebugSolverState(st, a1).debug_dictionary[:Iteration] == a1 + @test DebugSolverState(st, [a1]).debug_dictionary[:Iteration].group[1] == a1 + @test DebugSolverState(st, Dict(:A => a1)).debug_dictionary[:A] == a1 + @test DebugSolverState(st, ["|"]).debug_dictionary[:Iteration].divider == a1.divider + @test endswith(Manopt.status_summary(dst), "A DebugAction printing the String “|” as a divider") + # Without any actual debug, do not print debug + @test !contains(Manopt.status_summary(dst_empty), "## Debug") + @test Manopt.status_summary(a1; context = :short) == "\"|\"" + @test Manopt.status_summary(a1; context = :default) == "A DebugAction printing the String “|” as a divider" + empty_dbg = Dict{Symbol, DebugAction}() + @test repr(DebugSolverState(st, empty_dbg)) == "DebugSolverState($(repr(st)), $(repr(empty_dbg)))" # Passthrough dss = DebugSolverState(st, a1) Manopt.set_parameter!(dss, :StoppingCriterion, :MaxIteration, 20) @@ -74,10 +84,9 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value @test String(take!(io)) == "" # Change of Iterate and recording a custom field a2 = DebugChange(; - storage = StoreStateAction(M; store_points = Tuple{:Iterate}, p_init = p), - prefix = "Last: ", - io = io, + storage = StoreStateAction(M; store_points = Tuple{:Iterate}, p_init = p), prefix = "Last: ", io = io, ) + @test startswith(Manopt.status_summary(a2), "A DebugAction to print the change of the iterate ") a2(mp, st, 0) # init st.p = [3.0, 2.0] a2(mp, st, 1) @@ -115,11 +124,13 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value DebugIteration(; io = io)(mp, st, 23) @test String(take!(io)) == "# 23 " @test repr(DebugIteration()) == "DebugIteration(; format=\"# %-6d\")" - 
@test Manopt.status_summary(DebugIteration()) == "(:Iteration, \"# %-6d\")" + @test Manopt.status_summary(DebugIteration(); context = :short) == "(:Iteration, \"# %-6d\")" + @test Manopt.status_summary(DebugIteration()) == "A DebugAction that prints the current iteration number in format “# %-6d”" # `DebugEntryChange` dec = DebugEntryChange(:p, x -> x) @test startswith(repr(dec), "DebugEntryChange(:p") - # DEbugEntryChange - reset + @test startswith(Manopt.status_summary(dec), "A DebugAction that prints the change of the entry") + # DebugEntryChange - reset st.p = p a3 = DebugEntryChange( :p, @@ -155,19 +166,21 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value @test String(take!(io)) == "At iteration 20 the algorithm reached its maximal number of iterations (20).\n" @test repr(DebugStoppingCriterion()) == "DebugStoppingCriterion()" - @test Manopt.status_summary(DebugStoppingCriterion()) == ":Stop" + @test Manopt.status_summary(DebugStoppingCriterion(); context = :short) == ":Stop" + @test Manopt.status_summary(DebugStoppingCriterion()) == "A DebugAction printing the reason why a solver has stopped." 
# Status for multiple dictionaries dss = DebugSolverState(st, DebugFactory([:Stop, 20, "|"])) @test contains(Manopt.status_summary(dss), ":Stop") @test Manopt.get_message(dss) == "" # DebugEvery summary de = DebugEvery(DebugGroup([DebugDivider("|"), DebugIteration()]), 10) - @test Manopt.status_summary(de) == "[\"|\", (:Iteration, \"# %-6d\"), 10]" + @test Manopt.status_summary(de; context = :short) == "[\"|\", (:Iteration, \"# %-6d\"), 10]" # DebugGradientChange dgc = DebugGradientChange() dgc_s = "DebugGradientChange(; format=\"Last Change: %f\", vector_transport_method=ParallelTransport())" @test repr(dgc) == dgc_s - @test Manopt.status_summary(dgc) == "(:GradientChange, \"Last Change: %f\")" + @test Manopt.status_summary(dgc; context = :short) == "(:GradientChange, \"Last Change: %f\")" + @test Manopt.status_summary(dgc) == "A DebugAction printing the change of the gradient with format “Last Change: %f”" # Faster storage dgc2 = DebugGradientChange(Euclidean(2)) @test repr(dgc2) == dgc_s @@ -179,30 +192,13 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value @test isa(df[:Iteration], DebugDivider) df = DebugFactory([:Stop, "|", 20]) @test isa(df[:Iteration], DebugEvery) - s = [ - :Change, - :GradientChange, - :Iteration, - :Iterate, - :Cost, - :Stepsize, - :p, - :Time, - :IterativeTime, - ] + s = [:Change, :GradientChange, :Iteration, :Iterate, :Cost, :Stepsize, :p, :Time, :IterativeTime] @test all( isa.( DebugFactory(s)[:Iteration].group, [ - DebugChange, - DebugGradientChange, - DebugIteration, - DebugIterate, - DebugCost, - DebugStepsize, - DebugEntry, - DebugTime, - DebugTime, + DebugChange, DebugGradientChange, DebugIteration, DebugIterate, DebugCost, + DebugStepsize, DebugEntry, DebugTime, DebugTime, ], ), ) @@ -211,15 +207,8 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value isa.( DebugFactory([(t, "A") for t in s])[:Iteration].group, [ - DebugChange, - DebugGradientChange, - DebugIteration, - 
DebugIterate, - DebugCost, - DebugStepsize, - DebugEntry, - DebugTime, - DebugTime, + DebugChange, DebugGradientChange, DebugIteration, DebugIterate, DebugCost, + DebugStepsize, DebugEntry, DebugTime, DebugTime, ], ), ) @@ -264,8 +253,9 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value mp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) w1 = DebugWarnIfCostNotFinite() - @test repr(w1) == "DebugWarnIfCostNotFinite()" - @test Manopt.status_summary(w1) == ":WarnCost" + @test repr(w1) == "DebugWarnIfCostNotFinite(:Once)" + @test Manopt.status_summary(w1; context = :short) == ":WarnCost" + @test Manopt.status_summary(w1) == "A DebugAction to issue a warning when the cost is no longer finite. It will only warn once." @test_logs (:warn,) (:warn,) w1(mp, st, 0) w2 = DebugWarnIfCostNotFinite(:Always) @test_logs (:warn,) w2(mp, st, 0) @@ -273,6 +263,7 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value st.X = grad_f(M, p) w3 = DebugWarnIfFieldNotFinite(:X) @test repr(w3) == "DebugWarnIfFieldNotFinite(:X, :Once)" + @test startswith(Manopt.status_summary(w3), "A DebugAction to warn if the field") @test_logs (:warn,) (:warn,) w3(mp, st, 0) w4 = DebugWarnIfFieldNotFinite(:X, :Always) @test_logs (:warn,) w4(mp, st, 1) @@ -283,6 +274,7 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value mp2 = DefaultManoptProblem(M2, ManifoldGradientObjective(f, grad_f)) w6 = DebugWarnIfGradientNormTooLarge(1.0, :Once) @test repr(w6) == "DebugWarnIfGradientNormTooLarge(1.0, :Once)" + @test startswith(Manopt.status_summary(w6), "A DebugAction warning if the gradient norm gets larger than") st.X .= [4.0, 0.0] # > π in norm @test_logs (:warn,) (:warn,) w6(mp2, st, 1) @@ -292,8 +284,12 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value w8 = DebugWarnIfStepsizeCollapsed(1.0, :Once) @test repr(w8) == "DebugWarnIfStepsizeCollapsed(1.0, :Once)" + @test startswith(Manopt.status_summary(w8), 
"A DebugAction warning if the step size collapses") @test_logs (:warn,) (:warn,) w8(mp2, st, 1) + w9 = DebugWarnIfCostIncreases() + @test startswith(repr(w9), "DebugWarnIfCostIncreases(") + df1 = DebugFactory([:WarnCost]) @test isa(df1[:Iteration], DebugWarnIfCostNotFinite) df2 = DebugFactory([:WarnGradient]) @@ -341,53 +337,68 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value drs = "DebugTime(; format=\"time spent: %s\", mode=:cumulative)" @test repr(DebugTime()) == drs drs2 = "(:IterativeTime, \"time spent: %s\")" - @test Manopt.status_summary(DebugTime(; mode = :iterative)) == drs2 + drs2h = "a DebugActin to print time per step iteratively" + @test Manopt.status_summary(DebugTime(; mode = :iterative); context = :short) == drs2 + @test Manopt.status_summary(DebugTime(; mode = :iterative)) == drs2h drs3 = "(:Time, \"time spent: %s\")" - @test Manopt.status_summary(DebugTime(; mode = :cumulative)) == drs3 + drs3h = "a DebugActin to print time per step cumulatively" + @test Manopt.status_summary(DebugTime(; mode = :cumulative); context = :short) == drs3 + @test Manopt.status_summary(DebugTime(; mode = :cumulative)) == drs3h end @testset "Debug show/summaries" begin d1 = DebugDivider("|") d2 = DebugIterate() d3 = DebugGroup([d1, d2]) @test repr(d3) == "DebugGroup([$(d1), $(d2)])" - ts = "[ $(Manopt.status_summary(d1)), $(Manopt.status_summary(d2)) ]" - @test Manopt.status_summary(d3) == ts - + ts = "[ $(Manopt.status_summary(d1; context = :short)), $(Manopt.status_summary(d2; context = :short)) ]" + @test Manopt.status_summary(d3; context = :short) == ts + tsi = "A DebugAction consisting of a group actions, $(Manopt.status_summary(d1; context = :inline)), and $(Manopt.status_summary(d2; context = :inline))" + @test Manopt.status_summary(d3; context = :inline) == tsi + tsd = "A DebugAction consisting of a group with the following elements\n* $(Manopt.status_summary(d1))\n* $(Manopt.status_summary(d2))" + @test Manopt.status_summary(d3) == 
tsd d4 = DebugEvery(d1, 4) @test repr(d4) == "DebugEvery($(d1), 4, true; activation_offset=1)" - @test Manopt.status_summary(d4) === "[$(d1), 4]" - + @test Manopt.status_summary(d4; context = :short) === "[$(Manopt.status_summary(d1; context = :short)), 4]" + de_d = "A DebugAction wrapping the following DebugAction to only print it every" + @test startswith(Manopt.status_summary(d4), de_d) ts2 = "DebugChange(; format=\"Last Change: %f\", inverse_retraction=LogarithmicInverseRetraction())" @test repr(DebugChange()) == ts2 - @test Manopt.status_summary(DebugChange()) == "(:Change, \"Last Change: %f\")" + @test Manopt.status_summary(DebugChange(); context = :short) == "(:Change, \"Last Change: %f\")" + @test startswith(Manopt.status_summary(DebugChange()), "A DebugAction to print the change of") # verify that a non-default manifold works as well - not sure how to test this then d = DebugChange(Euclidean(2)) @test repr(DebugCost()) == "DebugCost(; format=\"f(x): %f\", at_init=true)" - @test Manopt.status_summary(DebugCost()) == "(:Cost, \"f(x): %f\")" + @test Manopt.status_summary(DebugCost(); context = :short) == "(:Cost, \"f(x): %f\")" + @test Manopt.status_summary(DebugCost()) == "A DebugAction printing the current cost value" @test repr(DebugDivider("|")) == "DebugDivider(; divider=\"|\", at_init=true)" - @test Manopt.status_summary(DebugDivider("a")) == "\"a\"" + @test Manopt.status_summary(DebugDivider("a"); context = :short) == "\"a\"" + @test Manopt.status_summary(DebugDivider("a")) == "A DebugAction printing the String “a” as a divider" @test repr(DebugEntry(:a)) == "DebugEntry(:a; format=\"a: %s\", at_init=true)" + @test startswith(Manopt.status_summary(DebugEntry(:a)), "A DebugAction to print the field :a") @test repr(DebugStepsize()) == "DebugStepsize(; format=\"s:%s\", at_init=true)" - @test Manopt.status_summary(DebugStepsize()) == "(:Stepsize, \"s:%s\")" + @test Manopt.status_summary(DebugStepsize(); context = :short) == "(:Stepsize, \"s:%s\")" + @test 
startswith(Manopt.status_summary(DebugStepsize()), "A DebugAction that prints the current step size") @test repr(DebugGradientNorm()) == "DebugGradientNorm(; format=\"|grad f(p)|:%s\", at_init=true)" dgn_s = "(:GradientNorm, \"|grad f(p)|:%s\")" - @test Manopt.status_summary(DebugGradientNorm()) == dgn_s + @test Manopt.status_summary(DebugGradientNorm(); context = :short) == dgn_s + @test startswith(Manopt.status_summary(DebugGradientNorm(); context = :default), "A debug action to display the gradient norm") @test repr(DebugGradient()) == "DebugGradient(; format=\"grad f(p):%s\", at_init=false)" dg_s = "(:Gradient, \"grad f(p):%s\")" - @test Manopt.status_summary(DebugGradient()) == dg_s + @test Manopt.status_summary(DebugGradient(); context = :short) == dg_s end @testset "Debug Messages" begin s = TestMessageState() mp = DefaultManoptProblem(Euclidean(2), ManifoldCostObjective(x -> x)) d = DebugMessages(:Info, :Always) @test repr(d) == "DebugMessages(:Info, :Always)" - @test Manopt.status_summary(d) == "(:Messages, :Always)" + @test Manopt.status_summary(d; context = :short) == "(:InfoMessages, :Always)" + @test startswith(Manopt.status_summary(d), "A DebugAction printing messages collected during the last iteration") @test_logs (:info, "DebugTest") d(mp, s, 0) end @testset "DebugIfEntry" begin @@ -395,10 +406,8 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value M = ManifoldsBase.DefaultManifold(2) p = [-4.0, 2.0] st = GradientDescentState( - M; - p = p, - stopping_criterion = StopAfterIteration(20), - stepsize = Manopt.ConstantStepsize(M), + M; p = p, + stopping_criterion = StopAfterIteration(20), stepsize = Manopt.ConstantStepsize(M), ) f(M, y) = Inf grad_f(M, y) = Inf .* ones(2) @@ -406,6 +415,7 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value die1 = DebugIfEntry(:p, p -> p[1] > 0.0; type = :warn, message = "test1") @test startswith(repr(die1), "DebugIfEntry(:p, ") + @test 
startswith(Manopt.status_summary(die1), "A DebugAction printing the entry ") @test_logs (:warn, "test1") die1(mp, st, 1) die2 = DebugIfEntry(:p, p -> p[1] > 0.0; type = :info, message = "test2") @test_logs (:info, "test2") die2(mp, st, 1) @@ -435,7 +445,7 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value Manopt.set_parameter!(dA, :Activity, true) # activate @test dA.active @test repr(dA) == "DebugWhenActive($(repr(dD)), true, true)" - @test Manopt.status_summary(dA) == repr(dA) + @test contains(Manopt.status_summary(dA), "## Current activity\nactive") #issue active dA(mp, st, 1) @test endswith(String(take!(io)), " | ") @@ -496,7 +506,8 @@ Manopt.get_parameter(d::TestDebugParameterState, ::Val{:value}) = d.value end test_simple_callback() dbc = Manopt.DebugCallback(() -> nothing; simple = true) - @test startswith(repr(dbc), "DebugCallback containing") - @test startswith(Manopt.status_summary(dbc), "#") + @test startswith(repr(dbc), "DebugCallback(") + @test startswith(Manopt.status_summary(dbc; context = :short), "#") + @test startswith(Manopt.status_summary(dbc), "A DebugAction with a callback that calls #") end end diff --git a/test/plans/test_difference_of_convex_plan.jl b/test/plans/test_difference_of_convex_plan.jl index 103ce3560c..1f1d15ff48 100644 --- a/test/plans/test_difference_of_convex_plan.jl +++ b/test/plans/test_difference_of_convex_plan.jl @@ -10,6 +10,7 @@ using LRUCache, LinearAlgebra, Manifolds, Manopt, Test grad_g(M, p) = 4 * log(det(p))^3 * p grad_g!(M, X, p) = (X .= 4 * log(det(p))^3 * p) f(M, p) = g(M, p) - h(M, p) + grad_f(M, p) = grad_g(M, p) - grad_h(M, p) p = log(2) * Matrix{Float64}(I, n, n) G = grad_g(M, p) @@ -103,4 +104,22 @@ using LRUCache, LinearAlgebra, Manifolds, Manopt, Test @test get_count(ddo, :SubtrahendGradient) == 2 end end + @testset "show/repr and status_summary" begin + dc_obj_ga = ManifoldDifferenceOfConvexObjective(f, grad_h; gradient = grad_g) + s1 = repr(dc_obja) + @test startswith(s1, 
"ManifoldDifferenceOfConvexObjective(f, grad_h;") + s2 = repr(dc_obj_ga) + @test startswith(s2, "ManifoldDifferenceOfConvexObjective(f, grad_h;") + @test contains(s2, "gradient = grad_g") + s3 = Manopt.status_summary(dc_obja; context = :default) + @test contains(s3, "## Functions") + dcp_obj_ga = ManifoldDifferenceOfConvexProximalObjective(grad_h; gradient = grad_f, cost = f) + s4 = repr(dcp_obja) + @test startswith(s4, "ManifoldDifferenceOfConvexProximalObjective(grad_h;") + s5 = repr(dcp_obj_ga) + @test startswith(s5, "ManifoldDifferenceOfConvexProximalObjective(grad_h;") + @test contains(s5, "gradient = grad_f") + s6 = Manopt.status_summary(dcp_obja; context = :default) + @test contains(s6, "## Functions") + end end diff --git a/test/plans/test_embedded.jl b/test/plans/test_embedded.jl index d39d545a96..2d5792171a 100644 --- a/test/plans/test_embedded.jl +++ b/test/plans/test_embedded.jl @@ -39,9 +39,10 @@ using Manifolds, Manopt, Test, LinearAlgebra, Random end end # Without interim caches for p and X - @test repr(eo4) == - "EmbeddedManifoldObjective{Missing,Missing} of an $(repr(eo4.objective))" - + eo4repr = repr(eo4) + @test startswith(eo4repr, "EmbeddedManifoldObjective(ManifoldHessianObjective(") + @test endswith(eo4repr, "missing, missing)") + @test startswith(Manopt.status_summary(eo4), "An embedded objective\n\n") # Constraints, though this is not the most practical constraint o2 = ConstrainedManifoldObjective(f, ∇f, [f], [∇f], [f], [∇f]) eco1 = EmbeddedManifoldObjective(M, o2) diff --git a/test/plans/test_gradient_plan.jl b/test/plans/test_gradient_plan.jl index ba7e5c9e51..92062b6402 100644 --- a/test/plans/test_gradient_plan.jl +++ b/test/plans/test_gradient_plan.jl @@ -11,10 +11,8 @@ using ManifoldsBase, Manopt, Test p = [1.0, 2.0] X = [0.2, 0.3] gst = GradientDescentState( - M; - p = zero(p), - stopping_criterion = StopAfterIteration(20), - stepsize = Manopt.ConstantStepsize(M), + M; p = zero(p), + stopping_criterion = StopAfterIteration(20), 
stepsize = Manopt.ConstantStepsize(M), ) @test stopped_at(gst) == -1 set_iterate!(gst, M, q) @@ -39,6 +37,7 @@ using ManifoldsBase, Manopt, Test @test_throws MethodError get_proximal_map(mp, 1.0, gst.p, 1) @testset "Debug Gradient" begin a1 = DebugGradient(; long = false, io = io) + @test startswith(Manopt.status_summary(a1), "A DebugAction to print the gradient") a1(mp, gst, 1) @test String(take!(io)) == "grad f(p):[1.0, 0.0]" a1a = DebugGradient(; prefix = "s:", io = io) @@ -59,12 +58,16 @@ using ManifoldsBase, Manopt, Test end @testset "Record Gradient" begin b1 = RecordGradient(gst.X) + @test startswith(Manopt.status_summary(b1), "A RecordAction to record the current gradient") b1(mp, gst, 1) @test b1.recorded_values == [gst.X] b2 = RecordGradientNorm() + @test startswith(Manopt.status_summary(b2), "A RecordAction to record the current gradient norm") b2(mp, gst, 1) @test b2.recorded_values == [1.0] b3 = RecordStepsize() + @test startswith(Manopt.status_summary(b3), "A RecordAction to record the current stepsize") + @test startswith(repr(b3), "RecordStepsize(") b3(mp, gst, 1) b3(mp, gst, 2) b3(mp, gst, 3) @@ -98,7 +101,7 @@ using ManifoldsBase, Manopt, Test @test get_count(cmcgo, :Gradient) == 2 @test get_count(cmcgo, :Cost) == 3 end - @testset "Objective Decorator passthrough" begin + @testset "Objective Decorator pass through" begin ddo = Manopt.Test.DummyDecoratedObjective(mgo) @test get_cost(M, mgo, p) == get_cost(M, ddo, p) @test get_gradient(M, mgo, p) == get_gradient(M, ddo, p) @@ -123,7 +126,7 @@ using ManifoldsBase, Manopt, Test # the number represents the case, a/i alloc/inplace # Use old names here mfo1a = ManifoldCostGradientObjective(fg) - @test startswith(repr(mfo1a), "ManifoldFirstOrderObjective{AllocatingEvaluation, ") + @test startswith(repr(mfo1a), "ManifoldFirstOrderObjective(; costgradient = ") mfo1i = ManifoldCostGradientObjective(fg!; evaluation = InplaceEvaluation()) mfo2a = ManifoldGradientObjective(f, grad_f) mfo2i = 
ManifoldGradientObjective(f, grad_f!; evaluation = InplaceEvaluation()) @@ -213,4 +216,26 @@ using ManifoldsBase, Manopt, Test @test_throws ErrorException Manopt.get_cost_and_gradient!(M, Y, mfo_f, q) end end + @testset "DirectionUpdateRules" begin + M = ManifoldsBase.DefaultManifold(2) + idr = Gradient()(M) # Gradient factory yields IdentityUpdateRule + @test typeof(idr) == Manopt.IdentityUpdateRule + @test contains(Manopt.status_summary(idr), "evaluates the gradient") + + mgr = MomentumGradient()(M) # Actually produces a rule + @test startswith(repr(mgr), "MomentumGradientRule") + @test contains(Manopt.status_summary(mgr), "Momentum Gradient Rule") + + agr = AverageGradient()(M) # Actually produces a rule + @test startswith(repr(agr), "AverageGradientRule") + @test contains(Manopt.status_summary(agr), "Average Gradient Rule") + + nr = Nesterov()(M) # Actually produces a rule + @test startswith(repr(nr), "NesterovRule") + @test contains(Manopt.status_summary(nr), "Nesterov Rule") + + pr = PreconditionedDirection((M, Y, p, X) -> copyto!(M, Y, p, X); evaluation = InplaceEvaluation())(M) + @test startswith(repr(pr), "PreconditionedDirectionRule") + @test contains(Manopt.status_summary(pr), "Preconditioned Direction Rule") + end end diff --git a/test/plans/test_hessian_plan.jl b/test/plans/test_hessian_plan.jl index e7855cd489..ff5c42c81b 100644 --- a/test/plans/test_hessian_plan.jl +++ b/test/plans/test_hessian_plan.jl @@ -32,12 +32,15 @@ using LRUCache, Manifolds, Manopt, Test, Random @test get_hessian(mp, p, X) == 0.5 * X get_hessian!(mp, Y, p, X) @test Y == 0.5 * X - # precondition + # precondition - always the identity, since the preconditioner used in mho2 is the identity as well @test get_preconditioner(mp, p, X) == X get_preconditioner!(mp, Y, p, X) @test Y == X + # show / status summary + @test startswith(Manopt.status_summary(mho), "A second order objective providing a cost, a gradient") + @test contains(Manopt.status_summary(mho), "preconditioner") == (mho === mho2) end - 
@testset "Objective Decorator passthrough" begin + @testset "Objective Decorator pass through" begin Y1 = zero_vector(M, p) Y2 = zero_vector(M, p) for obj in [mho1, mho2, mho3, mho4] diff --git a/test/plans/test_higher_order_primal_dual_plan.jl b/test/plans/test_higher_order_primal_dual_plan.jl index b8b9af6d42..7a41406370 100644 --- a/test/plans/test_higher_order_primal_dual_plan.jl +++ b/test/plans/test_higher_order_primal_dual_plan.jl @@ -52,21 +52,18 @@ using ManifoldDiff: X = log(M, p0, m) Ξ = X + obj1 = PrimalDualManifoldSemismoothNewtonObjective( + f, prox_f, Dprox_F, prox_g_dual, Dprox_G_dual, DΛ, adjoint_DΛ + ) + obj2 = PrimalDualManifoldSemismoothNewtonObjective( + f, prox_f!, Dprox_F!, prox_g_dual!, Dprox_G_dual!, DΛ!, adjoint_DΛ!; + evaluation = InplaceEvaluation(), + ) + obj3 = PrimalDualManifoldSemismoothNewtonObjective( + f, prox_f, Dprox_F, prox_g_dual, Dprox_G_dual, DΛ, adjoint_DΛ, Λ = Λ + ) @testset "test Mutating/Allocation Problem Variants" begin - obj1 = PrimalDualManifoldSemismoothNewtonObjective( - f, prox_f, Dprox_F, prox_g_dual, Dprox_G_dual, DΛ, adjoint_DΛ - ) p1 = TwoManifoldProblem(M, N, obj1) - obj2 = PrimalDualManifoldSemismoothNewtonObjective( - f, - prox_f!, - Dprox_F!, - prox_g_dual!, - Dprox_G_dual!, - DΛ!, - adjoint_DΛ!; - evaluation = InplaceEvaluation(), - ) p2 = TwoManifoldProblem(M, N, obj2) x1 = get_differential_primal_prox(p1, 1.0, p0, X) x2 = get_differential_primal_prox(p2, 1.0, p0, X) @@ -81,5 +78,23 @@ using ManifoldDiff: get_differential_dual_prox!(p1, ξ1, n, 1.0, ξ0, Ξ) get_differential_dual_prox!(p2, ξ2, n, 1.0, ξ0, Ξ) @test ξ1 ≈ ξ2 atol = 2 * 1.0e-16 + + @test startswith(repr(p1), "TwoManifoldProblem") + @test startswith(Manopt.status_summary(p1), "An optimization problem for Manopt.jl requiring a primal and a dual manifold") + end + @testset "show/repr and status_summary" begin + s1 = repr(obj1) + @test startswith(s1, "PrimalDualManifoldSemismoothNewtonObjective(f,") + s2 = repr(obj2) + @test startswith(s2, 
"PrimalDualManifoldSemismoothNewtonObjective(f,") + @test contains(s2, "InplaceEvaluation") + s3 = repr(obj3) + @test startswith(s3, "PrimalDualManifoldSemismoothNewtonObjective(f,") + @test contains(s3, "Λ = ") + s4 = Manopt.status_summary(obj1) + @test startswith(s4, "A primal dual semismooth Newton objective") + s5 = Manopt.status_summary(obj3) + @test startswith(s5, "A primal dual semismooth Newton objective") + @test contains(s5, "Λ") end end diff --git a/test/plans/test_interior_point_newton_plan.jl b/test/plans/test_interior_point_newton_plan.jl index 12bbec9e51..3a5fc360a5 100644 --- a/test/plans/test_interior_point_newton_plan.jl +++ b/test/plans/test_interior_point_newton_plan.jl @@ -44,15 +44,9 @@ using ManifoldsBase, Manifolds, Manopt, Test, RecursiveArrayTools sub_p[sub_M, 1] = p sub_p[sub_M, 2] = λ coh = ConstrainedManifoldObjective( - f, - grad_f; - hess_f = hess_f, - g = g, - grad_g = grad_g, - hess_g = hess_g, - h = h, - grad_h = grad_h, - hess_h = hess_h, + f, grad_f; hess_f = hess_f, + g = g, grad_g = grad_g, hess_g = hess_g, + h = h, grad_h = grad_h, hess_h = hess_h, M = M, ) sub_obj = SymmetricLinearSystemObjective( @@ -70,13 +64,17 @@ using ManifoldsBase, Manifolds, Manopt, Test, RecursiveArrayTools @test set_gradient!(ipns, M, 3 * p) == ipns @test get_gradient(ipns) == 3 * p show_str = "# Solver state for `Manopt.jl`s Interior Point Newton Method\n" - @test startswith(repr(ipns), show_str) - # + @test startswith(Manopt.status_summary(ipns; context = :default), show_str) + @test startswith(repr(ipns), "InteriorPointNewtonState(") + # status summary of KKT Vectorfields + @test startswith(Manopt.status_summary(sub_obj.A!!), "The Jacobian of the condensed KKT vector field for the constrained objective") + @test startswith(Manopt.status_summary(sub_obj.b!!), "The condensed KKT vector field for the constrained objective") sc = StopWhenKKTResidualLess(1.0e-5) @test length(get_reason(sc)) == 0 @test !sc(dmp, ipns, 1) #not yet reached @test 
Manopt.indicates_convergence(sc) - @test startswith(repr(sc), "StopWhenKKTResidualLess(1.0e-5)\n") + @test startswith(repr(sc), "StopWhenKKTResidualLess(1.0e-5)") + @test startswith(Manopt.status_summary(sc), "Stop when the KKT residual is less than ") # Fake stop sc.residual = 1.0e-7 sc.at_iteration = 1 diff --git a/test/plans/test_mesh_adaptive_plan.jl b/test/plans/test_mesh_adaptive_plan.jl index af9832bb0a..335e791366 100644 --- a/test/plans/test_mesh_adaptive_plan.jl +++ b/test/plans/test_mesh_adaptive_plan.jl @@ -14,7 +14,8 @@ using ManifoldsBase, Manifolds, Manopt, Test, Random p2 = [2.0, 0.0, 0.0] @test Manopt.update_basepoint!(M, ltap, p2) === ltap @test Manopt.get_basepoint(ltap) == p2 - @test startswith(repr(ltap), "LowerTriangularAdaptivePoll\n") + @test startswith(repr(ltap), "LowerTriangularAdaptivePoll(; ") + @test startswith(Manopt.status_summary(ltap), "A Lower triangular adaptive poll\n\n") # test call Random.seed!(42) ltap(cmp, mesh_size) @@ -34,7 +35,8 @@ using ManifoldsBase, Manifolds, Manopt, Test, Random dmads = DefaultMeshAdaptiveDirectSearch(M, p) @test !Manopt.is_successful(dmads) @test Manopt.get_candidate(dmads) == dmads.p - @test startswith(repr(dmads), "DefaultMeshAdaptiveDirectSearch\n") + @test startswith(repr(dmads), "DefaultMeshAdaptiveDirectSearch(; ") + @test startswith(Manopt.status_summary(dmads), "The default mesh adaptive direct search\nalong one given direction X.\n\n") X = -ones(3) # This step would bring us to zero, but we only allow a max step 1.0 dmads(cmp, 1.0, p, X; max_stepsize = 1.0) @@ -46,7 +48,8 @@ using ManifoldsBase, Manifolds, Manopt, Test, Random p = ones(3) mads = MeshAdaptiveDirectSearchState(M, p) @test startswith( - repr(mads), "# Solver state for `Manopt.jl`s mesh adaptive direct search\n" + Manopt.status_summary(mads; context = :default), + "# Solver state for `Manopt.jl`s mesh adaptive direct search\n" ) @test get_iterate(mads) == p @test get_solver_result(mads) == p diff --git 
a/test/plans/test_nelder_mead_plan.jl b/test/plans/test_nelder_mead_plan.jl index 3cd2812e11..08ace06124 100644 --- a/test/plans/test_nelder_mead_plan.jl +++ b/test/plans/test_nelder_mead_plan.jl @@ -4,6 +4,8 @@ using Manifolds, Manopt, Test @testset "Nelder Mead State" begin M = Euclidean(2) o = NelderMeadState(M) + @test startswith(Manopt.status_summary(o), "# Solver state for `Manopt.jl`s Nelder Mead Algorithm") + @test startswith(repr(o), "NelderMeadState(; ") o2 = NelderMeadState(M; population = o.population) @test o.p == o2.p @test o.population == o2.population @@ -17,6 +19,8 @@ using Manifolds, Manopt, Test obj = ManifoldCostObjective(f) mp = DefaultManoptProblem(M, obj) s = StopWhenPopulationConcentrated(0.1, 0.1) + @test startswith(Manopt.status_summary(s), "Stop when the population of a swarm is") + @test startswith(repr(s), "StopWhenPopulationConcentrated(") # tweak an iteration o.costs = [0.0, 0.1] @test !s(mp, o, 1) diff --git a/test/plans/test_nonlinear_least_squares_plan.jl b/test/plans/test_nonlinear_least_squares_plan.jl index 6dd62453fd..7c4610a5c1 100644 --- a/test/plans/test_nonlinear_least_squares_plan.jl +++ b/test/plans/test_nonlinear_least_squares_plan.jl @@ -24,26 +24,26 @@ using Manifolds, Manopt, Test # Smoothing types # Test all (new) possible combinations of vectorial cost and Jacobian - # (1) [F] Function (Gradient), [C] Component (Gradients), [J] Coordinate (Jacobian in Basis) + # (1) Function (F, Gradient), Component (C, Gradients), [J] Coordinate (Jacobian in Basis) # (2) [a] allocating [i] in place - nlsoFa = NonlinearLeastSquaresObjective( + nlsoFa = ManifoldNonlinearLeastSquaresObjective( f, JF, 2; jacobian_type = FunctionVectorialType() ) - nlsoFi = NonlinearLeastSquaresObjective( + nlsoFi = ManifoldNonlinearLeastSquaresObjective( f!, JF!, 2; evaluation = InplaceEvaluation(), jacobian_type = FunctionVectorialType(), ) - nlsoCa = NonlinearLeastSquaresObjective( + nlsoCa = ManifoldNonlinearLeastSquaresObjective( [f1, f2], [j1, 
j2], 2; function_type = ComponentVectorialType(), jacobian_type = ComponentVectorialType(), ) - nlsoCi = NonlinearLeastSquaresObjective( + nlsoCi = ManifoldNonlinearLeastSquaresObjective( [f1, f2], [j1!, j2!], 2; @@ -51,10 +51,10 @@ using Manifolds, Manopt, Test jacobian_type = ComponentVectorialType(), evaluation = InplaceEvaluation(), ) - nlsoJa = NonlinearLeastSquaresObjective( + nlsoJa = ManifoldNonlinearLeastSquaresObjective( f, J, 2; jacobian_type = CoordinateVectorialType() ) - nlsoJi = NonlinearLeastSquaresObjective(f!, J!, 2; evaluation = InplaceEvaluation()) + nlsoJi = ManifoldNonlinearLeastSquaresObjective(f!, J!, 2; evaluation = InplaceEvaluation()) p = [0.5, 0.5] V = [0.0, 0.0] @@ -77,6 +77,8 @@ using Manifolds, Manopt, Test # jacobian of the objective G2 = get_jacobian(M, nlso.objective, p) @test G2 == Gt + @test startswith(repr(nlso), "ManifoldNonlinearLeastSquaresObjective(") + @test startswith(Manopt.status_summary(nlso), "A nonlinear least squares objective") end end @testset "Test Change of basis" begin @@ -90,4 +92,10 @@ using Manifolds, Manopt, Test # In practice both are the same basis in coordinates, so Jtt stays as iss @test J == Jt end + @testset "show/repr and status_summary" begin + M = Euclidean(3) + f(M, p) = p + J_f(M, p) = one(p) + mnlso = ManifoldNonlinearLeastSquaresObjective(f, J_f, 3) + end end diff --git a/test/plans/test_objective.jl b/test/plans/test_objective.jl index ef1671b48a..8da8cd91cb 100644 --- a/test/plans/test_objective.jl +++ b/test/plans/test_objective.jl @@ -7,19 +7,28 @@ using ManifoldsBase, Manopt, Test @test (get_objective(d) isa ManifoldCostObjective) @test Manopt.is_objective_decorator(d) @test !Manopt.is_objective_decorator(o) + io = IOBuffer() + show(io, MIME"text/plain"(), o) + @test startswith(String(take!(io)), "A cost function on a Riemannian manifold") + d = Manopt.Test.DummyEmptyDecoratedObjective(o) + # Check both default pass throughs + Manopt.status_summary(io, d) + @test 
startswith(String(take!(io)), "A cost function on a Riemannian manifold") + @test startswith(Manopt.status_summary(d), "A cost function on a Riemannian manifold") end - @testset "ReturnObjective" begin + @testset "ReturnManifoldObjective" begin o = ManifoldCostObjective(x -> x) r = Manopt.ReturnManifoldObjective(o) - @test repr(o) == "ManifoldCostObjective{AllocatingEvaluation}" - @test repr(r) == "ManifoldCostObjective{AllocatingEvaluation}" - @test Manopt.status_summary(o) == "" # both simplified to empty - @test Manopt.status_summary(r) == "" - @test repr((o, 1.0)) == - "To access the solver result, call `get_solver_result` on this variable." + @test repr(o) == "ManifoldCostObjective(f)" + @test repr(r) == "ReturnManifoldObjective(ManifoldCostObjective(f))" + @test Manopt.status_summary(o) == "A cost function on a Riemannian manifold `f = (M,p) -> ℝ`." + @test Manopt.status_summary(r) == "A cost function on a Riemannian manifold `f = (M,p) -> ℝ`." d = Manopt.Test.DummyDecoratedObjective(o) r2 = Manopt.ReturnManifoldObjective(d) - @test repr(r) == "ManifoldCostObjective{AllocatingEvaluation}" + # Still acts transparently for one of them + @test Manopt.status_summary(r2) == "A dummy decorator for A cost function on a Riemannian manifold `f = (M,p) -> ℝ`." + # repr contains all decorators and is therefore much longer + @test repr(r2) == "ReturnManifoldObjective(DummyDecoratedObjective($(repr(o))))" end @testset "set_parameter!" 
begin o = ManifoldCostObjective(x -> x) @@ -35,9 +44,7 @@ using ManifoldsBase, Manopt, Test @test Manopt.get_gradient_function(oa)(M, p) == p @test Manopt.get_hessian_function(oa)(M, p, X) == X oi = ManifoldHessianObjective( - (M, p) -> p[1], - (M, X, p) -> (X .= p), - (M, Y, p, X) -> (Y .= X); + (M, p) -> p[1], (M, X, p) -> (X .= p), (M, Y, p, X) -> (Y .= X); evaluation = InplaceEvaluation(), ) @test Manopt.get_cost_function(oi)(M, p) == p[1] @@ -46,5 +53,7 @@ using ManifoldsBase, Manopt, Test @test Y == p @test Manopt.get_hessian_function(oi)(M, Y, p, X) == X @test Y == X + @test Manopt._to_kw(Manopt.ParentEvaluationType) == "evaluation = ParentEvaluationType()" + @test Manopt._to_kw(Manopt.AllocatingInplaceEvaluation) == "evaluation = AllocatingInplaceEvaluation()" end end diff --git a/test/plans/test_primal_dual_plan.jl b/test/plans/test_primal_dual_plan.jl index 6e53fd33b7..f1177c7d34 100644 --- a/test/plans/test_primal_dual_plan.jl +++ b/test/plans/test_primal_dual_plan.jl @@ -82,18 +82,18 @@ using RecursiveArrayTools @test all(get_iterate(s_exact) .== p0) osm = PrimalDualSemismoothNewtonState( - M; - m = m, - n = n, - p = zero.(p0), - X = X0, - primal_stepsize = 0.0, - dual_stepsize = 0.0, - regularization_parameter = 0.0, + M; m = m, n = n, p = zero.(p0), X = X0, + primal_stepsize = 0.0, dual_stepsize = 0.0, regularization_parameter = 0.0, ) set_iterate!(osm, p0) @test all(get_iterate(osm) .== p0) + @testset "show/repr" begin + for o in [pdmol, pdmoe] + @test startswith(Manopt.status_summary(o), "A primal dual objective") + @test startswith(repr(o), "PrimalDualManifoldObjective(") + end + end @testset "test Mutating/Allocation Problem Variants" begin pdmoa = PrimalDualManifoldObjective( f, prox_f, prox_g_dual, adjoint_DΛ; linearized_forward_operator = DΛ, Λ = Λ @@ -190,16 +190,22 @@ using RecursiveArrayTools d1(p_exact, s_exact, 1) s = String(take!(io)) @test startswith(s, "Dual Residual:") + @test startswith(Manopt.status_summary(d1), "A DebugAction to 
print the dual residual with format") + @test startswith(repr(d1), "DebugDualResidual(; ") d2 = DebugPrimalResidual(; storage = a, io = io) d2(p_exact, s_exact, 1) s = String(take!(io)) @test startswith(s, "Primal Residual: ") + @test startswith(Manopt.status_summary(d2), "A DebugAction to print the primal residual with format") + @test startswith(repr(d2), "DebugPrimalResidual(; ") d3 = DebugPrimalDualResidual(; storage = a, io = io) d3(p_exact, s_exact, 1) s = String(take!(io)) @test startswith(s, "PD Residual: ") + @test startswith(Manopt.status_summary(d3), "A DebugAction to print the primal dual residual with format") + @test startswith(repr(d3), "DebugPrimalDualResidual(; ") d4 = DebugPrimalChange(; storage = a, prefix = "Primal Change: ", io = io) d4(p_exact, s_exact, 1) @@ -220,6 +226,8 @@ using RecursiveArrayTools d7(p_exact, s_exact, 1) s = String(take!(io)) @test startswith(s, "Dual Change:") + @test startswith(repr(d7), "DebugDualChange(; ") + @test startswith(Manopt.status_summary(d7), "A DebugAction to print the change of the dual variable") d7a = DebugDualChange((X0, n); storage = a, io = io) d7a(p_exact, s_exact, 1) diff --git a/test/plans/test_problem.jl b/test/plans/test_problem.jl index a673a460ec..385b12db23 100644 --- a/test/plans/test_problem.jl +++ b/test/plans/test_problem.jl @@ -13,6 +13,10 @@ using Manopt, Manifolds, Test moi = ManifoldGradientObjective(f, grad_f!; evaluation = InplaceEvaluation()) cpi = DefaultManoptProblem(M, moi) @test Manopt.evaluation_type(cpi) === InplaceEvaluation + + io = IOBuffer() + show(io, MIME"text/plain"(), cpa) + @test startswith(String(take!(io)), "An optimization problem for Manopt.jl") end @testset "set_parameter functions" begin f(M, p) = 1 # dummy cost diff --git a/test/plans/test_record.jl b/test/plans/test_record.jl index 2b97c46d57..69a0ae7ecd 100644 --- a/test/plans/test_record.jl +++ b/test/plans/test_record.jl @@ -14,17 +14,16 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = 
d.value M = ManifoldsBase.DefaultManifold(2) p = [4.0, 2.0] gds = GradientDescentState( - M; - p = copy(p), - stopping_criterion = StopAfterIteration(10), - stepsize = Manopt.ConstantStepsize(M), + M; p = copy(p), + stopping_criterion = StopAfterIteration(10), stepsize = Manopt.ConstantStepsize(M), ) f(M, q) = distance(M, q, p) .^ 2 grad_f(M, q) = -2 * log(M, q, p) dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) a = RecordIteration() @test repr(a) == "RecordIteration()" - @test Manopt.status_summary(a) == ":Iteration" + @test Manopt.status_summary(a; context = :short) == ":Iteration" + @test Manopt.status_summary(a) == "A RecordAction to record the current iteration number" # constructors rs = RecordSolverState(gds, a) Manopt.set_parameter!(rs, :Record, RecordCost()) @@ -35,6 +34,9 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value Manopt.set_parameter!(rs, :StoppingCriterion, :MaxIteration, 20) @test rs.state.stop.max_iterations == 20 #Maybe turn into a getter? 
# + rs_empty = RecordSolverState(gds, []) + @test contains(Manopt.status_summary(rs_empty), "No recordings registered.") + # @test get_initial_stepsize(dmp, rs) == 1.0 @test get_stepsize(dmp, rs, 1) == 1.0 @test get_last_stepsize(dmp, rs, 1) == 1.0 @@ -85,7 +87,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @test_throws ErrorException RecordGroup(RecordAction[], Dict(:a => 1)) @test_throws ErrorException RecordGroup(RecordAction[], Dict(:a => 0)) b = RecordGroup([RecordIteration(), RecordIteration()], Dict(:It1 => 1, :It2 => 2)) - @test Manopt.status_summary(b) == "[ :Iteration, :Iteration ]" + @test Manopt.status_summary(b; context = :short) == "[:Iteration, :Iteration]" + @test startswith(Manopt.status_summary(b), "A group of 2 RecordActions:\n") @test repr(b) == "RecordGroup([RecordIteration(), RecordIteration()])" b(dmp, gds, 1) b(dmp, gds, 2) @@ -102,7 +105,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @testset "RecordEvery" begin c = RecordEvery(a, 10, true) @test repr(c) == "RecordEvery(RecordIteration(), 10, true)" - @test Manopt.status_summary(c) == "[RecordIteration(), 10]" + @test Manopt.status_summary(c; context = :short) == "[:Iteration, 10]" + @test startswith(Manopt.status_summary(c), "A RecordAction that records every 10th iteration with\n") c(dmp, gds, 0) @test length(get_record(c)) === 0 c(dmp, gds, 1) @@ -118,7 +122,7 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value 10, ) @test repr(c2) == "RecordEvery($(repr(c2.record)), 10, true)" - @test Manopt.status_summary(c2) == "[:Iteration, :Iteration, 10]" + @test Manopt.status_summary(c2; context = :short) == "[:Iteration, :Iteration, 10]" c2(dmp, gds, 5) c2(dmp, gds, 10) c2(dmp, gds, 20) @@ -129,7 +133,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value d = RecordChange() sd = "RecordChange(; inverse_retraction_method=LogarithmicInverseRetraction())" @test repr(d) == sd - @test 
Manopt.status_summary(d) == ":Change" + @test Manopt.status_summary(d; context = :short) == ":Change" + @test startswith(Manopt.status_summary(d), "A RecordAction to record the change of the iterate") d(dmp, gds, 1) @test d.recorded_values == [0.0] # no p0 -> assume p is the first iterate set_iterate!(gds, M, p + [1.0, 0.0]) @@ -149,6 +154,7 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value set_iterate!(gds, M, p) f = RecordEntry(p, :p) @test repr(f) == "RecordEntry(:p)" + @test Manopt.status_summary(f) == "A RecordAction to record the solver state field :p" f(dmp, gds, 1) @test f.recorded_values == [p] f2 = RecordEntry(typeof(p), :p) @@ -159,6 +165,7 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value set_iterate!(gds, M, p) e = RecordEntryChange(:p, (p, o, x, y) -> distance(get_manifold(p), x, y)) @test startswith(repr(e), "RecordEntryChange(:p") + @test startswith(Manopt.status_summary(e), "A RecordAction to record the solver state field's :p change") @test update_storage!(e.storage, dmp, gds) == [:p] e(dmp, gds, 1) @test e.recorded_values == [0.0] @@ -169,7 +176,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @testset "RecordIterate" begin set_iterate!(gds, M, p) f = RecordIterate(p) - @test Manopt.status_summary(f) == ":Iterate" + @test Manopt.status_summary(f; context = :short) == ":Iterate" + @test Manopt.status_summary(f) == "A RecordAction to record the current iterate" @test repr(f) == "RecordIterate(Vector{Float64})" @test_throws ErrorException RecordIterate() f(dmp, gds, 1) @@ -178,7 +186,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @testset "RecordCost" begin g = RecordCost() @test repr(g) == "RecordCost()" - @test Manopt.status_summary(g) == ":Cost" + @test Manopt.status_summary(g; context = :short) == ":Cost" + @test Manopt.status_summary(g) == "A RecordAction to record the cost value" g(dmp, gds, 1) @test g.recorded_values == 
[0.0] gds.p = [3.0, 2.0] @@ -188,7 +197,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @testset "RecordStoppingReason" begin g = RecordStoppingReason() @test repr(g) == "RecordStoppingReason()" - @test Manopt.status_summary(g) == ":Stop" + @test Manopt.status_summary(g; context = :short) == ":Stop" + @test startswith(Manopt.status_summary(g), "A RecordAction to record the stopping reason") @test length(get_record(g)) == 0 stop_solver!(dmp, gds, 21) # trigger stop g(dmp, gds, 21) # record @@ -198,7 +208,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @testset "RecordSubsolver" begin rss = RecordSubsolver() @test repr(rss) == "RecordSubsolver(; record=[:Iteration], record_type=Any)" - @test Manopt.status_summary(rss) == ":Subsolver" + @test Manopt.status_summary(rss; context = :short) == ":Subsolver" + @test startswith(Manopt.status_summary(rss), "A RecordAction to record elements in from each subsolver") epms = ExactPenaltyMethodState(M, dmp, rs) rss(dmp, epms, 1) end @@ -206,7 +217,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value i = RecordIteration() rwa = RecordWhenActive(i) @test repr(rwa) == "RecordWhenActive(RecordIteration(), true, true)" - @test Manopt.status_summary(rwa) == repr(rwa) + @test Manopt.status_summary(rwa; context = :short) == repr(rwa) + @test startswith(Manopt.status_summary(rwa), "Record the following only, when active") rwa(dmp, gds, 1) @test length(get_record(rwa)) == 1 rwa(dmp, gds, -1) # Reset @@ -283,7 +295,8 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @testset "RecordTime" begin h1 = RecordTime(; mode = :cumulative) @test repr(h1) == "RecordTime(; mode=:cumulative)" - @test Manopt.status_summary(h1) == ":Time" + @test Manopt.status_summary(h1, context = :short) == ":Time" + @test startswith(Manopt.status_summary(h1), "A RecordAction for recording times") t = h1.start @test t isa Nanosecond h1(dmp, gds, 1) @@ 
-304,7 +317,7 @@ Manopt.get_parameter(d::TestRecordParameterState, ::Val{:value}) = d.value @test length(h3.recorded_values) == 1 @test repr(RecordGradientNorm()) == "RecordGradientNorm()" # since only the type is stored can test - @test repr(RecordGradient(zeros(3))) == "RecordGradient{Vector{Float64}}()" + @test repr(RecordGradient(zeros(3))) == "RecordGradient(Vector{Float64})" end @testset "Record and parameter passthrough" begin s = TestRecordParameterState(0) diff --git a/test/plans/test_scaled_objective.jl b/test/plans/test_scaled_objective.jl index 15972ac363..5f238ae417 100644 --- a/test/plans/test_scaled_objective.jl +++ b/test/plans/test_scaled_objective.jl @@ -16,14 +16,16 @@ using LinearAlgebra, Manifolds, Manopt, Test, Random obj! = ManifoldHessianObjective(f, ∇f!, ∇²f!; evaluation = InplaceEvaluation()) neg_obj = -obj @test neg_obj isa ScaledManifoldObjective - s = "ScaledManifoldObjective based on a $(obj) with scale -1" + s = repr(neg_obj) + @test startswith(s, "ScaledManifoldObjective(ManifoldHessianObjective(f, ∇f") + @test endswith(s, "-1)") @test repr(neg_obj) == s scaled_obj = -1 * obj @test scaled_obj == neg_obj scaled_obj! = -1.0 * obj! # just verify that this also works for double decorated ones. 
deco_obj = ScaledManifoldObjective(ManifoldCountObjective(M, obj, [:Cost]), 0.5) - + @test startswith(Manopt.status_summary(scaled_obj), "A scaled version of the objective") # # Test and compare all accessors # diff --git a/test/plans/test_state.jl b/test/plans/test_state.jl index 8b2f2c2ebf..f7384317d3 100644 --- a/test/plans/test_state.jl +++ b/test/plans/test_state.jl @@ -9,8 +9,12 @@ struct NoIterateState <: AbstractManoptSolverState end pr = Manopt.Test.DummyProblem{typeof(M)}() s = Manopt.Test.DummyState() @test repr(Manopt.ReturnSolverState(s)) == "ReturnSolverState($s)" - @test Manopt.status_summary(Manopt.ReturnSolverState(s)) == - "Manopt.Test.DummyState(Float64[])" + srst = "A Manopt Test state with storage Float64[]" + @test Manopt.status_summary(Manopt.ReturnSolverState(s)) == srst + io = IOBuffer() + show(io, MIME"text/plain"(), Manopt.ReturnSolverState(s)) + @test startswith(String(take!(io)), srst) + a = ArmijoLinesearch(; initial_stepsize = 1.0)(M) @test get_last_stepsize(a) == 1.0 @test get_initial_stepsize(a) == 1.0 @@ -104,7 +108,7 @@ struct NoIterateState <: AbstractManoptSolverState end ddo = Manopt.Test.DummyDecoratedObjective(o) s = Manopt.Test.DummyState() rs = Manopt.ReturnSolverState(s) - @test Manopt.get_solver_return(o, rs) == s #no ReturnObjective + @test Manopt.get_solver_return(o, rs) == s #no ReturnManifoldObjective # Return O & S (a, b) = Manopt.get_solver_return(ro, rs) @test a == o diff --git a/test/plans/test_stepsize.jl b/test/plans/test_stepsize.jl index c12026628b..60c0f3ce5c 100644 --- a/test/plans/test_stepsize.jl +++ b/test/plans/test_stepsize.jl @@ -70,10 +70,15 @@ end @test Manopt.get_message(Manopt.ConstantStepsize(M, 1.0)) == "" s = Manopt.ArmijoLinesearchStepsize(Euclidean()) @test startswith(repr(s), "ArmijoLinesearch(;") - s_stat = Manopt.status_summary(s) + s_stat = Manopt.status_summary(s; context = :short) @test startswith(s_stat, "ArmijoLinesearch(;") - @test endswith(s_stat, "of 1.0") + s_stat2 = 
Manopt.status_summary(s) + @test startswith(s_stat2, "Armijo backtracking line search") @test Manopt.get_message(s) == "" + io = IOBuffer() + show(io, MIME"text/plain"(), s) + s_stat3 = String(take!(io)) + @test s_stat2 == s_stat3 s2 = NonmonotoneLinesearch()(M) @test startswith(repr(s2), "NonmonotoneLinesearch(;") @@ -81,13 +86,13 @@ end s3 = WolfePowellBinaryLinesearch()(M) @test Manopt.get_message(s3) == "" - @test startswith(repr(s3), "WolfePowellBinaryLinesearch(;") + @test startswith(repr(s3), "WolfePowellBinaryLinesearchStepsize(;") + @test get_last_stepsize(s3) == 0.0 + @test startswith(Manopt.status_summary(s3), "A Wolfe Powell bisection line search") # no stepsize yet so `repr` and summary are the same - @test repr(s3) == Manopt.status_summary(s3) s4 = WolfePowellLinesearch()(M) - @test startswith(repr(s4), "WolfePowellLinesearch(;") - # no stepsize yet so `repr` and summary are the same - @test repr(s4) == Manopt.status_summary(s4) + @test startswith(repr(s4), "WolfePowellLinesearchStepsize(;") + @test startswith(Manopt.status_summary(s4), "A Wolfe Powell line search") @test Manopt.get_message(s4) == "" @testset "Armijo setter / getters" begin # Check that the passdowns work, though; since the defaults are functions, they return nothing @@ -141,6 +146,8 @@ end mgo = ManifoldGradientObjective(f, grad_f) mp = DefaultManoptProblem(M, mgo) s = AdaptiveWNGradient(; gradient_reduction = 0.5, count_threshold = 2)(M) + @test startswith(Manopt.status_summary(s), "An adaptive Gradient WN step size") + @test startswith(repr(s), "AdaptiveWNGradientStepsize(; ") gds = GradientDescentState(M; p = p) @test get_initial_stepsize(s) == 1.0 @test get_last_stepsize(s) == 1.0 @@ -155,7 +162,7 @@ end @test s(mp, gds, 3) ≈ 3.1209362808842656 @test s.count == 0 # was reset @test s.weight == 0.75 # also reset to orig - @test startswith(repr(s), "AdaptiveWNGradient(;\n ") + @test startswith(repr(s), "AdaptiveWNGradientStepsize(;") end @testset "Absolute stepsizes" begin M = 
ManifoldsBase.DefaultManifold(2) @@ -170,6 +177,7 @@ end abs_dec_step = Manopt.DecreasingStepsize( M; length = 10.0, factor = 1.0, subtrahend = 0.0, exponent = 1.0, type = :absolute ) + @test startswith(repr(abs_dec_step), "DecreasingStepsize(; ") solve!(mp, gds) @test abs_dec_step(mp, gds, 1) == 10.0 / norm(get_manifold(mp), get_iterate(gds), get_gradient(gds)) @@ -186,9 +194,9 @@ end X = grad_f(M, p) sgs = SubGradientMethodState(M; p = p) ps = Polyak()() - @test repr(ps) == - "Polyak()\nA stepsize with keyword parameters\n * initial_cost_estimate = 0.0\n" + @test startswith(repr(ps), "Polyak(; γ = ") @test ps(dmp, sgs, 1) == (f(M, p) - 0 + 1) / (norm(M, p, X)^2) + @test startswith(Manopt.status_summary(ps), "Polyak step size with γ = ") end @testset "CubicBracketing Stepsize" begin M = Euclidean(2) @@ -255,6 +263,534 @@ end clbs = CubicBracketingLinesearch(; sufficient_curvature = 1.0e-16, min_bracket_width = 0.0, initial_stepsize = 0.5)(M) @test clbs(dmp, gs, 1) ≈ 1 / 6 atol = 5.0e-4 end + @testset "secant numerical stability" begin + # Large offset, small interval + a = 1.0e7 + b = a + 1.0e-6 + + # Choose derivatives that differ slightly + ga = 1.0 + gb = nextfloat(ga) # smallest representable difference + + # minimizer using affine formula + x_ref = a - ga * (b - a) / (gb - ga) + + err_secant = abs( + Manopt.secant( + Manopt.UnivariateTriple(a, 0.0, ga), + Manopt.UnivariateTriple(b, 0.0, gb) + ) - x_ref + ) + @test err_secant < 1.0e-6 + end + @testset "HagerZhang Linesearch Stepsize" begin + M = Euclidean(2) + f_sum_sq(M, p) = sum(p .^ 2) + grad_f_sum_sq(M, p) = 2 .* p + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_sum_sq, grad_f_sum_sq)) + p = [1.0, 2.0] + η = -grad_f_sum_sq(M, p) + gs = GradientDescentState(M; p = p) + + hzls = HagerZhangLinesearch()(M) + @test startswith(repr(hzls), "HagerZhangLinesearch(;") + @test startswith(Manopt.status_summary(hzls), "HagerZhangLinesearch(;") + @test Manopt.get_message(hzls) == "" + + α = hzls(dmp, gs, 1, 
η) + @test isfinite(α) + @test α > 0 + α2 = hzls(dmp, gs, 1, η; gradient = grad_f_sum_sq(M, p)) + @test α2 ≈ α + @test hzls.last_stepsize == α + @test hzls.last_cost <= f_sum_sq(M, p) + 1.0e-12 + + hzls_limit = Manopt.HagerZhangLinesearchStepsize(M; stepsize_limit = 0.05) + α_limit = hzls_limit(dmp, gs, 1, η) + @test α_limit <= 0.05 + eps(0.05) + @test hzls_limit.last_stepsize == α_limit + α_limit_kw = hzls_limit(dmp, gs, 2, η; stop_when_stepsize_exceeds = 0.01) + @test α_limit_kw <= 0.01 + eps(0.01) + @testset "Running out of evaluations in _hz_evaluate_next_step" begin + N = length(hzls_limit.triples) - hzls_limit.last_evaluation_index + for i in 1:N + Manopt._hz_evaluate_next_step(hzls_limit, M, dmp, p, η, 0.1) + end + @test_throws ErrorException Manopt._hz_evaluate_next_step(hzls_limit, M, dmp, p, η, 0.1) + end + @testset "Wolfe condition modes" begin + hzls_default = Manopt.HagerZhangLinesearchStepsize(M) + hzls.current_mode = :invalid_mode + @test_throws ErrorException hzls(dmp, gs, 1, η) + end + + + hzls_approx = Manopt.HagerZhangLinesearchStepsize( + M; wolfe_condition_mode = :approximate, stepsize_limit = 0.2 + ) + α_approx = hzls_approx(dmp, gs, 1, η) + @test α_approx > 0 + + @testset "termination modes" begin + hzls_std = Manopt.HagerZhangLinesearchStepsize( + M; + wolfe_condition_mode = :standard, + initial_guess = Manopt.ConstantInitialGuess(0.5), + max_function_evaluations = 5, + ) + α_std = hzls_std(dmp, gs, 1, η) + @test isapprox(α_std, 0.5; rtol = 1.0e-12, atol = 0.0) + @test hzls_std.current_mode == :standard + + hzls_adapt = Manopt.HagerZhangLinesearchStepsize( + M; + wolfe_condition_mode = :adaptive, + initial_guess = Manopt.ConstantInitialGuess(0.5), + initial_last_cost = f_sum_sq(M, p), + ω = 1.0, + max_function_evaluations = 5, + ) + α_adapt = hzls_adapt(dmp, gs, 1, η) + @test α_adapt > 0 + @test hzls_adapt.current_mode == :approximate + + hzls_eval = Manopt.HagerZhangLinesearchStepsize( + M; + wolfe_condition_mode = :standard, + 
initial_guess = Manopt.ConstantInitialGuess(1.0), + max_function_evaluations = 2, + ) + α_eval = hzls_eval(dmp, gs, 1, η) + @test α_eval > 0 + @test hzls_eval.last_evaluation_index == length(hzls_eval.triples) + end + @testset "B1 bracketing test" begin + M = Euclidean(1) + f(M, p) = sum(p .^ 2) + grad_f(M, p) = 2 .* p + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) + p = [1.0] + η = -grad_f(M, p) + gs = GradientDescentState(M; p = p) + hzls_b1 = Manopt.HagerZhangLinesearchStepsize( + M; + initial_guess = Manopt.ConstantInitialGuess(0.75), + start_enforcing_wolfe_conditions_at_bracketing_iteration = 2, + max_bracket_iterations = 1, + ) + α_b1 = hzls_b1(dmp, gs, 1, η) + @test α_b1 > 0 + end + @testset "B2 bracketing test" begin + M = Euclidean(1) + # f(x) = -22 x^3 + 33 x^2 - x + # grad_f(x) = -66 x^2 + 66 x - 1 + f(M, p) = -22 * p[1]^3 + 33 * p[1]^2 - p[1] + grad_f(M, p) = [-66 * p[1]^2 + 66 * p[1] - 1] + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) + p = [0.0] + η = [1.0] # Descent direction + gs = GradientDescentState(M; p = p) + hzls_b2 = Manopt.HagerZhangLinesearchStepsize( + M; + initial_guess = Manopt.ConstantInitialGuess(1.0), + start_enforcing_wolfe_conditions_at_bracketing_iteration = 2, + max_bracket_iterations = 2, + ) + α = hzls_b2(dmp, gs, 1, η) + @test α > 0 + end + @testset "B3 bracketing test" begin + M = Euclidean(1) + # f(x) = -x + # grad_f(x) = -1 + f(M, p) = -p[1] + grad_f(M, p) = [-1.0] + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) + p = [0.0] + η = [1.0] # Descent direction + gs = GradientDescentState(M; p = p) + hzls_b3 = Manopt.HagerZhangLinesearchStepsize( + M; + initial_guess = Manopt.ConstantInitialGuess(1.0), + stepsize_limit = 2.0, + max_bracket_iterations = 2, + ) + α = hzls_b3(dmp, gs, 1, η) + @test α > 0 + end + @testset "U1 trigger test" begin + M = Euclidean(1) + # f(x) = x^2 / 2 + # grad_f(x) = x + f(M, p) = p[1]^2 / 2 + grad_f(M, p) = [p[1]] + dmp = 
DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) + p = [1.0] + η = [-1.0] # Descent direction + gs = GradientDescentState(M; p = p) + hzls_u1 = Manopt.HagerZhangLinesearchStepsize( + M; + initial_guess = Manopt.ConstantInitialGuess(2.0), + ) + # We expect U1 to be triggered during the update (secant is exact, slope 0 >= 0) + α = hzls_u1(dmp, gs, 1, η) + @test α > 0 + end + @testset "U2 trigger test" begin + M = Euclidean(1) + # We mock f and grad_f to trigger U2 termination + # We need: + # 1. Starting at p=0 with descent direction (df < 0) + # 2. Bracketing finds a point with df > 0 (to finish bracketing) -> p=1.0, df=1.0 + # 3. Refinement hits max evaluations at a point with df < 0 and f > f(0)+eps -> p=0.5, f=10.0, df=-0.1 + + function f_u2(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-9) + return 0.0 + elseif isapprox(v, 1.0; atol = 1.0e-9) + return 0.0 + elseif isapprox(v, 0.5; atol = 1.0e-9) + return 10.0 + end + return 0.0 + end + + function grad_f_u2(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-9) + return [-1.0] + elseif isapprox(v, 1.0; atol = 1.0e-9) + return [1.0] + elseif isapprox(v, 0.5; atol = 1.0e-9) + return [-0.1] + end + return [0.0] + end + + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_u2, grad_f_u2)) + p = [0.0] + η = [1.0] + gs = GradientDescentState(M; p = p) + hzls_u2 = Manopt.HagerZhangLinesearchStepsize( + M; initial_guess = Manopt.ConstantInitialGuess(1.0), max_function_evaluations = 3 + ) + α = hzls_u2(dmp, gs, 1, η) + @test α > 0 + end + @testset "U3 trigger test" begin + M = Euclidean(1) + # Trigger U3 by having a point that satisfies conditions for U2 but f_eval is false. 
+ # Same landscape as U2: + # p=0, df=-1 (start) + # p=1, df=1 (end of bracket) + # p=0.5, f=10, df=-0.1 (high function value, negative slope) + + function f_u3(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-9) + return 0.0 + elseif isapprox(v, 1.0; atol = 1.0e-9) + return 0.0 + elseif isapprox(v, 0.5; atol = 1.0e-9) + return 10.0 + end + return 0.0 + end + + function grad_f_u3(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-9) + return [-1.0] + elseif isapprox(v, 1.0; atol = 1.0e-9) + return [1.0] + elseif isapprox(v, 0.5; atol = 1.0e-9) + return [-0.1] + end + return [0.0] + end + + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_u3, grad_f_u3)) + p = [0.0] + η = [1.0] # Descent direction + gs = GradientDescentState(M; p = p) + # Set max_function_evaluations > 3 so we don't hit U2 termination (f_eval=true) + hzls_u3 = Manopt.HagerZhangLinesearchStepsize( + M; initial_guess = Manopt.ConstantInitialGuess(1.0), max_function_evaluations = 5 + ) + α = hzls_u3(dmp, gs, 1, η) + @test α > 0 + end + @testset "U3 (b) trigger test" begin + M = Euclidean(1) + # Force U3 (b) in _hz_u3: + # 1) At d=0.5 we need df < 0 and f(d) <= f(0) + ϵₖ with no termination, + # so i_a_bar gets updated to i_d. + # 2) On the next U3 iteration we return from the loop. 
+ function f_u3b(M, q) + return 0.0 + end + + function grad_f_u3b(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-12) + return [-1.0] + elseif isapprox(v, 1.0; atol = 1.0e-12) + return [1.0] + elseif isapprox(v, 0.5; atol = 1.0e-12) + return [-1.0] + elseif isapprox(v, 0.75; atol = 1.0e-12) + return [1.0] + end + return [0.0] + end + + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_u3b, grad_f_u3b)) + p = [0.0] + η = [1.0] + hzls_u3b = Manopt.HagerZhangLinesearchStepsize(M; max_function_evaluations = 4) + Manopt.initialize_stepsize!(hzls_u3b) + Manopt._hz_evaluate_next_step(hzls_u3b, M, dmp, p, η, 0.0) + Manopt._hz_evaluate_next_step(hzls_u3b, M, dmp, p, η, 1.0) + + (i_a, i_b, f_eval, f_wolfe) = Manopt._hz_u3(hzls_u3b, M, dmp, p, η, 1, 2) + @test (i_a, i_b) == (3, 4) + @test f_eval + @test !f_wolfe + end + @testset "U3 (c) info trigger test" begin + M = Euclidean(1) + # Force U3 (c) inside _hz_u3 by making the mid-point have + # negative slope but too large function value. 
+ function f_u3c(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-12) + return 0.0 + elseif isapprox(v, 1.0; atol = 1.0e-12) + return 0.0 + elseif isapprox(v, 0.5; atol = 1.0e-12) + return 1.0 + elseif isapprox(v, 0.25; atol = 1.0e-12) + return 0.0 + end + return 0.0 + end + + function grad_f_u3c(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-12) + return [-1.0] + elseif isapprox(v, 1.0; atol = 1.0e-12) + return [1.0] + elseif isapprox(v, 0.5; atol = 1.0e-12) + return [-0.1] + elseif isapprox(v, 0.25; atol = 1.0e-12) + return [0.1] + end + return [0.0] + end + + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_u3c, grad_f_u3c)) + p = [0.0] + η = [1.0] + hzls_u3c = Manopt.HagerZhangLinesearchStepsize(M; max_function_evaluations = 4) + Manopt.initialize_stepsize!(hzls_u3c) + Manopt._hz_evaluate_next_step(hzls_u3c, M, dmp, p, η, 0.0) + Manopt._hz_evaluate_next_step(hzls_u3c, M, dmp, p, η, 1.0) + @test (1, 4, true, false) == Manopt._hz_u3(hzls_u3c, M, dmp, p, η, 1, 2) + end + @testset "U3 max evaluations termination" begin + M = Euclidean(1) + f(M, p) = sum(p .^ 2) + grad_f(M, p) = 2 .* p + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) + p = [0.0] + η = [1.0] + + hzls_u3_max = Manopt.HagerZhangLinesearchStepsize(M; max_function_evaluations = 2) + Manopt.initialize_stepsize!(hzls_u3_max) + Manopt._hz_evaluate_next_step(hzls_u3_max, M, dmp, p, η, 0.0) + Manopt._hz_evaluate_next_step(hzls_u3_max, M, dmp, p, η, 1.0) + @test hzls_u3_max.last_evaluation_index == length(hzls_u3_max.triples) + + (i_a, i_b, f_eval, f_wolfe) = Manopt._hz_u3(hzls_u3_max, M, dmp, p, η, 1, 2) + @test (i_a, i_b) == (1, 2) + @test !f_eval + @test !f_wolfe + end + @testset "U0 out-of-bracket early return" begin + M = Euclidean(1) + f(M, p) = sum(p .^ 2) + grad_f(M, p) = 2 .* p + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) + p = [0.0] + η = [1.0] + + hzls_u0 = Manopt.HagerZhangLinesearchStepsize(M; max_function_evaluations = 5) + 
Manopt.initialize_stepsize!(hzls_u0) + Manopt._hz_evaluate_next_step(hzls_u0, M, dmp, p, η, 0.0) + Manopt._hz_evaluate_next_step(hzls_u0, M, dmp, p, η, 1.0) + + last_eval_before = hzls_u0.last_evaluation_index + + # c is left of bracket [0, 1] -> U0 early return + @test (1, 2, -1, false, false) == Manopt._hz_update(hzls_u0, M, dmp, p, η, 1, 2, -0.1) + @test hzls_u0.last_evaluation_index == last_eval_before + + # c is right of bracket [0, 1] -> U0 early return + @test (1, 2, -1, false, false) == Manopt._hz_update(hzls_u0, M, dmp, p, η, 1, 2, 1.1) + @test hzls_u0.last_evaluation_index == last_eval_before + end + + @testset "S2 trigger test" begin + M = Euclidean(1) + # S2 is triggered within _hz_secant2 when the updated bracket point i_c is the new upper bound i_B + # This happens if slope at c is positive (U1 case in _hz_update). + # Sequence: + # 1. Start p=0, df=-1. + # 2. Initial bracket p=1, df=4 (df > 0 -> bracket found). + # 3. _hz_secant2 calls secant(0, 1) -> c = (0*4 - 1*(-1))/(4 - (-1)) = 0.2. + # 4. At c=0.2, we set df=0.1 (positive slope -> U1 -> i_c = i_B). + # 5. We also need f(0.2) high enough to fail Armijo so we don't return early with f_wolfe=true. + # f(0)=0. f(0.2)=0.5. Armijo check: 0.5 <= 0 + 0.1*0.2*(-1) = -0.02 (False). + + function f_s2(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-9) + return 0.0 + elseif isapprox(v, 1.0; atol = 1.0e-9) + return 2.0 # Arbitrary high value + elseif isapprox(v, 0.2; atol = 1.0e-9) + return 0.5 # Fail Armijo + end + return 0.0 # Fallback (e.g. 
for c_bar in S2) + end + + function grad_f_s2(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-9) + return [-1.0] + elseif isapprox(v, 1.0; atol = 1.0e-9) + return [4.0] + elseif isapprox(v, 0.2; atol = 1.0e-9) + return [0.1] # Positive slope triggers U1 -> i_c = i_B + end + return [0.0] + end + + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_s2, grad_f_s2)) + p = [0.0] + η = [1.0] + gs = GradientDescentState(M; p = p) + hzls_s2 = Manopt.HagerZhangLinesearchStepsize( + M; initial_guess = Manopt.ConstantInitialGuess(1.0) + ) + # We expect the S2 log + α = hzls_s2(dmp, gs, 1, η) + @test α > 0 + end + + @testset "S3 trigger test" begin + M = Euclidean(1) + # S3 is triggered within _hz_secant2 when the updated bracket point i_c is the new lower bound i_A + # (U2 case in _hz_update). We set up: + # 1. Start p=0, df=-1 (descent). + # 2. Bracket at p=1, df=4 (positive slope). + # 3. Secant gives c=0.2. At c, df=-0.1 and f=0 -> U2. + + function f_s3(M, q) + return 0.0 + end + + function grad_f_s3(M, q) + v = q[1] + if isapprox(v, 0.0; atol = 1.0e-12) + return [-1.0] + elseif isapprox(v, 1.0; atol = 1.0e-12) + return [4.0] + elseif isapprox(v, 0.2; atol = 1.0e-12) + return [-0.1] + end + return [0.0] + end + + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_s3, grad_f_s3)) + p = [0.0] + η = [1.0] + hzls_s3 = Manopt.HagerZhangLinesearchStepsize(M; max_function_evaluations = 5) + Manopt.initialize_stepsize!(hzls_s3) + Manopt._hz_evaluate_next_step(hzls_s3, M, dmp, p, η, 0.0) + Manopt._hz_evaluate_next_step(hzls_s3, M, dmp, p, η, 1.0) + + c = Manopt.secant(hzls_s3.triples[1], hzls_s3.triples[2]) + (i_A, i_B, i_c, f_eval, f_wolfe) = Manopt._hz_secant2(hzls_s3, M, dmp, p, η, 1, 2) + @test !f_eval + @test !f_wolfe + @test hzls_s3.triples[i_A].t ≈ c atol = 1.0e-12 + + c_bar = Manopt.secant(hzls_s3.triples[1], hzls_s3.triples[i_A]) + @test hzls_s3.triples[i_c].t ≈ c_bar atol = 1.0e-12 + @test i_A != i_B + end + + @testset "Hager-Zhang infinite at b" begin + 
# A function that is finite for small steps but infinite for larger ones + # and has positive slope where it is infinite to trigger the bracket condition. + + M = Euclidean(1) + + # f(x) = x^2 - x for x < 1.0 + # f(x) = Inf for x >= 1.0 + # Min at x = 0.5, f(0.5) = -0.25 + function f_inf(M, p) + x = p[1] + if x < 1.0 + return x^2 - x + else + return Inf + end + end + + function grad_f_inf(M, p) + x = p[1] + if x < 1.0 + return [2 * x - 1] + else + # Return a positive slope to satisfy _hz_bracket exit condition + return [1.0] + end + end + + dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f_inf, grad_f_inf)) + + # Start at 0. f(0)=0. grad(0)=-1. Search direction +1. + s = GradientDescentState(M; p = [0.0]) + + # Force initial guess to be 2.0 (in the infinite region) + hzls = HagerZhangLinesearch(; initial_guess = Manopt.ConstantInitialGuess(2.0))(M) + + # Because initial bracket will be [0, 2] with f(2)=Inf. + # Then bisection will eventually find 0.5. + + step = hzls(dmp, s, 1, [1.0]) + @test abs(step - 0.5) < 1.0e-1 + end + + @testset "Hager-Zhang initialize_stepsize!" 
begin + hzls = HagerZhangLinesearch()(M) + hzls.last_evaluation_index = 5 + hzls.Qₖ = 2.0 + hzls.Cₖ = 2.0 + hzls.current_mode = :approximate + Manopt.initialize_stepsize!(hzls) + @test hzls.last_evaluation_index == 0 + @test hzls.Qₖ == 0.0 + @test hzls.Cₖ == 0.0 + @test hzls.current_mode == :standard + end + + end @testset "Distance over Gradients Stepsize" begin @testset "does not use sectional cuvature (Eucludian)" begin M = Euclidean(2) @@ -270,23 +806,18 @@ end @test ds.max_distance == 1.0 @test ds.initial_point == p @test ds.last_stepsize === get_initial_stepsize(ds) - @test ds.last_stepsize === NaN + @test ds.last_stepsize === 0.0 @test ds.last_stepsize === get_last_stepsize(ds) # test printed representation before first step repr_ds = repr(ds) - @test occursin("DistanceOverGradients(;", repr_ds) + @test occursin("DistanceOverGradientStepsize(;", repr_ds) @test occursin("initial_distance = 1.0", repr_ds) @test occursin("use_curvature = false", repr_ds) @test occursin("sectional_curvature_bound = 0.0", repr_ds) - @test occursin("Current state:", repr_ds) - @test occursin("max_distance = 1.0", repr_ds) - @test occursin("gradient_sum = 0.0", repr_ds) - @test occursin("last_stepsize = NaN", repr_ds) + summary = Manopt.status_summary(ds) + @test startswith(summary, "A distance over gradients step size") lr = ds(dmp, gds, 0) @test lr == 0.125 - # after first step, last_stepsize should be reflected in repr - repr_ds_after = repr(ds) - @test occursin("last_stepsize = 0.125", repr_ds_after) end @testset "use sectional cuvature (Euclidian)" begin M = Euclidean(2) @@ -305,7 +836,7 @@ end @test ds.max_distance == 1.0 @test ds.initial_point == p @test ds.last_stepsize === get_initial_stepsize(ds) - @test ds.last_stepsize === NaN + @test ds.last_stepsize === 0.0 @test ds.last_stepsize === get_last_stepsize(ds) lr = ds(dmp, gds, 0) @test lr == 0.125 @@ -324,7 +855,7 @@ end @test ds.max_distance == 1.0 @test ds.initial_point == p @test ds.last_stepsize === 
get_initial_stepsize(ds) - @test ds.last_stepsize === NaN + @test ds.last_stepsize === 0.0 @test ds.last_stepsize === get_last_stepsize(ds) lr = ds(dmp, gds, 0) @test lr == 0.5 @@ -346,7 +877,7 @@ end @test ds.max_distance == 1.0 @test ds.initial_point == p @test ds.last_stepsize === get_initial_stepsize(ds) - @test ds.last_stepsize === NaN + @test ds.last_stepsize === 0.0 @test ds.last_stepsize === get_last_stepsize(ds) lr = ds(dmp, gds, 0) @test lr == 0.5 @@ -376,7 +907,7 @@ end @test ds.max_distance == 1.0 @test ds.initial_point == p @test ds.last_stepsize === get_initial_stepsize(ds) - @test ds.last_stepsize === NaN + @test ds.last_stepsize === 0.0 @test ds.last_stepsize === get_last_stepsize(ds) # Expected initial step: diff --git a/test/plans/test_stochastic_gradient_plan.jl b/test/plans/test_stochastic_gradient_plan.jl index 460c065bb2..9fdbd1fbea 100644 --- a/test/plans/test_stochastic_gradient_plan.jl +++ b/test/plans/test_stochastic_gradient_plan.jl @@ -110,4 +110,19 @@ using LinearAlgebra, LRUCache, Manifolds, Manopt, Test end end end + @testset "show/repr and status_summary" begin + s1 = repr(msgo_ff) + @test startswith(s1, "ManifoldStochasticGradientObjective(") + @test contains(s1, " cost = ") + s2 = Manopt.status_summary(msgo_ff) + @test contains(s2, "stochastic gradient objective") + @test contains(s2, "cost") + # missing cost + msgo_fm = ManifoldStochasticGradientObjective(sgrad_f1) + s3 = repr(msgo_fm) + @test !contains(s3, "cost") + s4 = Manopt.status_summary(msgo_fm) + @test contains(s4, "stochastic gradient objective") + @test !contains(s4, "cost") + end end diff --git a/test/plans/test_stopping_criteria.jl b/test/plans/test_stopping_criteria.jl index 91c83e0541..77da09743a 100644 --- a/test/plans/test_stopping_criteria.jl +++ b/test/plans/test_stopping_criteria.jl @@ -1,22 +1,32 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates +function to_display_string(obj) + buf = IOBuffer() + Base.show(buf, MIME"text/plain"(), obj) + 
return String(take!(buf)) +end + @testset "StoppingCriteria" begin @testset "Generic Tests" begin - @test_throws ErrorException get_stopping_criteria( - Manopt.Test.DummyStoppingCriteriaSet() - ) - - s = StopWhenAll(StopAfterIteration(10), StopWhenChangeLess(Euclidean(), 0.1)) - @test Manopt.indicates_convergence(s) #due to all and change this is true - @test startswith(repr(s), "StopWhenAll with the") + @test_throws ErrorException get_stopping_criteria(Manopt.Test.DummyStoppingCriteriaSet()) + sa = StopAfterIteration(10) + sb = StopWhenChangeLess(Euclidean(), 0.1) + s = StopWhenAll(sa, sb) + @test !Manopt.indicates_convergence(s) #both are false so this is false + @test repr(s) == "StopWhenAll([$(repr(sa)), $(repr(sb))])" + @test Manopt.status_summary(s; context = :short) == "$(repr(sa)) & $(repr(sb))" + @test startswith(Manopt.status_summary(s), "Stop when") @test get_reason(s) === "" + io = IOBuffer() + show(io, MIME"text/plain"(), sa) + @test startswith(String(take!(io)), "A stopping criterion to stop after 10 iterations") + # Trigger second one manually s.criteria[2].last_change = 0.05 s.criteria[2].at_iteration = 3 @test length(get_reason(s.criteria[2])) > 0 s2 = StopWhenAll([StopAfterIteration(10), StopWhenChangeLess(Euclidean(), 0.1)]) - @test get_stopping_criteria(s)[1].max_iterations == - get_stopping_criteria(s2)[1].max_iterations + @test get_stopping_criteria(s)[1].max_iterations == get_stopping_criteria(s2)[1].max_iterations s3 = StopWhenCostLess(0.1) p = DefaultManoptProblem( @@ -30,11 +40,13 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @test s3(p, s, 2) @test length(get_reason(s3)) > 0 # repack - sn = StopWhenAny(StopAfterIteration(10), s3) + sn1 = StopAfterIteration(10) + sn = StopWhenAny(sn1, s3) @test get_reason(sn) == "" @test !Manopt.indicates_convergence(sn) # since it might stop after 10 iterations - @test startswith(repr(sn), "StopWhenAny with the") - @test Manopt._fast_any(x -> false, ()) + @test repr(sn) == 
"StopWhenAny([$(repr(sn1)), $(repr(s3))])" + # or over an empty set has to be false for any function + @test !Manopt._fast_any(x -> false, ()) sn2 = StopAfterIteration(10) | s3 @test get_stopping_criteria(sn)[1].max_iterations == @@ -46,7 +58,7 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @test get_active_stopping_criteria(s3) == [s3] @test get_active_stopping_criteria(StopAfterIteration(1)) == [] sm = StopWhenAll(StopAfterIteration(10), s3) - s1 = "StopAfterIteration(10)\n Max Iteration 10:\tnot reached" + s1 = "StopAfterIteration(10)" @test repr(StopAfterIteration(10)) == s1 @test !sm(p, s, 9) @@ -70,8 +82,8 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates o = Manopt.Test.DummyState() s = StopAfter(Millisecond(30)) @test !Manopt.indicates_convergence(s) - @test Manopt.status_summary(s) == "stopped after $(s.threshold):\tnot reached" - @test repr(s) == "StopAfter(Millisecond(30))\n $(Manopt.status_summary(s))" + @test Manopt.status_summary(s) == "A stopping criterion to stop after $(s.threshold)\n$(Manopt._MANOPT_INDENT)not reached" + @test repr(s) == "StopAfter(Millisecond(30))" s(p, o, 0) # Start @test s(p, o, 1) == false @test get_reason(s) == "" @@ -87,13 +99,13 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @testset "Stopping Criterion &/| operators" begin a = StopAfterIteration(200) b = StopWhenChangeLess(Euclidean(), 1.0e-6) - sb = "StopWhenChangeLess with threshold 1.0e-6.\n $(Manopt.status_summary(b))" + sb = "StopWhenChangeLess(1.0e-6; inverse_retraction_method=LogarithmicInverseRetraction())" @test repr(b) == sb @test get_reason(b) == "" b2 = StopWhenChangeLess(Euclidean(), 1.0e-6) # second constructor @test repr(b2) == sb c = StopWhenGradientNormLess(1.0e-6) - sc = "StopWhenGradientNormLess(1.0e-6)\n $(Manopt.status_summary(c))" + sc = "StopWhenGradientNormLess(1.0e-6)" @test repr(c) == sc @test get_reason(c) == "" # Trigger manually @@ -101,7 +113,7 @@ using Manifolds, ManifoldsBase, 
Manopt, Test, ManifoldsBase, Dates c.at_iteration = 3 @test length(get_reason(c)) > 0 c2 = StopWhenSubgradientNormLess(1.0e-6) - sc2 = "StopWhenSubgradientNormLess(1.0e-6)\n $(Manopt.status_summary(c2))" + sc2 = "StopWhenSubgradientNormLess(1.0e-6)" @test repr(c2) == sc2 d = StopWhenAll(a, b, c) @test typeof(d) === typeof(a & b & c) @@ -121,13 +133,16 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @testset "Stopping Criterion print&summary" begin f = StopWhenStepsizeLess(1.0e-6) - sf1 = "Stepsize s < 1.0e-6:\tnot reached" - @test Manopt.status_summary(f) == sf1 - sf2 = "StopWhenStepsizeLess(1.0e-6)\n $(sf1)" + sf1 = "Stepsize s < 1.0e-6:$(Manopt._MANOPT_INDENT)not reached" + sf2 = "StopWhenStepsizeLess(1.0e-6)" + @test Manopt.status_summary(f) == "A stopping criterion to stop when the step size is less than 1.0e-6\n$(Manopt._MANOPT_INDENT)not reached" + @test Manopt.status_summary(f; context = :inline) == sf1 @test repr(f) == sf2 g = StopWhenCostLess(1.0e-4) - @test Manopt.status_summary(g) == "f(x) < $(1.0e-4):\tnot reached" - @test repr(g) == "StopWhenCostLess(0.0001)\n $(Manopt.status_summary(g))" + @test Manopt.status_summary(g; context = :inline) == "f(x) < $(1.0e-4):$(Manopt._MANOPT_INDENT)not reached" + @test repr(g) == "StopWhenCostLess(0.0001)" + @test startswith(Manopt.status_summary(g), "A stopping criterion to stop when the cost function") + @test !Manopt.indicates_convergence(g) gf(M, p) = norm(p) grad_gf(M, p) = p gp = DefaultManoptProblem(Euclidean(2), ManifoldGradientObjective(gf, grad_gf)) @@ -140,8 +155,8 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @test g(gp, gs, 2) @test length(get_reason(g)) > 0 h = StopWhenSmallerOrEqual(:p, 1.0e-4) - @test repr(h) == - "StopWhenSmallerOrEqual(:p, $(1.0e-4))\n $(Manopt.status_summary(h))" + @test repr(h) == "StopWhenSmallerOrEqual(:p, $(1.0e-4))" + @test !Manopt.indicates_convergence(h) @test get_reason(h) == "" # Trigger manually h.at_iteration = 1 @@ -151,6 
+166,7 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates for swgcl in [swgcl1, swgcl2] repr(swgcl) == "StopWhenGradientChangeLess($(1.0e-8); vector_transport_method=ParallelTransport())\n $(Manopt.status_summary(swgcl))" + @test !Manopt.indicates_convergence(swgcl) swgcl(gp, gs, 0) # reset @test get_reason(swgcl) == "" @test swgcl(gp, gs, 1) # change 0 -> true @@ -202,6 +218,7 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates stepsize = Manopt.ConstantStepsize(Euclidean()), ) s1 = StopWhenStepsizeLess(0.5) + @test !Manopt.indicates_convergence(s1) @test !s1(dmp, gds, 1) @test length(get_reason(s1)) == 0 gds.stepsize = Manopt.ConstantStepsize(Euclidean(), 0.25) @@ -223,7 +240,9 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates stepsize = Manopt.ConstantStepsize(Euclidean()), ) swecl = StopWhenEntryChangeLess(:p, (p, s, v, w) -> norm(w - v), 1.0e-5) - @test startswith(repr(swecl), "StopWhenEntryChangeLess\n") + @test startswith(repr(swecl), "StopWhenEntryChangeLess(") + @test !Manopt.indicates_convergence(swecl) + @test startswith(Manopt.status_summary(swecl), "A stopping criterion to stop when the change of ") Manopt.set_parameter!(swecl, :Threshold, 1.0e-4) @test swecl.threshold == 1.0e-4 @test !swecl(dmp, gds, 1) #First call stores @@ -247,8 +266,9 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates mso = ManifoldSubgradientObjective(f, ∂f) mp = DefaultManoptProblem(M, mso) c2 = StopWhenSubgradientNormLess(1.0e-6) - sc2 = "StopWhenSubgradientNormLess(1.0e-6)\n $(Manopt.status_summary(c2))" - @test repr(c2) == sc2 + @test repr(c2) == "StopWhenSubgradientNormLess(1.0e-6)" + @test startswith(Manopt.status_summary(c2), "A stopping criterion to stop when the subgradient norm") + st = SubGradientMethodState(M; p = p, stopping_criterion = c2) st.X = ∂f(M, 2p) @test !c2(mp, st, 1) @@ -264,10 +284,12 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @testset "StopWhenCostNaN, 
StopWhenCostChangeLess, StopWhenIterateNaN" begin sc1 = StopWhenCostNaN() + @test !Manopt.indicates_convergence(sc1) + @test startswith(Manopt.status_summary(sc1), "A stopping criterion to stop when the cost function is") f(M, p) = norm(p) > 2 ? NaN : norm(p) M = Euclidean(2) p = [1.0, 2.0] - @test startswith(repr(sc1), "StopWhenCostNaN()\n") + @test startswith(repr(sc1), "StopWhenCostNaN()") mco = ManifoldCostObjective(f) mp = DefaultManoptProblem(M, mco) s = NelderMeadState(M) @@ -282,7 +304,9 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @test length(get_reason(sc1)) > 0 sc2 = StopWhenCostChangeLess(1.0e-6) - @test startswith(repr(sc2), "StopWhenCostChangeLess with threshold 1.0e-6.\n") + @test startswith(repr(sc2), "StopWhenCostChangeLess(1.0e-6)") + @test startswith(Manopt.status_summary(sc2), "A stopping criterion to stop when the change of the cost") + @test !Manopt.indicates_convergence(sc2) @test get_reason(sc2) == "" s.p = [0.0, 0.1] @test !sc2(mp, s, 1) # Init check @@ -295,7 +319,10 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates s.p .= NaN sc3 = StopWhenIterateNaN() - @test startswith(repr(sc3), "StopWhenIterateNaN()\n") + @test startswith(repr(sc3), "StopWhenIterateNaN()") + @test !Manopt.indicates_convergence(sc3) + @test startswith(Manopt.status_summary(sc3), "A stopping criterion to stop when an entry of the iterate is") + @test sc3(mp, s, 1) #always returns true since p was now set to NaN @test length(get_reason(sc3)) > 0 s.p = p @@ -316,8 +343,9 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @test Manopt.indicates_convergence(sc) == Manopt.indicates_convergence(s) @test has_converged(sc) == has_converged(s) @test get_reason(sc) == "" - @test startswith(repr(sc), "StopWhenRepeated with the Stopping Criterion:\n") - @test startswith(Manopt.status_summary(sc), "0 ≥ 3 (consecutive): not reached") + @test startswith(repr(sc), "StopWhenRepeated(") + @test 
startswith(Manopt.status_summary(sc), "A stopping criterion to stop when the inner criterion has indicated to stop 3 (consecutive) times") + @test startswith(Manopt.status_summary(sc; context = :short), "StopWhenRepeated(StopAfterIteration(2))×3") @test !sc(p, o, 1) # still count 0 @test !sc(p, o, 2) # 1 @test !sc(p, o, 2) # 2 @@ -343,11 +371,9 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @test Manopt.indicates_convergence(sc) == Manopt.indicates_convergence(s) @test has_converged(sc) == has_converged(s) @test get_reason(sc) == "" - @test startswith( - repr(sc), - "StopWhenCriterionWithIterationCondition with the Stopping Criterion:\n", - ) - @test startswith(Manopt.status_summary(sc), "Base.Fix2{typeof(>), Int64}(>, 5) &&") + @test startswith(repr(sc), "StopWhenCriterionWithIterationCondition(") + @test startswith(Manopt.status_summary(sc; context = :short), repr(sc)) + @test startswith(Manopt.status_summary(sc), "A stopping criterion to stop when the inner criterion is met and") sc2 = s ⩼ 5 @test typeof(sc) === typeof(sc2) sc4 = s ≟ 5 @@ -363,6 +389,54 @@ using Manifolds, ManifoldsBase, Manopt, Test, ManifoldsBase, Dates @test length(get_reason(sc)) == 0 end + @testset "StopWhenRelativeAPosterioriCostChangeLessOrEqual" begin + sc = StopWhenRelativeAPosterioriCostChangeLessOrEqual(; factr = 100.0) + prob = DefaultManoptProblem( + Euclidean(), ManifoldGradientObjective((M, x) -> x^2, x -> 2x) + ) + s = GradientDescentState(Euclidean(); p = 1.0) + @test !sc(prob, s, -1) + @test !sc(prob, s, 1) + @test length(get_reason(sc)) == 0 + s.p = 1.0 - 1.0e-14 + + @test sc(prob, s, 2) + @test length(get_reason(sc)) > 0 + @test startswith( + Manopt.status_summary(sc), + "A stopping criterion to stop when the relative posteriori cost change is less than", + ) + @test startswith(Manopt.status_summary(sc; context = :inline), "(fₖ- fₖ₊₁)/max(|fₖ|, |fₖ₊₁|, 1) = ") + @test startswith(repr(sc), "StopWhenRelativeAPosterioriCostChangeLessOrEqual(") + @test 
!Manopt.indicates_convergence(sc) + end + + @testset "StopWhenProjectedNegativeGradientNormLess" begin + sc = StopWhenProjectedNegativeGradientNormLess(1.0e-10) + @test startswith(repr(sc), "StopWhenProjectedNegativeGradientNormLess(") + @test startswith(Manopt.status_summary(sc), "A StoppingCriterion to stop when the negative projected gradient norm is less than") + + M = Hyperrectangle([1.0], [2.0]) + prob = DefaultManoptProblem( + M, ManifoldGradientObjective((M, x) -> x^2, x -> 2x) + ) + s = GradientDescentState(M; p = [1.0], X = [2.0]) + @test !sc(prob, s, -1) + @test length(get_reason(sc)) == 0 + @test sc(prob, s, 1) + @test length(get_reason(sc)) > 0 + + @test startswith( + to_display_string(sc), + "A StoppingCriterion to stop when the negative projected gradient norm is less than", + ) + @test startswith(Manopt.status_summary(sc; context = :inline), "|proj (-grad f)| < 1.0e-10") + + Manopt.set_parameter!(sc, Val(:MinGradNorm), 1.0e-5) + @test sc.threshold == 1.0e-5 + @test Manopt.indicates_convergence(sc) + end + @testset "has_converged" begin M = Euclidean(1) pr = Manopt.Test.DummyProblem{typeof(M)}() diff --git a/test/plans/test_storage.jl b/test/plans/test_storage.jl index 88602df444..2b00c2c602 100644 --- a/test/plans/test_storage.jl +++ b/test/plans/test_storage.jl @@ -10,10 +10,8 @@ using Test, Manopt, ManifoldsBase, Manifolds X_zero = zero_vector(M, p) st = GradientDescentState( - M; - p = p, - stopping_criterion = StopAfterIteration(20), - stepsize = Manopt.ConstantStepsize(M), + M; p = p, + stopping_criterion = StopAfterIteration(20), stepsize = Manopt.ConstantStepsize(M), ) f(M, q) = distance(M, q, p) .^ 2 grad_f(M, q) = -2 * log(M, q, p) @@ -21,6 +19,11 @@ using Test, Manopt, ManifoldsBase, Manifolds a = StoreStateAction(M; store_fields = [:p, :X]) + @test Manopt.status_summary(a) == repr(a) + io = IOBuffer() + show(io, MIME"text/plain"(), a) + @test String(take!(io)) == repr(a) + @test !has_storage(a, Manopt.PointStorageKey(:p)) @test 
!has_storage(a, Manopt.VectorStorageKey(:X)) update_storage!(a, mp, st) diff --git a/test/plans/test_vectorial_plan.jl b/test/plans/test_vectorial_plan.jl index 894e19c221..f02129b7fa 100644 --- a/test/plans/test_vectorial_plan.jl +++ b/test/plans/test_vectorial_plan.jl @@ -29,6 +29,9 @@ using Manopt: get_value, get_value_function, get_gradient_function hess_g2!(M, Y, p, X) = copyto!(Y, -X) # verify a few case vgf_fa = VectorGradientFunction(g, grad_g, 2) + io = IOBuffer() + show(io, MIME"text/plain"(), vgf_fa) + @test String(take!(io)) == Manopt.status_summary(vgf_fa) @test get_value_function(vgf_fa) === g @test get_gradient_function(vgf_fa) == grad_g vgf_va = VectorGradientFunction( @@ -141,4 +144,5 @@ using Manopt: get_value, get_value_function, get_gradient_function get_hessian!(M, Z, vhf, p, X, 2) @test Z == gh[2] end + @test repr(CoordinateVectorialType(DefaultOrthonormalBasis(ℝ))) == "CoordinateVectorialType(DefaultOrthonormalBasis(ℝ))" end diff --git a/test/runtests.jl b/test/runtests.jl index 33e9654baa..3657cc12b8 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -64,6 +64,7 @@ using Manifolds, ManifoldsBase, Manopt, Test include("solvers/test_proximal_gradient_method.jl") include("solvers/test_proximal_point.jl") include("solvers/test_quasi_Newton.jl") + include("solvers/test_quasi_Newton_box.jl") include("solvers/test_particle_swarm.jl") include("solvers/test_primal_dual_semismooth_Newton.jl") include("solvers/test_stochastic_gradient_descent.jl") @@ -72,7 +73,6 @@ using Manifolds, ManifoldsBase, Manopt, Test include("solvers/test_trust_regions.jl") include("solvers/test_vectorbundle_newton.jl") end - include("MOI_wrapper.jl") include("test_aqua.jl") include("test_deprecated.jl") end diff --git a/test/solvers/test_ChambollePock.jl b/test/solvers/test_ChambollePock.jl index b9250b74bf..db36d96b6c 100644 --- a/test/solvers/test_ChambollePock.jl +++ b/test/solvers/test_ChambollePock.jl @@ -78,7 +78,8 @@ using ManifoldDiff: prox_distance, 
prox_distance! return_state = true, ) @test startswith( - repr(o1a), "# Solver state for `Manopt.jl`s Chambolle-Pock Algorithm" + Manopt.status_summary(o1a; context = :default), + "# Solver state for `Manopt.jl`s Chambolle-Pock Algorithm" ) @test get_solver_result(o1a) == o1 o2a = ChambollePock( diff --git a/test/solvers/test_Douglas_Rachford.jl b/test/solvers/test_Douglas_Rachford.jl index 2ff52ff155..47ecef60f9 100644 --- a/test/solvers/test_Douglas_Rachford.jl +++ b/test/solvers/test_Douglas_Rachford.jl @@ -39,8 +39,9 @@ using ManifoldDiff: prox_distance, prox_distance! #test getter/set s = DouglasRachfordState(M; p = d1) - sr = "# Solver state for `Manopt.jl`s Douglas Rachford Algorithm\n" - @test startswith(repr(s), sr) + sr = "# Solver state for `Manopt.jl`s Douglas Rachford Algorithm\n" + @test startswith(Manopt.status_summary(s; context = :default), sr) + @test startswith(repr(s), "DouglasRachfordState(; ") set_iterate!(s, d2) @test get_iterate(s) == d2 @testset "Debug and Record prox parameter" begin diff --git a/test/solvers/test_Frank_Wolfe.jl b/test/solvers/test_Frank_Wolfe.jl index 06ec9b1fa3..32a6e71283 100644 --- a/test/solvers/test_Frank_Wolfe.jl +++ b/test/solvers/test_Frank_Wolfe.jl @@ -1,4 +1,4 @@ -using ManifoldsBase, Manopt, Random, Test, LinearAlgebra +using ManifoldsBase, Manifolds, Manopt, Random, Test, LinearAlgebra @testset "Frank Wolfe Method" begin M = ManifoldsBase.DefaultManifold(3) @@ -34,7 +34,10 @@ using ManifoldsBase, Manopt, Random, Test, LinearAlgebra @test FG(M, p) == Y s = FrankWolfeState(M, oracle!; evaluation = InplaceEvaluation(), p = p) @test Manopt.get_message(s) == "" - @test startswith(repr(s), "# Solver state for `Manopt.jl`s Frank Wolfe Method\n") + @test startswith(Manopt.status_summary(s; context = :default), "# Solver state for `Manopt.jl`s Frank Wolfe Method\n") + @test startswith(repr(s), "FrankWolfeState(") + # Manifold+State errors since problem is missing + @test_throws ErrorException FrankWolfeState(M, 
Manopt.Test.DummyState()) set_iterate!(s, 2 .* p) @test get_iterate(s) == 2 .* p dmp = DefaultManoptProblem(M, ManifoldGradientObjective(FC, FG)) diff --git a/test/solvers/test_Levenberg_Marquardt.jl b/test/solvers/test_Levenberg_Marquardt.jl index c4fd78aa65..46bc4add5f 100644 --- a/test/solvers/test_Levenberg_Marquardt.jl +++ b/test/solvers/test_Levenberg_Marquardt.jl @@ -113,7 +113,7 @@ end lm_r = LevenbergMarquardt(M, F_RLM, jacF_RLM, p0, length(pts_LM); return_state = true) lm_rs = "# Solver state for `Manopt.jl`s Levenberg Marquardt Algorithm\n" - @test startswith(repr(lm_r), lm_rs) + @test startswith(Manopt.status_summary(lm_r; context = :default), lm_rs) p_opt = get_state(lm_r).p @test norm(M, p_opt, get_gradient(lm_r)) < 2.0e-3 p_atol = 1.5e-2 @@ -214,7 +214,7 @@ end ) p_r2 = DefaultManoptProblem( M, - NonlinearLeastSquaresObjective( + ManifoldNonlinearLeastSquaresObjective( F_reg_r2(ts_r2, xs_r2, ys_r2), jacF_reg_r2(ts_r2, xs_r2, ys_r2), length(ts_r2) * 2, @@ -228,7 +228,7 @@ end p_r2_mut = DefaultManoptProblem( M, - NonlinearLeastSquaresObjective( + ManifoldNonlinearLeastSquaresObjective( F_reg_r2!, jacF_reg_r2!, length(ts_r2) * 2; evaluation = InplaceEvaluation() ), ) diff --git a/test/solvers/test_Nelder_Mead.jl b/test/solvers/test_Nelder_Mead.jl index aa6f0da2ea..9feb9f3b0a 100644 --- a/test/solvers/test_Nelder_Mead.jl +++ b/test/solvers/test_Nelder_Mead.jl @@ -52,7 +52,7 @@ Random.seed!(29) return_state = true, stopping_criterion = StopAfterIteration(400), ) - @test startswith(repr(s), "# Solver state for `Manopt.jl`s Nelder Mead Algorithm") + @test startswith(Manopt.status_summary(s; context = :default), "# Solver state for `Manopt.jl`s Nelder Mead Algorithm") p1 = get_solver_result(s) rec = get_record(s) nonincreasing = [rec[i] >= rec[i + 1] for i in 1:(length(rec) - 1)] @@ -68,8 +68,8 @@ Random.seed!(29) @test isapprox(M, p1, p3) # SC f = StopWhenPopulationConcentrated(1.0e-1, 1.0e-2) - sf = "StopWhenPopulationConcentrated($(1.0e-1), $(1.0e-2))\n 
$(Manopt.status_summary(f))" - @test repr(f) == sf + sf = "StopWhenPopulationConcentrated($(1.0e-1), $(1.0e-2))" + @test Manopt.status_summary(f; context = :short) == sf end @testset "Circle" begin diff --git a/test/solvers/test_adaptive_regularization_with_cubics.jl b/test/solvers/test_adaptive_regularization_with_cubics.jl index 6a7259ea49..e61d206e0e 100644 --- a/test/solvers/test_adaptive_regularization_with_cubics.jl +++ b/test/solvers/test_adaptive_regularization_with_cubics.jl @@ -33,38 +33,37 @@ using LinearAlgebra: I, tr, Symmetric, diagm, eigvals, eigvecs X1 = similar(X0) Manopt.get_objective_preconditioner!(M, X1, arcmo, p0, X0) isapprox(M, p0, X1, get_preconditioner(M, mho, p0, X0)) + @test startswith(repr(arcmo), "AdaptiveRegularizationWithCubicsModelObjective(") + @test startswith(Manopt.status_summary(arcmo), "The cubic polynomial based model for the sub problem of the Adaptive") end @testset "State and repr" begin arcs = AdaptiveRegularizationState( - M, - DefaultManoptProblem(M2, arcmo), - GradientDescentState(M2; p = zero_vector(M, p0)); + M, DefaultManoptProblem(M2, arcmo), GradientDescentState(M2; p = zero_vector(M, p0)); p = p0, ) @test startswith( - repr(arcs), + Manopt.status_summary(arcs; context = :default), "# Solver state for `Manopt.jl`s Adaptive Regularization with Cubics (ARC)", ) + @test startswith(repr(arcs), "AdaptiveRegularizationState(") p1 = rand(M) X1 = rand(M; vector_at = p1) set_iterate!(arcs, p1) @test arcs.p == p1 set_gradient!(arcs, X1) @test arcs.X == X1 + lst = LanczosState(M2; maxIterLanczos = 1) + @test startswith(repr(lst), "LanczosState(; ") + @test startswith(Manopt.status_summary(lst), "# Solver state for `Manopt.jl`s Lanczos Iteration") arcs2 = AdaptiveRegularizationState( - M, - DefaultManoptProblem(M2, arcmo), - LanczosState(M2; maxIterLanczos = 1); - p = p0, - stopping_criterion = StopWhenAllLanczosVectorsUsed(1), + M, DefaultManoptProblem(M2, arcmo), lst; p = p0, stopping_criterion = 
StopWhenAllLanczosVectorsUsed(1), ) #add a fake Lanczos push!(arcs2.sub_state.Lanczos_vectors, X1) # 1 Lanczos was reached @test stop_solver!(arcs2.sub_problem, arcs2.sub_state, 1) @test stop_solver!(arcs2.sub_problem, arcs2, 1) - arcs3 = AdaptiveRegularizationState( M, DefaultManoptProblem(M2, arcmo), LanczosState(M2; maxIterLanczos = 2); p = p0 ) @@ -74,12 +73,7 @@ using LinearAlgebra: I, tr, Symmetric, diagm, eigvals, eigvecs step_solver!(arcs3.sub_problem, arcs3.sub_state, 2) # to introduce a random new one # test orthogonality of the new 2 ones @test isapprox( - inner( - M, - p1, - arcs3.sub_state.Lanczos_vectors[1], - arcs3.sub_state.Lanczos_vectors[2], - ), + inner(M, p1, arcs3.sub_state.Lanczos_vectors[1], arcs3.sub_state.Lanczos_vectors[2]), 0.0, atol = 1.0e-14, ) @@ -93,18 +87,13 @@ using LinearAlgebra: I, tr, Symmetric, diagm, eigvals, eigvecs step_solver!(arcs4.sub_problem, arcs4.sub_state, 2) # to introduce a random new one but copy to 2 # test orthogonality of the new 2 ones @test isapprox( - inner( - M, - p1, - arcs4.sub_state.Lanczos_vectors[1], - arcs4.sub_state.Lanczos_vectors[2], - ), + inner(M, p1, arcs4.sub_state.Lanczos_vectors[1], arcs4.sub_state.Lanczos_vectors[2]), 0.0, atol = 1.0e-14, ) st1 = StopWhenFirstOrderProgress(0.5) - @test startswith(repr(st1), "StopWhenFirstOrderProgress(0.5)\n") + @test startswith(repr(st1), "StopWhenFirstOrderProgress(0.5)") @test Manopt.indicates_convergence(st1) @test get_reason(st1) == "" # fake a trigger @@ -114,10 +103,12 @@ using LinearAlgebra: I, tr, Symmetric, diagm, eigvals, eigvecs @test length(get_reason(st1)) > 0 st2 = StopWhenAllLanczosVectorsUsed(2) - @test startswith(repr(st2), "StopWhenAllLanczosVectorsUsed(2)\n") + @test startswith(repr(st2), "StopWhenAllLanczosVectorsUsed(2)") + @test startswith(Manopt.status_summary(st2), "Stop when all 2 Lanczos vectors are used") @test !Manopt.indicates_convergence(st2) @test startswith( - repr(arcs2.sub_state), "# Solver state for `Manopt.jl`s Lanczos 
Iteration\n" + Manopt.status_summary(arcs2.sub_state; context = :default), + "# Solver state for `Manopt.jl`s Lanczos Iteration\n" ) @test get_reason(st2) == "" # manually trigger @@ -165,14 +156,8 @@ using LinearAlgebra: I, tr, Symmetric, diagm, eigvals, eigvecs @test isapprox(M, p_min, p4) # with a large η1 to trigger the bad model case once p5 = adaptive_regularization_with_cubics( - M, - f, - grad_f, - Hess_f; - θ = 0.5, - σ = 100.0, - η1 = 0.89, - retraction_method = PolarRetraction(), + M, f, grad_f, Hess_f; + θ = 0.5, σ = 100.0, η1 = 0.89, retraction_method = PolarRetraction(), ) @test isapprox(M, p_min, p5) @@ -198,24 +183,14 @@ using LinearAlgebra: I, tr, Symmetric, diagm, eigvals, eigvecs sub_problem = DefaultManoptProblem(M2, arcmo) sub_state = GradientDescentState( - M2; - p = zero_vector(M, p0), - stopping_criterion = StopAfterIteration(500) | - StopWhenGradientNormLess(1.0e-11) | - StopWhenFirstOrderProgress(0.1), + M2; p = zero_vector(M, p0), + stopping_criterion = StopAfterIteration(500) | StopWhenGradientNormLess(1.0e-11) | StopWhenFirstOrderProgress(0.1), ) q3 = copy(M, p0) adaptive_regularization_with_cubics!( - M, - mho, - q3; - θ = 0.5, - σ = 100.0, - retraction_method = PolarRetraction(), - sub_problem = sub_problem, - sub_state = sub_state, - return_objective = true, - return_state = true, + M, mho, q3; θ = 0.5, σ = 100.0, + retraction_method = PolarRetraction(), sub_problem = sub_problem, sub_state = sub_state, + return_objective = true, return_state = true, ) @test isapprox(M, p_min, q3) diff --git a/test/solvers/test_alternating_gradient.jl b/test/solvers/test_alternating_gradient.jl index fa6ef1848b..d9512f8acb 100644 --- a/test/solvers/test_alternating_gradient.jl +++ b/test/solvers/test_alternating_gradient.jl @@ -5,12 +5,7 @@ using Manopt, Manifolds, Test, RecursiveArrayTools M = Sphere(2) N = M × M data = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]] - function f(N, p) - return 1 / 2 * ( - distance(N[1], p[N, Val(1)], data[1])^2 + - 
distance(N[2], p[N, Val(2)], data[2])^2 - ) - end + f(N, p) = 1 / 2 * (distance(N[1], p[N, Val(1)], data[1])^2 + distance(N[2], p[N, Val(2)], data[2])^2) grad_f1(N, p) = -log(N[1], p[N, 1], data[1]) grad_f1!(N, X, p) = (X .= -log(N[1], p[N, 1], data[1])) grad_f2(N, p) = -log(N[2], p[N, 2], data[2]) @@ -22,9 +17,9 @@ using Manopt, Manifolds, Test, RecursiveArrayTools return X .*= -1 end p = ArrayPartition([0.0, 0.0, 1.0], [0.0, 0.0, 1.0]) + objf = ManifoldAlternatingGradientObjective(f, grad_f) @testset "Test gradient access" begin - objf = ManifoldAlternatingGradientObjective(f, grad_f) Pf = DefaultManoptProblem(N, objf) objv = ManifoldAlternatingGradientObjective(f, [grad_f1, grad_f2]) Pv = DefaultManoptProblem(N, objv) @@ -36,8 +31,10 @@ using Manopt, Manifolds, Test, RecursiveArrayTools f, [grad_f1!, grad_f2!]; evaluation = InplaceEvaluation() ) Pv! = DefaultManoptProblem(N, objv!) + X = zero_vector(N, p) + @test repr(Manopt.AlternatingGradientRule(X)) == "AlternatingGradientRule($X)" + for P in [Pf, Pv, Pf!, Pv!] 
- X = zero_vector(N, p) @test get_gradient(P, p)[N, 1] == grad_f(N, p)[N, 1] @test get_gradient(P, p)[N, 2] == grad_f(N, p)[N, 2] get_gradient!(P, X, p) @@ -52,31 +49,33 @@ using Manopt, Manifolds, Test, RecursiveArrayTools @test X[N, 2] == grad_f(N, p)[N, 2] end end + @testset "Test show/repr" begin + s1 = repr(objf) + @test startswith(s1, "ManifoldAlternatingGradientObjective(") + @test contains(s1, "AllocatingEvaluation()") + @test Manopt.status_summary(objf; context = :short) == s1 + @test startswith(Manopt.status_summary(objf; context = :inline), "An alternating gradient objective") + s2 = Manopt.status_summary(objf) + @test startswith(s2, "An alternating gradient objective") + @test contains(s2, "## Functions") + end @testset "Test high level interface" begin q = allocate(p) copyto!(N, q, p) q2 = allocate(p) copyto!(N, q2, p) q3 = alternating_gradient_descent( - N, - f, - [grad_f1!, grad_f2!], - p; - order_type = :Linear, - evaluation = InplaceEvaluation(), + N, f, [grad_f1!, grad_f2!], p; order_type = :Linear, evaluation = InplaceEvaluation(), ) r = alternating_gradient_descent!( - N, - f, - [grad_f1!, grad_f2!], - q; - order_type = :Linear, - evaluation = InplaceEvaluation(), - return_state = true, + N, f, [grad_f1!, grad_f2!], q; + order_type = :Linear, evaluation = InplaceEvaluation(), return_state = true, ) @test startswith( - repr(r), "# Solver state for `Manopt.jl`s Alternating Gradient Descent Solver" + Manopt.status_summary(r; context = :default), + "# Solver state for `Manopt.jl`s Alternating Gradient Descent Solver" ) + @test startswith(repr(r), "AlternatingGradientDescentState(; ") # r has the same message as the internal stepsize @test Manopt.get_message(r) == Manopt.get_message(r.stepsize) @test isapprox(N, q3, q) diff --git a/test/solvers/test_augmented_lagrangian.jl b/test/solvers/test_augmented_lagrangian.jl index 977adbafe4..b1fc6eb57b 100644 --- a/test/solvers/test_augmented_lagrangian.jl +++ b/test/solvers/test_augmented_lagrangian.jl @@ 
-40,7 +40,8 @@ using LinearAlgebra: I, tr @test Manopt.get_message(alms) == "" @test get_iterate(alms) == 2 .* p0 @test startswith( - repr(alms), "# Solver state for `Manopt.jl`s Augmented Lagrangian Method\n" + Manopt.status_summary(alms; context = :default), + "# Solver state for `Manopt.jl`s Augmented Lagrangian Method\n" ) @test Manopt.get_sub_problem(alms) === sp @test Manopt.get_sub_state(alms) === ss diff --git a/test/solvers/test_cma_es.jl b/test/solvers/test_cma_es.jl index 64a10d0563..c95ca98dea 100644 --- a/test/solvers/test_cma_es.jl +++ b/test/solvers/test_cma_es.jl @@ -34,7 +34,7 @@ flat_example(::AbstractManifold, p) = 0.0 @test griewank(M, p1) < 0.1 p1 = cma_es(M, griewank; σ = 10.0, rng = MersenneTwister(123)) - @test griewank(M, p1) < 0.2 + @test griewank(M, p1) < 0.25 p1 = [10.0, 10.0] cma_es!(M, griewank, p1; σ = 10.0, rng = MersenneTwister(123)) @@ -42,16 +42,12 @@ flat_example(::AbstractManifold, p) = 0.0 o = cma_es(M, griewank, [10.0, 10.0]; return_state = true) @test startswith( - repr(o), + Manopt.status_summary(o; context = :default), "# Solver state for `Manopt.jl`s Covariance Matrix Adaptation Evolutionary Strategy", ) o_d = cma_es( - M, - divergent_example, - [10.0, 10.0]; - σ = 10.0, - rng = MersenneTwister(123), + M, divergent_example, [10.0, 10.0]; σ = 10.0, rng = MersenneTwister(123), return_state = true, ) div_sc = only(get_active_stopping_criteria(o_d.stop)) @@ -60,12 +56,8 @@ flat_example(::AbstractManifold, p) = 0.0 @test startswith(repr(div_sc), "StopWhenPopulationDiverges(") o_d = cma_es( - M, - poorly_conditioned_example, - [10.0, 10.0]; - σ = 10.0, - rng = MersenneTwister(123), - return_state = true, + M, poorly_conditioned_example, [10.0, 10.0]; + σ = 10.0, rng = MersenneTwister(123), return_state = true, ) condcov_sc = only(get_active_stopping_criteria(o_d.stop)) @test condcov_sc isa StopWhenCovarianceIllConditioned @@ -73,14 +65,9 @@ flat_example(::AbstractManifold, p) = 0.0 @test startswith(repr(condcov_sc), 
"StopWhenCovarianceIllConditioned(") o_flat = cma_es( - M, - flat_example, - [10.0, 10.0]; - σ = 10.0, - stopping_criterion = StopAfterIteration(500) | - StopWhenBestCostInGenerationConstant{Float64}(5), - rng = MersenneTwister(123), - return_state = true, + M, flat_example, [10.0, 10.0]; σ = 10.0, + stopping_criterion = StopAfterIteration(500) | StopWhenBestCostInGenerationConstant{Float64}(5), + rng = MersenneTwister(123), return_state = true, ) flat_sc = only(get_active_stopping_criteria(o_flat.stop)) @test flat_sc isa StopWhenBestCostInGenerationConstant @@ -88,14 +75,9 @@ flat_example(::AbstractManifold, p) = 0.0 @test startswith(repr(flat_sc), "StopWhenBestCostInGenerationConstant(") o_flat = cma_es( - M, - flat_example, - [10.0, 10.0]; - σ = 10.0, - stopping_criterion = StopAfterIteration(500) | - StopWhenEvolutionStagnates(5, 100, 0.3), - rng = MersenneTwister(123), - return_state = true, + M, flat_example, [10.0, 10.0]; σ = 10.0, + stopping_criterion = StopAfterIteration(500) | StopWhenEvolutionStagnates(5, 100, 0.3), + rng = MersenneTwister(123), return_state = true, ) flat_sc = only(get_active_stopping_criteria(o_flat.stop)) @test flat_sc isa StopWhenEvolutionStagnates @@ -103,14 +85,9 @@ flat_example(::AbstractManifold, p) = 0.0 @test startswith(repr(flat_sc), "StopWhenEvolutionStagnates(") o_flat = cma_es( - M, - flat_example, - [10.0, 10.0]; - σ = 10.0, - stopping_criterion = StopAfterIteration(1000) | - StopWhenPopulationStronglyConcentrated(1.0e-5), - rng = MersenneTwister(12), - return_state = true, + M, flat_example, [10.0, 10.0]; σ = 10.0, + stopping_criterion = StopAfterIteration(1000) | StopWhenPopulationStronglyConcentrated(1.0e-5), + rng = MersenneTwister(12), return_state = true, ) flat_sc = only(get_active_stopping_criteria(o_flat.stop)) @test flat_sc isa StopWhenPopulationStronglyConcentrated @@ -118,14 +95,9 @@ flat_example(::AbstractManifold, p) = 0.0 @test startswith(repr(flat_sc), "StopWhenPopulationStronglyConcentrated(") o_flat = 
cma_es( - M, - flat_example, - [10.0, 10.0]; - σ = 10.0, - stopping_criterion = StopAfterIteration(500) | - StopWhenPopulationCostConcentrated(1.0e-5, 5), - rng = MersenneTwister(123), - return_state = true, + M, flat_example, [10.0, 10.0]; σ = 10.0, + stopping_criterion = StopAfterIteration(500) | StopWhenPopulationCostConcentrated(1.0e-5, 5), + rng = MersenneTwister(123), return_state = true, ) flat_sc = only(get_active_stopping_criteria(o_flat.stop)) @test flat_sc isa StopWhenPopulationCostConcentrated @@ -134,14 +106,9 @@ flat_example(::AbstractManifold, p) = 0.0 # test handling of negative covariance matrix eigenvalues @test_warn "Covariance matrix has nonpositive eigenvalues" o_flat = cma_es( - M, - flat_example, - [10.0, 10.0]; - σ = 10.0, - stopping_criterion = StopAfterIteration(10000) | - StopWhenPopulationStronglyConcentrated(1.0e-14), - rng = MersenneTwister(12), - return_state = true, + M, flat_example, [10.0, 10.0]; σ = 10.0, + stopping_criterion = StopAfterIteration(10000) | StopWhenPopulationStronglyConcentrated(1.0e-14), + rng = MersenneTwister(12), return_state = true, ) flat_sc = only(get_active_stopping_criteria(o_flat.stop)) @test flat_sc isa StopWhenPopulationStronglyConcentrated @@ -150,7 +117,6 @@ flat_example(::AbstractManifold, p) = 0.0 end @testset "Spherical CMA-ES" begin M = Sphere(2) - p1 = cma_es(M, griewank, [0.0, 1.0, 0.0]; σ = 1.0, rng = MersenneTwister(123)) @test griewank(M, p1) < 0.17 end diff --git a/test/solvers/test_conjugate_gradient.jl b/test/solvers/test_conjugate_gradient.jl index c7f29ff1c6..1203b3d0ff 100644 --- a/test/solvers/test_conjugate_gradient.jl +++ b/test/solvers/test_conjugate_gradient.jl @@ -23,14 +23,11 @@ using ManifoldDiff: grad_distance dU = SteepestDescentCoefficient() s1 = ConjugateGradientDescentState( M; - p = x0, - stopping_criterion = sC, - stepsize = s, - coefficient = dU, - retraction_method = retr, - vector_transport_method = vtm, + p = x0, stopping_criterion = sC, stepsize = s, + coefficient = 
dU, retraction_method = retr, vector_transport_method = vtm, initial_gradient = zero_vector(M, x0), ) + @test startswith(repr(s1), "ConjugateGradientDescentState(; ") @test s1.coefficient(dmp, s1, 1) == 0 @test default_stepsize(M, typeof(s1)) isa Manopt.ManifoldDefaultsFactory{Manopt.ArmijoLinesearchStepsize} @test Manopt.get_message(s1) == "" @@ -38,12 +35,8 @@ using ManifoldDiff: grad_distance dU = Manopt.ConjugateDescentCoefficient() s2 = ConjugateGradientDescentState( M; - p = x0, - stopping_criterion = sC, - stepsize = s, - coefficient = dU, - retraction_method = retr, - vector_transport_method = vtm, + p = x0, stopping_criterion = sC, stepsize = s, coefficient = dU, + retraction_method = retr, vector_transport_method = vtm, initial_gradient = zero_vector(M, x0), ) s2.X = grad_1 @@ -213,7 +206,7 @@ using ManifoldDiff: grad_distance ) @test get_solver_result(x_opt2) == x_opt @test startswith( - repr(x_opt2), + Manopt.status_summary(x_opt2; context = :default), "# Solver state for `Manopt.jl`s Conjugate Gradient Descent Solver", ) Random.seed!(23) diff --git a/test/solvers/test_conjugate_residual.jl b/test/solvers/test_conjugate_residual.jl index 741056621e..24a62b463b 100644 --- a/test/solvers/test_conjugate_residual.jl +++ b/test/solvers/test_conjugate_residual.jl @@ -18,4 +18,15 @@ using Manifolds, Manopt, Test @test norm(ps - pT) < 3.0e-15 @test norm(pT2 - pT) < 3.0e-15 @test get_cost(TpM, slso, pT) < 5.0e-15 + s = repr(slso) + @test startswith(s, "SymmetricLinearSystemObjective") + s2 = Manopt.status_summary(slso) + @test startswith(s2, "An objetcive modelling a symmetric linear system") + cgrs = conjugate_residual(TpM, slso, X0; return_state = true) + @test startswith(Manopt.status_summary(cgrs), "# Solver state for `Manopt.jl`s Conjugate Residual Method") + @test startswith(repr(cgrs), "ConjugateResidualState(; ") + + scs = StopWhenRelativeResidualLess(1.0, 0.1) + @test repr(scs) == "StopWhenRelativeResidualLess(1.0, 0.1)" + @test 
startswith(Manopt.status_summary(scs), "A stopping criterion to stop when the relative residual is less") end diff --git a/test/solvers/test_convex_bundle_method.jl b/test/solvers/test_convex_bundle_method.jl index 4be4fae38d..416d175ac5 100644 --- a/test/solvers/test_convex_bundle_method.jl +++ b/test/solvers/test_convex_bundle_method.jl @@ -25,6 +25,7 @@ using Manopt: estimate_sectional_curvature curvature_cbms = ConvexBundleMethodState(M; p = p0) @test ω ≤ curvature_cbms.k_min @test Ω ≥ curvature_cbms.k_max + @test startswith(repr(curvature_cbms), "ConvexBundleMethodState(") end @testset "Close Point Function" begin @@ -34,28 +35,22 @@ using Manopt: estimate_sectional_curvature end cbms = ConvexBundleMethodState( - M; - p = p0, - atol_λ = 1.0e0, - diameter = diameter, + M; p = p0, atol_λ = 1.0e0, diameter = diameter, domain = (M, q) -> distance(M, q, p0) < diameter / 2 ? true : false, - k_max = Ω, - k_min = ω, + k_max = Ω, k_min = ω, stepsize = Manopt.DomainBackTrackingStepsize(M; contraction_factor = 0.975), stopping_criterion = StopAfterIteration(200), ) + # A state can not be created with just a manifold and a substate, then the problem is missing + @test_throws ErrorException ConvexBundleMethodState(M, Manopt.Test.DummyState()) @test get_iterate(cbms) == p0 cbms.X = [1.0, 0.0, 0.0, 0.0, 0.0] @testset "Special Stopping Criteria" begin sc1 = StopWhenLagrangeMultiplierLess(1.0e-8) - @test startswith( - repr(sc1), "StopWhenLagrangeMultiplierLess([1.0e-8]; mode=:estimate)\n" - ) + @test startswith(repr(sc1), "StopWhenLagrangeMultiplierLess([1.0e-8]; mode=:estimate)") sc2 = StopWhenLagrangeMultiplierLess([1.0e-8, 1.0e-8]; mode = :both) - @test startswith( - repr(sc2), "StopWhenLagrangeMultiplierLess([1.0e-8, 1.0e-8]; mode=:both)\n" - ) + @test startswith(repr(sc2), "StopWhenLagrangeMultiplierLess([1.0e-8, 1.0e-8]; mode=:both)") end @testset "Allocating Subgradient" begin @@ -86,18 +81,8 @@ using Manopt: estimate_sectional_curvature @testset "Domain and Null 
Conditions" begin @test _domain_condition(M, p, p0, 1.0, 1.0, cbms.domain) @test !_null_condition( - mp, - M, - p, - p0, - cbms.X, - cbms.g, - cbms.vector_transport_method, - cbms.inverse_retraction_method, - cbms.m, - 1.0, - cbms.ξ, - cbms.ϱ, + mp, M, p, p0, cbms.X, cbms.g, cbms.vector_transport_method, + cbms.inverse_retraction_method, cbms.m, 1.0, cbms.ξ, cbms.ϱ, ) end @@ -105,17 +90,10 @@ using Manopt: estimate_sectional_curvature io = IOBuffer() ds = DebugStepsize(; io = io) bms2 = convex_bundle_method( - M, - f, - ∂f, - p0; - diameter = diameter, + M, f, ∂f, p0; diameter = diameter, domain = (M, q) -> distance(M, q, p0) < diameter / 2 ? true : false, - k_max = Ω, - k_min = ω, - stopping_criterion = StopAfterIteration(200), - return_state = true, - debug = [], + k_max = Ω, k_min = ω, + stopping_criterion = StopAfterIteration(200), return_state = true, debug = [], ) p_star2 = get_solver_result(bms2) @test get_subgradient(bms2) == -∂f(M, p_star2) @@ -134,12 +112,12 @@ using Manopt: estimate_sectional_curvature @testset "Warnings" begin dw1 = DebugWarnIfLagrangeMultiplierIncreases(:Once; tol = 0.0) - @test repr(dw1) == "DebugWarnIfLagrangeMultiplierIncreases(; tol=\"0.0\")" + @test repr(dw1) == "DebugWarnIfLagrangeMultiplierIncreases(:Once; tol=\"0.0\")" cbms.ξ = 101.0 @test_logs (:warn,) dw1(mp, cbms, 1) dw2 = DebugWarnIfLagrangeMultiplierIncreases(:Once; tol = 1.0e1) dw2.old_value = -101.0 - @test repr(dw2) == "DebugWarnIfLagrangeMultiplierIncreases(; tol=\"10.0\")" + @test repr(dw2) == "DebugWarnIfLagrangeMultiplierIncreases(:Once; tol=\"10.0\")" cbms.ξ = -1.0 @test_logs (:warn,) (:warn,) dw2(mp, cbms, 1) end @@ -171,18 +149,12 @@ using Manopt: estimate_sectional_curvature @test_throws MethodError get_gradient(mp, cbms.p) @test_throws MethodError get_proximal_map(mp, 1.0, cbms.p, 1) s2 = convex_bundle_method( - M, - f, - ∂f!, - copy(p0); - diameter = diameter, + M, f, ∂f!, copy(p0); diameter = diameter, domain = (M, q) -> distance(M, q, p0) < diameter / 2 ? 
true : false, k_max = Ω, stopping_criterion = StopAfterIteration(200), evaluation = InplaceEvaluation(), - sub_problem = (convex_bundle_method_subsolver!), - return_state = true, - debug = [], + sub_problem = (convex_bundle_method_subsolver!), return_state = true, debug = [], ) p_star2 = get_solver_result(s2) @test f(M, p_star2) <= f(M, p0) @@ -204,18 +176,15 @@ using Manopt: estimate_sectional_curvature p0 = p1 cbm_s = convex_bundle_method(M, f, ∂f, p0; k_max = 1.0, k_min = 1.0, return_state = true) @test startswith( - repr(cbm_s), "# Solver state for `Manopt.jl`s Convex Bundle Method\n" + Manopt.status_summary(cbm_s; context = :default), + "# Solver state for `Manopt.jl`s Convex Bundle Method\n" ) q = get_solver_result(cbm_s) m = median(M, data) @test distance(M, q, m) < 2.0e-2 #with default parameters this is not very precise # test the other stopping criterion mode q2 = convex_bundle_method( - M, - f, - ∂f, - p0; - k_max = 1.0, + M, f, ∂f, p0; k_max = 1.0, stopping_criterion = StopWhenLagrangeMultiplierLess([1.0e-6, 1.0e-6]; mode = :both), ) @test distance(M, q2, m) < 2.0e-2 @@ -223,13 +192,7 @@ using Manopt: estimate_sectional_curvature diam = π / 4 domf(M, p) = distance(M, p, p0) < diam / 2 ? true : false q2 = convex_bundle_method( - M, - f, - ∂f, - p0; - k_max = 1.0, - diameter = diam, - domain = domf, + M, f, ∂f, p0; k_max = 1.0, diameter = diam, domain = domf, stopping_criterion = StopAfterIteration(3), ) end @@ -246,11 +209,7 @@ using Manopt: estimate_sectional_curvature return -log(M, q, p) / max(10 * eps(Float64), distance(M, p, q)) end cbms = ConvexBundleMethodState( - M, - convex_bundle_method_subsolver; - p = q, - k_max = 1.0, - k_min = 1.0, + M, convex_bundle_method_subsolver; p = q, k_max = 1.0, k_min = 1.0, stepsize = DomainBackTrackingStepsize(M; contraction_factor = 0.975), stopping_criterion = StopAfterIteration(20), ) @@ -268,15 +227,13 @@ using Manopt: estimate_sectional_curvature @test nsbt(mp, cbms, 1) < 1.0e-15 # Expected value? 
# nsbt show/status - @test startswith(repr(nsbt), "NullStepBackTracking(;\n") - @test startswith(Manopt.status_summary(nsbt), "NullStepBackTracking(;\n") - @test endswith(Manopt.status_summary(nsbt), "e-16") + @test startswith(repr(nsbt), "NullStepBackTrackingStepsize(;") + @test startswith(Manopt.status_summary(nsbt), "A null step backtracking stepsize") # Test show/summary on domainbt dbt = DomainBackTrackingStepsize(M; contraction_factor = 0.975) @test get_initial_stepsize(dbt) == 1 - @test startswith(repr(dbt), "DomainBackTracking(;\n") - @test startswith(Manopt.status_summary(dbt), "DomainBackTracking(;\n") - @test endswith(Manopt.status_summary(dbt), "of 1.0") + @test startswith(repr(dbt), "DomainBackTrackingStepsize(;") + @test startswith(Manopt.status_summary(dbt), "A domain backtracking stepsize") # a newly setup stepsize has now message (yet) @test Manopt.get_message(dbt) == "" end @@ -293,14 +250,8 @@ using Manopt: estimate_sectional_curvature diam = π / 2 domf(M, p) = distance(M, p, p) < diam / 2 ? 
true : false cbms = ConvexBundleMethodState( - M, - convex_bundle_method_subsolver; - diameter = diam, - domain = domf, - bundle_cap = 3, - p = q, - k_max = 1.0, - k_min = 1.0, + M, convex_bundle_method_subsolver; + diameter = diam, domain = domf, bundle_cap = 3, p = q, k_max = 1.0, k_min = 1.0, stepsize = DomainBackTrackingStepsize(M; contraction_factor = 0.975), stopping_criterion = StopAfterIteration(20), ) @@ -317,14 +268,10 @@ using Manopt: estimate_sectional_curvature # Ensure the first element in the bundle is not equal to p_last_serious cbms.p_last_serious .= [0.0, 1.0, 0.0] - step_solver!(mp, cbms, 1) - @test length(cbms.bundle) == cbms.bundle_cap @test cbms.bundle[1][1] ≠ cbms.p_last_serious @test length(cbms.linearization_errors) == length(cbms.bundle) @test length(cbms.λ) == length(cbms.bundle) - - # step_solver!(mp, cbms, 2) end end diff --git a/test/solvers/test_cyclic_proximal_point.jl b/test/solvers/test_cyclic_proximal_point.jl index 0e792f21f8..e81e2de241 100644 --- a/test/solvers/test_cyclic_proximal_point.jl +++ b/test/solvers/test_cyclic_proximal_point.jl @@ -78,8 +78,10 @@ using ManifoldDiff: prox_distance, prox_distance! 
s2 = get_solver_result(r) @test isapprox(N, s1, s2) @test startswith( - repr(r), "# Solver state for `Manopt.jl`s Cyclic Proximal Point Algorithm" + Manopt.status_summary(r; context = :default), + "# Solver state for `Manopt.jl`s Cyclic Proximal Point Algorithm" ) + @test startswith(repr(r), "CyclicProximalPointState(; ") @testset "Caching" begin r2 = cyclic_proximal_point( N, diff --git a/test/solvers/test_difference_of_convex.jl b/test/solvers/test_difference_of_convex.jl index 11f1323266..03d4636813 100644 --- a/test/solvers/test_difference_of_convex.jl +++ b/test/solvers/test_difference_of_convex.jl @@ -42,6 +42,8 @@ import Manifolds: inner @test Manopt.get_message(dcs) == "" dcsc = DifferenceOfConvexState(M, f) @test dcsc.sub_state isa Manopt.ClosedFormSubSolverState + # Test that the combination Manifold+State errors (since a problem is required) + @test_throws ErrorException DifferenceOfConvexState(M, Manopt.Test.DummyState()) set_iterate!(dcs, M, p1) @test dcs.p == p1 @@ -52,9 +54,7 @@ import Manifolds: inner dcppa_sub_cost = ProximalDCCost(g, copy(M, p0), 1.0) dcppa_sub_grad = ProximalDCGrad(grad_g, copy(M, p0), 1.0) - dcppa_sub_grad! = ProximalDCGrad( - grad_g!, copy(M, p0), 1.0; evaluation = InplaceEvaluation() - ) + dcppa_sub_grad! 
= ProximalDCGrad(grad_g!, copy(M, p0), 1.0; evaluation = InplaceEvaluation()) Y1 = dcppa_sub_grad!(M, p0) Y2 = similar(Y1) dcppa_sub_grad(M, Y2, p0) @@ -63,12 +63,8 @@ import Manifolds: inner dcppa_sub_objective = ManifoldGradientObjective(dcppa_sub_cost, dcppa_sub_grad) dcppa_sub_problem = DefaultManoptProblem(M, dcppa_sub_objective) dcppa_sub_state = GradientDescentState(M; p = copy(M, p0)) - - dcps = DifferenceOfConvexProximalState( #Initialize with random point - M, - dcppa_sub_problem, - dcppa_sub_state, - ) + # Initialize with random point + dcps = DifferenceOfConvexProximalState(M, dcppa_sub_problem, dcppa_sub_state) set_iterate!(dcps, M, p1) @test dcps.p == p1 set_gradient!(dcps, M, p1, X1) @@ -116,8 +112,10 @@ import Manifolds: inner M, f, g, grad_h, p0; grad_g = grad_g, gradient = grad_f, return_state = true ) @test startswith( - repr(s1), "# Solver state for `Manopt.jl`s Difference of Convex Algorithm\n" + Manopt.status_summary(s1), "# Solver state for `Manopt.jl`s Difference of Convex Algorithm\n" ) + @test_throws ErrorException DifferenceOfConvexProximalState(M, Manopt.Test.DummyState()) + @test startswith(repr(s1), "DifferenceOfConvexState(DefaultManoptProblem(") p3 = get_solver_result(s1) @test Manopt.get_message(s1) == "" # no message in last step @test isapprox(M, p1, p2) @@ -139,21 +137,17 @@ import Manifolds: inner p5b = difference_of_convex_proximal_point(M, grad_h; g = g, grad_g = grad_g) # using gradient descent p5c = difference_of_convex_proximal_point( - M, - grad_h, - p0; - g = g, - grad_g = grad_g, - sub_hess = nothing, + M, grad_h, p0; g = g, grad_g = grad_g, sub_hess = nothing, stopping_criterion = StopAfterIteration(10), # is not that stable ) s2 = difference_of_convex_proximal_point( M, grad_h, p0; g = g, grad_g = grad_g, gradient = grad_f, return_state = true ) @test startswith( - repr(s2), + Manopt.status_summary(s2; context = :default), "# Solver state for `Manopt.jl`s Difference of Convex Proximal Point Algorithm\n", ) + @test 
startswith(repr(s2), "DifferenceOfConvexProximalState(DefaultManoptProblem(") p6 = get_solver_result(s2) @test Manopt.get_message(s2) == "" diff --git a/test/solvers/test_exact_penalty.jl b/test/solvers/test_exact_penalty.jl index 1fb880bf51..c645f9ebc4 100644 --- a/test/solvers/test_exact_penalty.jl +++ b/test/solvers/test_exact_penalty.jl @@ -24,13 +24,8 @@ using LinearAlgebra: I, tr ) sol_lqh3 = copy(M, p0) exact_penalty_method!( - M, - f, - grad_f, - sol_lqh3; - g = g, - grad_g = grad_g, - smoothing = LinearQuadraticHuber(), + M, f, grad_f, sol_lqh3; + g = g, grad_g = grad_g, smoothing = LinearQuadraticHuber(), gradient_inequality_range = NestedPowerRepresentation(), ) a_tol_emp = 8.0e-2 @@ -46,10 +41,13 @@ using LinearAlgebra: I, tr @test Manopt.get_message(epms) == "" set_iterate!(epms, M, 2 .* p0) @test get_iterate(epms) == 2 .* p0 - @test startswith(repr(epms), "# Solver state for `Manopt.jl`s Exact Penalty Method\n") + @test startswith(Manopt.status_summary(epms; context = :default), "# Solver state for `Manopt.jl`s Exact Penalty Method\n") + @test startswith(repr(epms), "ExactPenaltyMethodState($(dmp)") # With dummy closed form solution epmsc = ExactPenaltyMethodState(M, f) @test epmsc.sub_state isa Manopt.ClosedFormSubSolverState + # that is errors with just Manifold + State + @test_throws ErrorException ExactPenaltyMethodState(M, Manopt.Test.DummyState()) @testset "Numbers" begin Me = Euclidean() fe(M, p) = (p + 5)^2 @@ -57,13 +55,8 @@ using LinearAlgebra: I, tr ge(M, p) = -p # inequality constraint p ≥ 0 grad_ge(M, p) = -1 s = exact_penalty_method( - Me, - fe, - grad_fe, - 4.0; - g = ge, - grad_g = grad_ge, - stopping_criterion = StopAfterIteration(20), + Me, fe, grad_fe, 4.0; + g = ge, grad_g = grad_ge, stopping_criterion = StopAfterIteration(20), return_state = true, ) q = get_solver_result(s)[] diff --git a/test/solvers/test_gradient_descent.jl b/test/solvers/test_gradient_descent.jl index 690c424981..15302f1175 100644 --- 
a/test/solvers/test_gradient_descent.jl +++ b/test/solvers/test_gradient_descent.jl @@ -161,16 +161,13 @@ using ManifoldDiff: grad_distance n5 = copy(M, pts[1]) r = gradient_descent!(M, f, grad_f, n5; return_state = true) @test isapprox(M, n5, n2) - @test startswith(repr(r), "# Solver state for `Manopt.jl`s Gradient Descent") + @test startswith(Manopt.status_summary(r; context = :default), "# Solver state for `Manopt.jl`s Gradient Descent") # State and a count objective, putting stats behind print n6 = gradient_descent( M, f, grad_f, pts[1]; - count = [:Gradient], - return_objective = true, - return_state = true, + count = [:Gradient], return_objective = true, return_state = true, ) - @test stopped_at(n6[2]) > 0 - @test repr(n6) == "$(n6[2])\n\n$(n6[1])" + @test Manopt.status_summary(n6; context = :default) == "$(Manopt.status_summary(n6[2]; context = :default))\n\n$(Manopt.status_summary(n6[1]; context = :default))" end @testset "Tutorial mode" begin M = Sphere(2) diff --git a/test/solvers/test_interior_point_Newton.jl b/test/solvers/test_interior_point_Newton.jl index 84853cac75..f7e92262b2 100644 --- a/test/solvers/test_interior_point_Newton.jl +++ b/test/solvers/test_interior_point_Newton.jl @@ -4,9 +4,11 @@ using Manifolds, Manopt, LinearAlgebra, Random, Test, RecursiveArrayTools @testset "StepsizeState" begin M = Manifolds.Sphere(2) a = StepsizeState(M) - b = StepsizeState(a.p, a.X) + b = StepsizeState(; p = a.p, X = a.X) @test a.p === b.p @test a.X === b.X + @test startswith(repr(b), "StepsizeState(; ") + @test startswith(Manopt.status_summary(b), "A state for a stepsize") end @testset "A solver run on the Sphere" begin # We can take a look at debug prints of one run and plot the result @@ -35,37 +37,18 @@ using Manifolds, Manopt, LinearAlgebra, Random, Test, RecursiveArrayTools p_opt = [0.0, 0.0, 1.0] record = [:Iterate] dbg = [ - :Iteration, - " ", - :Cost, - " ", - :Stepsize, - " ", - :Change, - " ", - :Feasibility, - "\n", - :Stop, - 10, - 
DebugMessages(:Info, :Always), + :Iteration, " ", :Cost, " ", :Stepsize, " ", :Change, " ", :Feasibility, "\n", + :Stop, 10, DebugMessages(:Info, :Always), ] sc = StopAfterIteration(800) | StopWhenKKTResidualLess(1.0e-2) # (a) classical call w/ recording res = interior_point_Newton( - M, - f, - grad_f, - Hess_f, - p_0; - g = g, - grad_g = grad_g, - Hess_g = Hess_g, + M, f, grad_f, Hess_f, p_0; + g = g, grad_g = grad_g, Hess_g = Hess_g, stopping_criterion = sc, - debug = _debug ? dbg : [], - record = _debug_iterates_plot ? record : [], - return_state = true, - return_objective = true, + debug = _debug ? dbg : [], record = _debug_iterates_plot ? record : [], + return_state = true, return_objective = true, ) q = get_solver_result(res) @@ -74,15 +57,8 @@ using Manifolds, Manopt, LinearAlgebra, Random, Test, RecursiveArrayTools # (b) inplace call q2 = copy(M, p_0) interior_point_Newton!( - M, - f, - grad_f, - Hess_f, - q2; - g = g, - grad_g = grad_g, - Hess_g = Hess_g, - stopping_criterion = sc, + M, f, grad_f, Hess_f, q2; + g = g, grad_g = grad_g, Hess_g = Hess_g, stopping_criterion = sc, ) @test q == q2 diff --git a/test/solvers/test_mesh_adaptive_direct_search.jl b/test/solvers/test_mesh_adaptive_direct_search.jl index 703f60ffd8..9e2d2a6c93 100644 --- a/test/solvers/test_mesh_adaptive_direct_search.jl +++ b/test/solvers/test_mesh_adaptive_direct_search.jl @@ -9,15 +9,10 @@ using Manifolds, Manopt, Test, LinearAlgebra, Random p0 = [1.0 0.0; 0.0 1.0] f(M, p) = opnorm(B - A * p) Random.seed!(42) - s = mesh_adaptive_direct_search( - M, - f, - p0; - # debug=[:Iteration, :Cost, " ", :poll_size, " ", :mesh_size, " ", :Stop, "\n"], - return_state = true, - ) + s = mesh_adaptive_direct_search(M, f, p0; return_state = true) @test distance(M, get_solver_result(s), W) < 1.0e-9 @test startswith(get_reason(s), "The algorithm computed a poll step size") + @test startswith(repr(s), "MeshAdaptiveDirectSearchState(; ") # # # A bit larger example inplace diff --git 
a/test/solvers/test_particle_swarm.jl b/test/solvers/test_particle_swarm.jl index 0b6375114c..eadc2d6b7b 100644 --- a/test/solvers/test_particle_swarm.jl +++ b/test/solvers/test_particle_swarm.jl @@ -12,8 +12,10 @@ using Random Random.seed!(35) o = particle_swarm(M, f, p1; return_state = true) @test startswith( - repr(o), "# Solver state for `Manopt.jl`s Particle Swarm Optimization Algorithm" + Manopt.status_summary(o; context = :default), + "# Solver state for `Manopt.jl`s Particle Swarm Optimization Algorithm\n" ) + @test startswith(repr(o), "ParticleSwarmState(;") g = get_solver_result(o) initF = min(f.(Ref(M), p1)...) diff --git a/test/solvers/test_primal_dual_semismooth_Newton.jl b/test/solvers/test_primal_dual_semismooth_Newton.jl index 71445c89d9..bb6b4bbb82 100644 --- a/test/solvers/test_primal_dual_semismooth_Newton.jl +++ b/test/solvers/test_primal_dual_semismooth_Newton.jl @@ -46,47 +46,20 @@ using ManifoldDiff: differential_shortest_geodesic_startpoint, prox_distance ξ0 = zero_vector(M, m) s = primal_dual_semismooth_Newton( - M, - N, - f, - x0, - ξ0, - m, - n, - prox_f, - Dprox_F, - prox_g_dual, - Dprox_G_dual, - DΛ, - adjoint_DΛ; - primal_stepsize = σ, - dual_stepsize = τ, - return_state = true, + M, N, f, x0, ξ0, m, n, prox_f, Dprox_F, prox_g_dual, Dprox_G_dual, DΛ, adjoint_DΛ; + primal_stepsize = σ, dual_stepsize = τ, return_state = true, ) @test startswith( - repr(s), "# Solver state for `Manopt.jl`s primal dual semismooth Newton" + Manopt.status_summary(s; context = :default), + "# Solver state for `Manopt.jl`s primal dual semismooth Newton" ) y = get_solver_result(s) @test x_hat ≈ y atol = 2 * 1.0e-7 update_dual_base(p, o, i) = o.n o2 = primal_dual_semismooth_Newton( - M, - N, - f, - x0, - ξ0, - m, - n, - prox_f, - Dprox_F, - prox_g_dual, - Dprox_G_dual, - DΛ, - adjoint_DΛ; - primal_stepsize = σ, - dual_stepsize = τ, - update_dual_base = update_dual_base, + M, N, f, x0, ξ0, m, n, prox_f, Dprox_F, prox_g_dual, Dprox_G_dual, DΛ, adjoint_DΛ; + 
primal_stepsize = σ, dual_stepsize = τ, update_dual_base = update_dual_base, return_state = false, ) y2 = o2 diff --git a/test/solvers/test_projected_gradient.jl b/test/solvers/test_projected_gradient.jl index f1ad16aa55..0c708aea08 100644 --- a/test/solvers/test_projected_gradient.jl +++ b/test/solvers/test_projected_gradient.jl @@ -81,16 +81,12 @@ using Manifolds, Manopt, Random, Test ) @test isapprox(M, mean_pg_1, mean_pg_3) @test startswith( - repr(st), "# Solver state for `Manopt.jl`s Projected Gradient Method\n" + Manopt.status_summary(st; context = :default), + "# Solver state for `Manopt.jl`s Projected Gradient Method\n" ) stop_when_stationary = st.stop.criteria[2] @test Manopt.indicates_convergence(stop_when_stationary) - @test repr(stop_when_stationary) == - "StopWhenProjectedGradientStationary($(stop_when_stationary.threshold))\n $( - Manopt.status_summary( - stop_when_stationary - ) - )" + @test repr(stop_when_stationary) == "StopWhenProjectedGradientStationary($(stop_when_stationary.threshold))" @test length(get_reason(stop_when_stationary)) > 0 @test length(get_reason(StopWhenProjectedGradientStationary(M, 1.0e-7))) == 0 end diff --git a/test/solvers/test_proximal_bundle_method.jl b/test/solvers/test_proximal_bundle_method.jl index d4d29a3775..f0b98384f6 100644 --- a/test/solvers/test_proximal_bundle_method.jl +++ b/test/solvers/test_proximal_bundle_method.jl @@ -7,21 +7,18 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve p0 = [0.0, 0.0, 0.0, 0.0, -1.0] pbms = ProximalBundleMethodState(M; p = p0, stopping_criterion = StopAfterIteration(200)) @test get_iterate(pbms) == p0 - + # Check that Manifold+State is erroring since a problem is missing + @test_throws ErrorException ProximalBundleMethodState(M, Manopt.Test.DummyState()) pbms.X = [1.0, 0.0, 0.0, 0.0, 0.0] @testset "Special Stopping Criteria" begin sc1 = StopWhenLagrangeMultiplierLess(1.0e-8) - @test startswith( - repr(sc1), 
"StopWhenLagrangeMultiplierLess([1.0e-8]; mode=:estimate)\n" - ) + @test startswith(repr(sc1), "StopWhenLagrangeMultiplierLess([1.0e-8]; mode=:estimate)") @test get_reason(sc1) == "" # Trigger manually sc1.at_iteration = 2 @test length(get_reason(sc1)) > 0 sc2 = StopWhenLagrangeMultiplierLess([1.0e-8, 1.0e-8]; mode = :both) - @test startswith( - repr(sc2), "StopWhenLagrangeMultiplierLess([1.0e-8, 1.0e-8]; mode=:both)\n" - ) + @test startswith(repr(sc2), "StopWhenLagrangeMultiplierLess([1.0e-8, 1.0e-8]; mode=:both)") @test get_reason(sc2) == "" # Trigger manually sc2.at_iteration = 2 @@ -47,13 +44,8 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve @test_throws MethodError get_gradient(mp, pbms.p) @test_throws MethodError get_proximal_map(mp, 1.0, pbms.p, 1) pbms2 = proximal_bundle_method( - M, - f, - ∂f, - p0; - stopping_criterion = StopAfterIteration(200), - return_state = true, - debug = [], + M, f, ∂f, p0; + stopping_criterion = StopAfterIteration(200), return_state = true, debug = [], ) p_star2 = get_solver_result(pbms2) @test get_subgradient(pbms2) == -∂f(M, p_star2) @@ -63,12 +55,12 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve # Test warnings dw1 = DebugWarnIfLagrangeMultiplierIncreases(:Once; tol = 0.0) dw1(mp, pbms, 1) #do one normal run. 
- @test repr(dw1) == "DebugWarnIfLagrangeMultiplierIncreases(; tol=\"0.0\")" + @test repr(dw1) == "DebugWarnIfLagrangeMultiplierIncreases(:Once; tol=\"0.0\")" pbms.ν = 101.0 @test_logs (:warn,) dw1(mp, pbms, 2) dw2 = DebugWarnIfLagrangeMultiplierIncreases(:Once; tol = 1.0e1) dw2.old_value = -101.0 - @test repr(dw2) == "DebugWarnIfLagrangeMultiplierIncreases(; tol=\"10.0\")" + @test repr(dw2) == "DebugWarnIfLagrangeMultiplierIncreases(:Once; tol=\"10.0\")" pbms.ν = -1.0 @test_logs (:warn,) (:warn,) dw2(mp, pbms, 1) end @@ -98,10 +90,7 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve @test_throws MethodError get_gradient(mp, pbms.p) @test_throws MethodError get_proximal_map(mp, 1.0, pbms.p, 1) s2 = proximal_bundle_method( - M, - f, - ∂f!, - copy(p0); + M, f, ∂f!, copy(p0); stopping_criterion = StopAfterIteration(200), evaluation = InplaceEvaluation(), sub_state = AllocatingEvaluation(), # keep the default allocating subsolver here @@ -127,8 +116,10 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve p0 = p1 pbm_s = proximal_bundle_method(M, f, ∂f, p0; return_state = true) @test startswith( - repr(pbm_s), "# Solver state for `Manopt.jl`s Proximal Bundle Method\n" + Manopt.status_summary(pbm_s; context = :default), + "# Solver state for `Manopt.jl`s Proximal Bundle Method\n" ) + @test startswith(repr(pbm_s), "ProximalBundleMethodState(") q = get_solver_result(pbm_s) # with default parameters for both median and proximal bundle, this is not very precise m = median(M, data) @@ -138,10 +129,7 @@ import Manopt: proximal_bundle_method_subsolver, proximal_bundle_method_subsolve @test norm(M, q, get_subgradient(pbm_s)) < 1.0e-4 # test the other stopping criterion mode q2 = proximal_bundle_method( - M, - f, - ∂f, - p0; + M, f, ∂f, p0; stopping_criterion = StopWhenLagrangeMultiplierLess([1.0e-8, 1.0e-8]; mode = :both), ) @test distance(M, q2, m) < 2 * 1.0e-3 @@ -155,14 +143,8 @@ import Manopt: 
proximal_bundle_method_subsolver, proximal_bundle_method_subsolve return X end proximal_bundle_method!( - M, - f, - ∂f!, - p_size; - bundle_size = 2, - evaluation = InplaceEvaluation(), - stopping_criterion = StopAfterIteration(200), - sub_problem = (proximal_bundle_method_subsolver!), + M, f, ∂f!, p_size; bundle_size = 2, stopping_criterion = StopAfterIteration(200), + evaluation = InplaceEvaluation(), sub_problem = (proximal_bundle_method_subsolver!), ) end @testset "Trigger the case where the bundle is not transported" begin diff --git a/test/solvers/test_proximal_gradient_method.jl b/test/solvers/test_proximal_gradient_method.jl index 5f1ce3bd0b..f0e3e34bab 100644 --- a/test/solvers/test_proximal_gradient_method.jl +++ b/test/solvers/test_proximal_gradient_method.jl @@ -5,19 +5,18 @@ using Manopt, Manifolds, Test, ManifoldDiff p = [0.0, 0.0, 1.0] p0 = [1.0, 0.0, √2] pgms = ProximalGradientMethodState( - M; - p = p0, - stepsize = Manopt.ProximalGradientMethodBacktrackingStepsize( - M; initial_stepsize = 1.0, strategy = :convex - ), + M; p = p0, + stepsize = Manopt.ProximalGradientMethodBacktrackingStepsize(M; initial_stepsize = 1.0, strategy = :convex), stopping_criterion = StopAfterIteration(200), ) @test get_iterate(pgms) == p0 - pgms.X = [1.0, 0.0, 0.0] + @test startswith(repr(pgms), "ProximalGradientMethodState(") + # Manifold+substate errors, since a sub problem is missing + @test_throws ErrorException ProximalGradientMethodState(M, NelderMeadState(M)) @testset "Special Stopping Criterion" begin sc1 = StopWhenGradientMappingNormLess(1.0e-8) - @test startswith(repr(sc1), "StopWhenGradientMappingNormLess(1.0e-8)\n") + @test startswith(repr(sc1), "StopWhenGradientMappingNormLess(1.0e-8)") @test get_reason(sc1) == "" # Trigger manually sc1.at_iteration = 2 @@ -27,7 +26,8 @@ using Manopt, Manifolds, Test, ManifoldDiff pgb = Manopt.ProximalGradientMethodBacktrackingStepsize(M) @test get_initial_stepsize(pgb) == 1.0 @test get_last_stepsize(pgb) == 1.0 - @test 
startswith(repr(pgb), "ProximalGradientMethodBacktrackingStepsize(;\n") + @test startswith(repr(pgb), "ProximalGradientMethodBacktrackingStepsize(;") + @test startswith(Manopt.status_summary(pgb), "A backtracking method tailored for the proximal gradient method") end @testset "Allocating Evaluation" begin g(M, q) = distance(M, q, p)^2 @@ -49,18 +49,12 @@ using Manopt, Manifolds, Test, ManifoldDiff @test_throws MethodError get_gradient(mp, 1.0, pgms.p) @test_throws MethodError get_proximal_map(mp, 1.0, pgms.p, 1) pgm = proximal_gradient_method( - M, - f, - g, - grad_g, - p0; + M, f, g, grad_g, p0; prox_nonsmooth = prox_h, stopping_criterion = StopAfterIteration(10), return_state = true, debug = [], - stepsize = ProximalGradientMethodBacktracking(; - initial_stepsize = 1.0, strategy = :convex - ), + stepsize = ProximalGradientMethodBacktracking(; initial_stepsize = 1.0, strategy = :convex), sub_state = AllocatingEvaluation(), ) p_star2 = get_solver_result(pgm) @@ -99,8 +93,7 @@ using Manopt, Manifolds, Test, ManifoldDiff @test_logs (:warn,) (:warn,) dw1(mp, pgms_warn, 1) dw2 = DebugWarnIfStepsizeCollapsed(1.0, :Once) pgms_const = ProximalGradientMethodState( - M; - p = p0, + M; p = p0, stepsize = Manopt.ConstantStepsize(M, 1.0), stopping_criterion = StopAfterIteration(2), ) @@ -117,13 +110,8 @@ using Manopt, Manifolds, Test, ManifoldDiff # Test subsolver with subgradient ∂h(M, q) = ManifoldDiff.subgrad_distance(M, p, q, 1; atol = 1.0e-8) sub_pgm = proximal_gradient_method( - M, - f, - g, - grad_g, - p0; - cost_nonsmooth = h, - subgradient_nonsmooth = ∂h, + M, f, g, grad_g, p0; + cost_nonsmooth = h, subgradient_nonsmooth = ∂h, stopping_criterion = StopAfterIteration(10), ) @test_throws ErrorException proximal_gradient_method(M, f, g, grad_g, p0) @@ -148,7 +136,7 @@ using Manopt, Manifolds, Test, ManifoldDiff # Since this is experimental, we for now just check that it does not error, # but we can not yet verify the result pgma(mp, pgms, 1) - @test 
startswith(repr(pgma), "ProximalGradientMethodAcceleration with parameters\n") + @test startswith(repr(pgma), "ProximalGradientMethodAcceleration(; ") end @testset "Inplace Evaluation" begin g(M, q) = distance(M, q, p)^2 @@ -159,9 +147,9 @@ using Manopt, Manifolds, Test, ManifoldDiff h(M, q) = distance(M, q, p) prox_h!(M, a, λ, q) = ManifoldDiff.prox_distance!(M, a, λ, p, q, 1) f(M, q) = g(M, q) + h(M, q) - ieob = ManifoldProximalGradientObjective( - f, g, grad_g!, prox_h!; evaluation = InplaceEvaluation() - ) + ieob = ManifoldProximalGradientObjective(f, g, grad_g!, prox_h!; evaluation = InplaceEvaluation()) + @test startswith(repr(ieob), "ManifoldProximalGradientObjective(") + @test startswith(Manopt.status_summary(ieob), "A proximal gradient objective") mp = DefaultManoptProblem(M, ieob) X = zero_vector(M, p) Y = get_gradient(mp, p) @@ -175,19 +163,11 @@ using Manopt, Manifolds, Test, ManifoldDiff sr = solve!(mp, pgms) xHat = get_solver_result(sr) s2 = proximal_gradient_method( - M, - f, - g, - grad_g!, - copy(p0); + M, f, g, grad_g!, copy(p0); prox_nonsmooth = prox_h!, - stepsize = ProximalGradientMethodBacktracking(; - initial_stepsize = 1.0, strategy = :convex - ), - stopping_criterion = StopAfterIteration(200), - evaluation = InplaceEvaluation(), - return_state = true, - debug = [], + stepsize = ProximalGradientMethodBacktracking(; initial_stepsize = 1.0, strategy = :convex), + stopping_criterion = StopAfterIteration(200), evaluation = InplaceEvaluation(), + return_state = true, debug = [], ) p_star2 = get_solver_result(s2) @test f(M, p_star2) <= f(M, p0) @@ -196,19 +176,11 @@ using Manopt, Manifolds, Test, ManifoldDiff @test get_proximal_map(M, ieob, 1.0, p) == a p2 = copy(M, p0) proximal_gradient_method!( - M, - f, - g, - grad_g!, - p2; + M, f, g, grad_g!, p2; prox_nonsmooth = prox_h!, - stepsize = ProximalGradientMethodBacktracking(; - initial_stepsize = 1.0, strategy = :convex - ), - stopping_criterion = StopAfterIteration(200), - evaluation = 
InplaceEvaluation(), - return_state = true, - debug = [], + stepsize = ProximalGradientMethodBacktracking(; initial_stepsize = 1.0, strategy = :convex), + stopping_criterion = StopAfterIteration(200), evaluation = InplaceEvaluation(), + return_state = true, debug = [], ) @test isapprox(M, p2, p_star2) end @@ -238,7 +210,8 @@ using Manopt, Manifolds, Test, ManifoldDiff return_state = true ) @test startswith( - repr(pbm_s), "# Solver state for `Manopt.jl`s Proximal Gradient Method\n" + Manopt.status_summary(pbm_s; context = :default), + "# Solver state for `Manopt.jl`s Proximal Gradient Method\n" ) q = get_solver_result(pbm_s) # with default parameters for both median and proximal gradient, this is not very precise diff --git a/test/solvers/test_proximal_point.jl b/test/solvers/test_proximal_point.jl index 7ebad6ed80..3391a03965 100644 --- a/test/solvers/test_proximal_point.jl +++ b/test/solvers/test_proximal_point.jl @@ -28,5 +28,18 @@ using ManifoldDiff: prox_distance, prox_distance! q3b = rand(M) get_proximal_map!(M, q3b, obj, 1.0, get_iterate(pps)) @test distance(M, q3a, q3b) == 0 - @test startswith(repr(pps), "# Solver state for `Manopt.jl`s Proximal Point Method\n") + @test startswith( + Manopt.status_summary(pps; context = :default), + "# Solver state for `Manopt.jl`s Proximal Point Method\n" + ) + @test startswith(repr(pps), "ProximalPointState(; ") + @test startswith(repr(obj), "ManifoldProximalMapObjective(") + @test startswith(Manopt.status_summary(obj), "A proximal map objective") + + dpp = DebugProximalParameter() + @test startswith(repr(dpp), "DebugGradientChange(; io") + @test startswith(Manopt.status_summary(dpp), "A DebugAction printing the proximal parameter") + rpp = RecordProximalParameter() + @test startswith(repr(rpp), "RecordProximalParameter(") + @test startswith(Manopt.status_summary(rpp), "A RecordAction to record the current proximal parameter") end diff --git a/test/solvers/test_quasi_Newton.jl b/test/solvers/test_quasi_Newton.jl index 
6f3aead5c2..453dd4b60c 100644 --- a/test/solvers/test_quasi_Newton.jl +++ b/test/solvers/test_quasi_Newton.jl @@ -1,5 +1,5 @@ using Manopt, Manifolds, Test -using LinearAlgebra: I, eigvecs, tr, Diagonal +using LinearAlgebra: I, eigvecs, tr, Diagonal, dot mutable struct QuasiNewtonGradientDirectionUpdate{VT <: AbstractVectorTransportMethod} <: AbstractQuasiNewtonDirectionUpdate @@ -57,13 +57,8 @@ end @test norm(x_lrbfgs - x_solution) ≈ 0 atol = 10.0^(-14) # with State lrbfgs_s = quasi_Newton( - M, - f, - grad_f, - p; - stopping_criterion = StopWhenGradientNormLess(10^(-6)), - return_state = true, - debug = [], + M, f, grad_f, p; + stopping_criterion = StopWhenGradientNormLess(10^(-6)), return_state = true, debug = [], ) # Verify that Newton update direction works also allocating dmp = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f)) @@ -73,8 +68,10 @@ end @test isapprox(M, p_star, D, lrbfgs_s.direction_update(dmp, lrbfgs_s)) @test startswith( - repr(lrbfgs_s), "# Solver state for `Manopt.jl`s Quasi Newton Method\n" + Manopt.status_summary(lrbfgs_s; context = :default), + "# Solver state for `Manopt.jl`s Quasi Newton Method\n" ) + @test startswith(repr(lrbfgs_s), "QuasiNewtonState(; ") @test get_last_stepsize(dmp, lrbfgs_s, lrbfgs_s.stepsize) > 0 @test Manopt.get_iterate(lrbfgs_s) == x_lrbfgs set_gradient!(lrbfgs_s, M, p, grad_f(M, p)) @@ -82,26 +79,17 @@ end @test Manopt.get_message(lrbfgs_s) == "" # with Cached Basis x_lrbfgs_cached = quasi_Newton( - M, - f, - grad_f, - p; + M, f, grad_f, p; stopping_criterion = StopWhenGradientNormLess(10^(-6)), basis = get_basis(M, p, DefaultOrthonormalBasis()), ) @test isapprox(M, x_lrbfgs_cached, x_lrbfgs) - x_lrbfgs_cached_2 = quasi_Newton( - M, - f, - grad_f, - p; + M, f, grad_f, p; stopping_criterion = StopWhenGradientNormLess(10^(-6)), - basis = get_basis(M, p, DefaultOrthonormalBasis()), - memory_size = -1, + basis = get_basis(M, p, DefaultOrthonormalBasis()), memory_size = -1, ) @test isapprox(M, 
x_lrbfgs_cached_2, x_lrbfgs; atol = 1.0e-5) - # with Costgrad mcgo = ManifoldCostGradientObjective(costgrad) @@ -111,14 +99,9 @@ end @test isapprox(M, x_lrbfgs_costgrad, x_lrbfgs; atol = 1.0e-5) clrbfgs_s = quasi_Newton( - M, - f, - grad_f, - p; - cautious_update = true, - stopping_criterion = StopWhenGradientNormLess(10^(-6)), - return_state = true, - debug = [], + M, f, grad_f, p; + cautious_update = true, stopping_criterion = StopWhenGradientNormLess(10^(-6)), + return_state = true, debug = [], ) # Test direction passthrough x_clrbfgs = get_solver_result(clrbfgs_s) @@ -129,10 +112,7 @@ end @test norm(x_clrbfgs - x_solution) ≈ 0 atol = 10.0^(-14) x_rbfgs_Huang = quasi_Newton( - M, - f, - grad_f, - p; + M, f, grad_f, p; memory_size = -1, stepsize = WolfePowellBinaryLinesearch( M; @@ -146,16 +126,10 @@ end for T in [InverseBFGS(), BFGS(), InverseDFP(), DFP(), InverseSR1(), SR1()] for c in [true, false] x_state = quasi_Newton( - M, - f, - grad_f, - p; - direction_update = T, - cautious_update = c, - memory_size = -1, + M, f, grad_f, p; + direction_update = T, cautious_update = c, memory_size = -1, stopping_criterion = StopWhenGradientNormLess(10^(-12)), - return_state = true, - debug = [], + return_state = true, debug = [], ) x_direction = get_solver_result(x_state) D = zero_vector(M, x_direction) @@ -196,11 +170,7 @@ end # An in-place preconditioner x_lrbfgs = quasi_Newton( - M, - f, - grad_f, - x; - memory_size = -1, + M, f, grad_f, x; memory_size = -1, preconditioner = QuasiNewtonPreconditioner( (M, Y, p, X) -> (Y .= 0.5 .* X); evaluation = InplaceEvaluation() ), @@ -214,22 +184,20 @@ end @test isapprox(M, x_cached_lrbfgs, x_solution; atol = rayleigh_atol) for T in [ - InverseDFP(), - DFP(), - Broyden(0.5), - InverseBroyden(0.5), - Broyden(0.5, :Davidon), - Broyden(0.5, :InverseDavidon), - InverseBFGS(), - BFGS(), + InverseDFP(), DFP(), Broyden(0.5), InverseBroyden(0.5), + Broyden(0.5, :Davidon), Broyden(0.5, :InverseDavidon), InverseBFGS(), BFGS(), ], c in 
[true, false] - x_direction = quasi_Newton( M, f, grad_f, x; direction_update = T, cautious_update = c, memory_size = -1 ) @test isapprox(M, x_direction, x_solution; atol = rayleigh_atol) end + + @testset "Byrd's nonpositive rule" begin + x1 = quasi_Newton(M, f, grad_f, x; nonpositive_curvature_behavior = :byrd, sy_tol = 1.0e8) + @test isapprox(M, x1, x_solution; atol = rayleigh_atol) + end end @testset "Brocket" begin @@ -252,32 +220,18 @@ end x = Matrix{Float64}(I, n, n)[:, 2:(k + 1)] x_inverseBFGSCautious = quasi_Newton( - M, - f, - grad_f, - x; - memory_size = 8, - vector_transport_method = ProjectionTransport(), - retraction_method = QRRetraction(), - cautious_update = true, - stopping_criterion = StopWhenGradientNormLess(1.0e-6), + M, f, grad_f, x; memory_size = 8, + vector_transport_method = ProjectionTransport(), retraction_method = QRRetraction(), + cautious_update = true, stopping_criterion = StopWhenGradientNormLess(1.0e-6) | StopAfterIteration(100), ) x_inverseBFGSHuang = quasi_Newton( - M, - f, - grad_f, - x; - memory_size = 8, + M, f, grad_f, x; memory_size = 8, stepsize = WolfePowellBinaryLinesearch( - M; - retraction_method = QRRetraction(), - vector_transport_method = ProjectionTransport(), + M; retraction_method = QRRetraction(), vector_transport_method = ProjectionTransport(), ), - vector_transport_method = ProjectionTransport(), - retraction_method = QRRetraction(), - cautious_update = true, - stopping_criterion = StopWhenGradientNormLess(1.0e-6), + vector_transport_method = ProjectionTransport(), retraction_method = QRRetraction(), + cautious_update = true, stopping_criterion = StopWhenGradientNormLess(1.0e-6) | StopAfterIteration(100), ) @test isapprox(M, x_inverseBFGSCautious, x_inverseBFGSHuang; atol = 2.0e-4) end @@ -292,20 +246,12 @@ end grad_f(::Sphere, X) = 2 * (A * X - X * (X' * A * X)) x_solution = abs.(eigvecs(A)[:, 1]) - x = [ - 0.7011245948687502 - -0.1726003159556036 - 0.38798265967671103 - -0.5728026616491424 - ] + x = 
[0.7011245948687502, -0.1726003159556036, 0.38798265967671103, -0.5728026616491424] x_lrbfgs = quasi_Newton( - M, - F, - grad_f, - x; + M, F, grad_f, x; basis = get_basis(M, x, DefaultOrthonormalBasis()), memory_size = -1, - stopping_criterion = StopWhenGradientNormLess(1.0e-9), + stopping_criterion = StopWhenGradientNormLess(1.0e-9) | StopAfterIteration(1000), ) @test norm(abs.(x_lrbfgs) - x_solution) ≈ 0 atol = rayleigh_atol end @@ -406,13 +352,13 @@ end mp = DefaultManoptProblem(M, gmp) qns = QuasiNewtonState(M; p = p) # push zeros to memory - push!(qns.direction_update.memory_s, copy(p)) - push!(qns.direction_update.memory_s, copy(p)) - push!(qns.direction_update.memory_y, copy(p)) - push!(qns.direction_update.memory_y, copy(p)) + qns.yk = copy(p) + qns.sk = copy(p) + update_hessian!(qns.direction_update, mp, qns, p, 1) + update_hessian!(qns.direction_update, mp, qns, p, 2) + @test contains(qns.direction_update.message, "i=2,1,1") qns.direction_update(mp, qns) # Update (1) says at i=1 inner products are zero (2) all are zero -> gradient proposal - @test contains(qns.direction_update.message, "i=1,2") @test contains(qns.direction_update.message, "gradient") end @@ -425,8 +371,7 @@ end gmp = ManifoldGradientObjective(f, grad_f) mp = DefaultManoptProblem(M, gmp) qns = QuasiNewtonState( - M; - p = copy(M, p), + M; p = copy(M, p), direction_update = QuasiNewtonGradientDirectionUpdate(ParallelTransport()), nondescent_direction_behavior = :step_towards_negative_gradient, ) @@ -440,8 +385,7 @@ end ) solve!(mp, dqns) qns = QuasiNewtonState( - M; - p = copy(M, p), + M; p = copy(M, p), direction_update = QuasiNewtonGradientDirectionUpdate(ParallelTransport()), nondescent_direction_behavior = :step_towards_negative_gradient, ) @@ -450,8 +394,7 @@ end @test qns.direction_update.num_times_init == 1 qns = QuasiNewtonState( - M; - p = copy(M, p), + M; p = copy(M, p), direction_update = QuasiNewtonGradientDirectionUpdate(ParallelTransport()), nondescent_direction_behavior = 
:reinitialize_direction_update, ) @@ -482,8 +425,7 @@ end mp = DefaultManoptProblem(M, gmp) qdu = QuasiNewtonLimitedMemoryDirectionUpdate(M, p, InverseBFGS(), 2) qns = QuasiNewtonState( - M; - p = copy(M, p), + M; p = copy(M, p), direction_update = QuasiNewtonCautiousDirectionUpdate(qdu), ) # current bound with the gradient is 2, so we choose an sk larger than that @@ -494,5 +436,82 @@ end # This triggers and cautious update that does not update the Hessian Manopt.update_hessian!(qns.direction_update, mp, qns, p, 1) # But I am not totally sure what to test for afterwards + + @test startswith(repr(qdu), "QuasiNewtonLimitedMemoryDirectionUpdate with memory size") + end + @testset "Removing zero rho vectors" begin + M = Euclidean(2) + p = [0.0, 1.0] + f(M, p) = sum(p .^ 2) + # A wrong gradient + grad_f(M, p) = -2 .* p + gmp = ManifoldGradientObjective(f, grad_f) + mp = DefaultManoptProblem(M, gmp) + qdu = QuasiNewtonLimitedMemoryDirectionUpdate(M, p, InverseBFGS(), 3) + # push three pairs; middle one has zero inner product + push!(qdu.memory_y, [1, 0]) + push!(qdu.memory_s, [1, 0]) + + push!(qdu.memory_y, [1, 0]) + push!(qdu.memory_s, [0, 1]) + + push!(qdu.memory_y, [0, 2]) + push!(qdu.memory_s, [0, 2]) + qdu.ρ = [1.0, 0.0, 4.0] + # delete the zero inner product pair and check that the removal was correct + Manopt._drop_zero_rho_vectors!(qdu) + @test length(qdu.memory_y) == 2 + @test length(qdu.memory_s) == 2 + @test qdu.ρ[[1, 2]] == [1.0, 4.0] + @test qdu.memory_y[1] == [1, 0] + @test qdu.memory_y[2] == [0, 2] + end + + @testset "reforming_required + (start == 2)" begin + M = Euclidean(2) + p = [0.0, 0.0] + f(M, p) = sum(p .^ 2) + grad_f(M, p) = 2 * sum(p) + gmp = ManifoldGradientObjective(f, grad_f) + mp = DefaultManoptProblem(M, gmp) + ha = QuasiNewtonLimitedMemoryDirectionUpdate(M, p, InverseBFGS(), 2; nonpositive_curvature_behavior = :byrd) + qns = QuasiNewtonState(M; p = p, nonpositive_curvature_behavior = :byrd, direction_update = ha) + + qns.yk = [1.0, 1.0] + 
qns.sk = [1.0, 2.0] + update_hessian!(qns.direction_update, mp, qns, p, 1) + + qns.yk = [2.0, 1.0] + qns.sk = [1.0, 2.0] + update_hessian!(qns.direction_update, mp, qns, p, 2) + ha.memory_s[2] = [0.0, 0.0] # force reforming_required in next step + update_hessian!(qns.direction_update, mp, qns, p, 3) + # test that the zeroes out pair was replaced + @test qns.direction_update.memory_s[1] == [1.0, 2.0] + @test qns.direction_update.memory_s[2] == [1.0, 2.0] + end + @testset "get_cost specialization" begin + M = Euclidean(2) + p = [0.0, 1.0] + f(M, p) = sum(p .^ 2) + grad_f(M, p) = 2 .* p + gmp = ManifoldGradientObjective(f, grad_f) + mp = DefaultManoptProblem(M, gmp) + ha = QuasiNewtonLimitedMemoryDirectionUpdate(M, p, InverseBFGS(), 2; nonpositive_curvature_behavior = :byrd) + qns = QuasiNewtonState( + M; + p = copy(M, p), + direction_update = ha, + nondescent_direction_behavior = :step_towards_negative_gradient, + stepsize = HagerZhangLinesearch()(M), + ) + @test get_cost(mp, qns) == f(M, get_iterate(qns)) + solve!(mp, qns) + @test get_cost(mp, qns) == f(M, get_iterate(qns)) + + @testset "get_cost with DebugSolverState" begin + dqns = DebugSolverState(qns, DebugMessages(:Info, :Always)) + @test get_cost(mp, dqns) == f(M, get_iterate(dqns)) + end end end diff --git a/test/solvers/test_quasi_Newton_box.jl b/test/solvers/test_quasi_Newton_box.jl new file mode 100644 index 0000000000..5c268b8a59 --- /dev/null +++ b/test/solvers/test_quasi_Newton_box.jl @@ -0,0 +1,353 @@ +using Manopt, Manifolds, Test +using LinearAlgebra: I, eigvecs, tr, Diagonal, dot + +using RecursiveArrayTools + +@testset "Riemannian quasi-Newton Methods with box-like domains" begin + @testset "get_stepsize_bound - basic" begin + M = Hyperrectangle([0.0, 0.0], [2.0, 2.0]) + + # d[i] > 0 + p = [0.0, 1.0]; d = [1.0, 1.0] + @test Manopt.get_stepsize_bound(M, p, d, 1) ≈ (2.0 - 0.0) / 1.0 # = 2.0 + @test Manopt.get_stepsize_bound(M, p, d, 2) ≈ (2.0 - 1.0) / 1.0 # = 1.0 + + # d[i] < 0 + p = [0.0, 1.0]; d = 
[-1.0, -1.0] + @test Manopt.get_stepsize_bound(M, p, d, 1) ≈ (0.0 - 0.0) / -1.0 # = 0.0 + @test Manopt.get_stepsize_bound(M, p, d, 2) ≈ (0.0 - 1.0) / -1.0 # = 1.0 + + # d[i] = 0 + p = [0.0, 1.0]; d = [0.0, 0.0] + @test Manopt.get_stepsize_bound(M, p, d, 1) ≈ Inf + @test Manopt.get_stepsize_bound(M, p, d, 2) ≈ Inf + end + + @testset "update_fp_fpp - basic d = -g" begin + M = Hyperrectangle([0.0, 1.0], [3.0, 3.0]) + + grad = [1.0, 4.0] + d = [-1.0, -4.0] + p = [0.0, 0.0] + + # values taken from loop iteration found in test case: "find_gcp! - with bounds, single variable is held fixed" + old_f_prime = -17.0 + old_f_double_prime = 34.0 + dt = 0.25 + gb = 4.0 + db = -4.0 # in case of d = -g, db = -gb + ha = QuasiNewtonMatrixDirectionUpdate(M, BFGS(), DefaultOrthonormalBasis(), [2.0 0.0; 0.0 2.0]) + b = 2 + z = [-0.25, -1.0] + + # optimized formula + upd = Manopt.GenericSegmentHessianUpdater(similar(d), similar(d)) + Manopt.init_updater!(M, upd, p, d, ha) + hv_eb_dz, hv_eb_d = upd(M, p, 0 + dt, dt, b, db, ha) + @test hv_eb_dz ≈ -2.0 + @test hv_eb_d ≈ -8.0 + + # original formula + + original_hv_eb_dz = dot([0, 1], ha.matrix, z) + original_hv_eb_d = dot([0, 1], ha.matrix, d) + + @test hv_eb_dz == original_hv_eb_dz + @test hv_eb_d == original_hv_eb_d + end + + @testset "update_fp_fpp - basic d = [-2.0, -1.0]" begin + M = Hyperrectangle([0.0, 1.0], [3.0, 3.0]) + + grad = [1.0, 4.0] + d = [-2.0, -1.0] + p = [0.0, 0.0] + + old_f_prime = -6.0 + old_f_double_prime = 10.0 + dt = 0.25 + gb = 1.0 + db = -2.0 + ha = QuasiNewtonMatrixDirectionUpdate(M, BFGS(), DefaultOrthonormalBasis(), [2.0 0.0; 0.0 2.0]) + b = 1 + z = [-0.5, -0.25] + + # optimized formula + upd = Manopt.GenericSegmentHessianUpdater(similar(d), similar(d)) + Manopt.init_updater!(M, upd, p, d, ha) + hv_eb_dz, hv_eb_d = upd(M, p, 0 + dt, dt, b, db, ha) + @test hv_eb_dz == -1.0 + @test hv_eb_d == -4.0 + + # original formula + + original_hv_eb_dz = dot([1, 0], ha.matrix, z) + original_hv_eb_d = dot([1, 0], ha.matrix, d) 
+ + @test hv_eb_dz == original_hv_eb_dz + @test hv_eb_d == original_hv_eb_d + end + + @testset "update_fp_fpp - basic d = [-2.0, -1.0] with limited memory update" begin + M = Hyperrectangle([1.0, 4.0], [2.0, 10.0]) + + p = [2.0, 5.0] + ha = QuasiNewtonLimitedMemoryBoxDirectionUpdate(QuasiNewtonLimitedMemoryDirectionUpdate(M, p, InverseBFGS(), 2)) + st = QuasiNewtonState(M) + + @test startswith(repr(ha), "QuasiNewtonLimitedMemoryBoxDirectionUpdate with internal state:") + @test startswith(Manopt.status_summary(ha), "limited memory direction update with support for box constraints; internal direction update status: ") + + f(M, p) = sum(p .^ 2) + grad_f(M, p) = 2 * p + gmp = ManifoldGradientObjective(f, grad_f) + mp = DefaultManoptProblem(M, gmp) + + st.yk = [2.0, 4.0] + st.sk = [4.0, 2.0] + update_hessian!(ha, mp, st, p, 1) + grad = grad_f(M, p) + st.p = p + st.X = grad + + d = similar(grad) + ha(d, mp, st) + + d2 = ha(mp, st) + @test d ≈ d2 + + b = 1 + + old_f_prime = -6.0 + old_f_double_prime = 10.0 + dt = 0.25 + db = d[b] + gb = grad[b] + + t_current = 0 + dt + + # compare the generic and limited memory updater + gupd = Manopt.GenericSegmentHessianUpdater(similar(d), similar(d)) + Manopt.init_updater!(M, gupd, p, d, ha) + hv_eb_dz, hv_eb_d = gupd(M, p, t_current, dt, b, db, ha) + + @test hv_eb_dz ≈ -0.125 + @test hv_eb_d ≈ -0.5 + + lmupd = Manopt.get_default_hessian_segment_updater(M, p, ha) + @test lmupd isa Manopt.LimitedMemorySegmentHessianUpdater + + Manopt.init_updater!(M, lmupd, p, d, ha) + hv_eb_dz_limited, hv_eb_d_limited = lmupd(M, p, t_current, dt, b, db, ha) + + @test hv_eb_dz ≈ hv_eb_dz_limited + @test hv_eb_d ≈ hv_eb_d_limited + + ha.last_gcd_result = :found_unlimited + ha.last_gcd_stepsize = Inf + @test Manopt.get_parameter(ha, Val(:max_stepsize)) == Inf + + @testset "No memory tests" begin + ha2 = QuasiNewtonLimitedMemoryBoxDirectionUpdate(QuasiNewtonLimitedMemoryDirectionUpdate(M, p, InverseBFGS(), 2)) + idx = Manopt.get_bounds_index(M) + @test 
Manopt.hessian_value(ha2, M, p, Manopt.UnitVector(b), grad) ≈ 4.0 + Manopt.update_current_scale!(M, p, ha2) + @test ha2.current_scale == ha2.qn_du.initial_scale + @test ha2.M_11 == fill(0.0, 0, 0) + @test ha2.M_21 == fill(0.0, 0, 0) + @test ha2.M_22 == fill(0.0, 0, 0) + end + end + + @testset "GeneralizedCauchyDirectionSubsolver" begin + M = Hyperrectangle([-1.0, -2.0, -Inf], [2.0, Inf, 2.0]) + ha = QuasiNewtonMatrixDirectionUpdate(M, BFGS()) + + p = [0.0, 0.0, 0.0] + gf = Manopt.GeneralizedCauchyDirectionSubsolver(M, p, ha) + + X1 = [-5.0, 0.0, 0.0] + + d = -X1 + d_out = similar(d) + + @test Manopt.find_generalized_cauchy_direction!(M, gf, d_out, p, d, X1) === (:found_limited, 1.0) + @test d_out ≈ [2.0, 0.0, 0.0] + + d_out = similar(d) + + @test Manopt.find_generalized_cauchy_direction!(M, gf, d_out, p, 0 * d, X1) === (:not_found, NaN) + + d2 = [0.0, 1.0, 0.0] + + @test Manopt.find_generalized_cauchy_direction!(M, gf, d_out, p, d2, [0.0, -1.0, 0.0]) === (:found_unlimited, Inf) + @test d_out ≈ d2 + + @test Manopt.find_generalized_cauchy_direction!(M, gf, d_out, p, [1.0, 1.0, 0.0], [-10.0, -10.0, -10.0]) === (:found_limited, 1.0) + @test d_out ≈ [2.0, 10.0, 0.0] + + p2 = [-1.0, -2.0, 2.0] + gf2 = Manopt.GeneralizedCauchyDirectionSubsolver(M, p2, ha) + + @test Manopt.find_generalized_cauchy_direction!(M, gf2, d_out, p2, [-1.0, -1.0, 1.0], [-10.0, -10.0, -10.0]) === (:not_found, NaN) + + M2 = Hyperrectangle([-10.0], [10.0]) + + ha2 = QuasiNewtonMatrixDirectionUpdate(M2, BFGS(), DefaultOrthonormalBasis(), [100.0;;]) + p3 = [1.0] + gf3 = Manopt.GeneralizedCauchyDirectionSubsolver(M2, p3, ha2) + + d_out = similar(p3) + @test Manopt.find_generalized_cauchy_direction!(M2, gf3, d_out, p3, [1.0], [-10.0]) === (:found_limited, 90.0) + end + + @testset "Hitting multiple bounds at the same time in GCD" begin + M = Hyperrectangle([-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]) + ha = QuasiNewtonMatrixDirectionUpdate(M, BFGS(), DefaultOrthonormalBasis(), [1.0 0 0; 0 1 0; 0 0 1]) + + p = 
[0.0, 0.0, 0.0] + gf = Manopt.GeneralizedCauchyDirectionSubsolver(M, p, ha) + + d = [-2.0, -2.0, -1.0] + d_out = similar(d) + X = [10.0, 10.0, 10.0] + + @test Manopt.find_generalized_cauchy_direction!(M, gf, d_out, p, d, X) === (:found_limited, 1.0) + @test d_out ≈ [-1.0, -1.0, -1.0] + end + + @testset "Pure Hyperrectangle" begin + M = Hyperrectangle([-1.0, 2.0, -Inf], [2.0, Inf, 2.0]) + f(M, p) = sum(p .^ 2) + function grad_f(M, p) + return project(M, p, 2 .* p) + end + p0 = [0.0, 4.0, 1.0] + p_opt = quasi_Newton(M, f, grad_f, p0; stopping_criterion = StopWhenProjectedNegativeGradientNormLess(1.0e-6) | StopAfterIteration(10)) + @test p_opt ≈ [0, 2, 0] + + + f2(M, p) = sum(p .^ 4) + function grad_f2(M, p) + return project(M, p, 4 .* (p .^ 3)) + end + p0 = [0.0, 4.0, 1.0] + p_opt = quasi_Newton(M, f2, grad_f2, p0; stopping_criterion = StopWhenProjectedNegativeGradientNormLess(1.0e-6) | StopAfterIteration(100)) + @test f2(M, p_opt) < 16.1 + + for stepsize in [ArmijoLinesearch(), CubicBracketingLinesearch(), NonmonotoneLinesearch()] + p_opt = quasi_Newton( + M, f2, grad_f2, p0; + stopping_criterion = StopWhenProjectedNegativeGradientNormLess(1.0e-6) | StopAfterIteration(100), + stepsize = stepsize + ) + @test f2(M, p_opt) < 64.0 + end + + MInf = Hyperrectangle([-Inf, -Inf, -Inf], [Inf, Inf, Inf]) + + f3(M, p) = sum(p .^ 4) - sum(p .^ 2) + function grad_f3(M, p) + return project(MInf, p, 4 .* (p .^ 3) - 2 .* p) + end + p0 = [0.0, 4.0, 1.0] + p_opt = quasi_Newton(MInf, f3, grad_f3, p0; stopping_criterion = StopWhenProjectedNegativeGradientNormLess(1.0e-6) | StopAfterIteration(100)) + @test f3(MInf, p_opt) < 16.1 + + p_opt = quasi_Newton( + MInf, f3, grad_f3, p0; + stopping_criterion = StopWhenProjectedNegativeGradientNormLess(1.0e-6) | StopAfterIteration(100), + ) + @test f3(MInf, p_opt) < 64.0 + end + + @testset "has_anisotropic_max_stepsize" begin + @test !Manopt.has_anisotropic_max_stepsize(Sphere(2)) + @test Manopt.has_anisotropic_max_stepsize(Hyperrectangle([1], 
[2])) + @test Manopt.has_anisotropic_max_stepsize(ProductManifold(Hyperrectangle([1], [2]), Sphere(2))) + end + + @testset "Hyperrectangle × Sphere" begin + S2 = Sphere(2) + px = [0.0, 1.0, 0.0] + Mbox = Hyperrectangle([-1.0, 2.0, -Inf], [2.0, Inf, 2.0]) + M = Mbox × S2 + f(M, p) = sum(p.x[1] .^ 4) + 0.5 * distance(S2, p.x[2], px)^2 + grad_f(M, p) = ArrayPartition(project(Mbox, p.x[1], 4 .* (p.x[1] .^ 3)), -log(S2, p.x[2], px)) + p0 = ArrayPartition([0.0, 4.0, 1.0], [1.0, 0.0, 0.0]) + + @testset "Hessian updater" begin + d = -grad_f(M, p0) + ha = QuasiNewtonMatrixDirectionUpdate(M, BFGS(), DefaultOrthonormalBasis()) + gupd = Manopt.GenericSegmentHessianUpdater(similar(d), similar(d)) + Manopt.init_updater!(M, gupd, p0, d, ha) + b = (1, 2) + dt = 0.25 + t_current = 0 + dt + db = d.x[b[1]][b[2]] + hv_eb_dz, hv_eb_d = gupd(M, p0, t_current, dt, b, db, ha) + @test hv_eb_dz ≈ -64.0 + @test hv_eb_d ≈ -256.0 + end + + @testset "GCD check" begin + d = -grad_f(M, p0) + ha = QuasiNewtonLimitedMemoryBoxDirectionUpdate(QuasiNewtonLimitedMemoryDirectionUpdate(M, p0, InverseBFGS(), 2)) + gf = Manopt.GeneralizedCauchyDirectionSubsolver(M, p0, ha) + d_out = similar(d) + X = grad_f(M, p0) + @test Manopt.find_generalized_cauchy_direction!(M, gf, d_out, p0, d, X) === (:found_limited, 1.0) + end + + p_opt = quasi_Newton(M, f, grad_f, p0; stopping_criterion = StopWhenProjectedNegativeGradientNormLess(1.0e-6) | StopAfterIteration(100)) + @test distance(M, p_opt, ArrayPartition([0, 2, 0], px)) < 0.1 + end + + @testset "Sphere × Hyperrectangle" begin + S2 = Sphere(2) + px = [0.0, 1.0, 0.0] + Mbox = Hyperrectangle([-1.0 2.0; -Inf -Inf], [2.0 Inf; 2.0 Inf]) + M = S2 × Mbox + f(M, p) = sum(p.x[2] .^ 4) + 0.5 * distance(S2, p.x[1], px)^2 + grad_f(M, p) = ArrayPartition(-log(S2, p.x[1], px), project(Mbox, p.x[2], 4 .* (p.x[2] .^ 3))) + p0 = ArrayPartition([1.0, 0.0, 0.0], [0.0 4.0; 1.0 1.0]) + + p_opt = quasi_Newton(M, f, grad_f, p0; stopping_criterion = 
StopWhenProjectedNegativeGradientNormLess(1.0e-6) | StopAfterIteration(100)) + @test distance(M, p_opt, ArrayPartition(px, [0 2; 0 0])) < 0.1 + end +end + +@testset "MaxStepsizeInDirection" begin + @testset "found_limited" begin + M = Hyperrectangle([-1.0, -2.0, -Inf], [2.0, Inf, 2.0]) + p = [0.0, 0.0, 0.0] + d = [2.0, 1.0, 1.0] + d_before = copy(d) + + sdf = Manopt.MaxStepsizeInDirectionSubsolver(M, p) + @test Manopt.find_max_stepsize_in_direction(M, sdf, p, d) === (:found_limited, 1.0) + @test d == d_before + end + + @testset "found_unlimited" begin + M = Hyperrectangle([-Inf], [Inf]) + p = [0.0] + d = [1.0] + d_before = copy(d) + + sdf = Manopt.MaxStepsizeInDirectionSubsolver(M, p) + @test Manopt.find_max_stepsize_in_direction(M, sdf, p, d) === (:found_unlimited, Inf) + @test d == d_before + end + + @testset "not_found" begin + M = Hyperrectangle([0.0], [1.0]) + p = [0.0] + d = [-1.0] + d_before = copy(d) + + sdf = Manopt.MaxStepsizeInDirectionSubsolver(M, p) + @test Manopt.find_max_stepsize_in_direction(M, sdf, p, d) === (:not_found, NaN) + @test d == d_before + end +end diff --git a/test/solvers/test_stochastic_gradient_descent.jl b/test/solvers/test_stochastic_gradient_descent.jl index 80119dbb53..4d785b9e45 100644 --- a/test/solvers/test_stochastic_gradient_descent.jl +++ b/test/solvers/test_stochastic_gradient_descent.jl @@ -88,8 +88,10 @@ using Manopt, Manifolds, Test step_solver!(dmp1, sgds, 1) @test sgds.p == exp(M, p, get_gradient(dmp1, p, 1)) @test startswith( - repr(sgds), "# Solver state for `Manopt.jl`s Stochastic Gradient Descent\n" + Manopt.status_summary(sgds; context = :default), + "# Solver state for `Manopt.jl`s Stochastic Gradient Descent\n" ) + @test startswith(repr(sgds), "StochasticGradientDescentState(; ") end @testset "Comparing Stochastic Methods" begin q1 = stochastic_gradient_descent(M, sgrad_f1, p; order_type = :Linear) diff --git a/test/solvers/test_subgradient_method.jl b/test/solvers/test_subgradient_method.jl index 
3c38dcfd8b..9d8ded29c2 100644 --- a/test/solvers/test_subgradient_method.jl +++ b/test/solvers/test_subgradient_method.jl @@ -5,25 +5,18 @@ using Manifolds, ManifoldsBase, Manopt, Random, Test p = [4.0, 2.0] p0 = [5.0, 2.0] q0 = [10.0, 5.0] + sc = StopAfterIteration(200) sgs = SubGradientMethodState( - M; - p = p0, - stopping_criterion = StopAfterIteration(200), - stepsize = Manopt.ConstantStepsize(M), + M; p = p0, stopping_criterion = sc, stepsize = Manopt.ConstantStepsize(M), ) sgs_ac = SubGradientMethodState( - M; - p = q0, - stopping_criterion = StopAfterIteration(200), - stepsize = Manopt.ConstantStepsize(M, 1.0; type = :absolute), + M; p = q0, stopping_criterion = sc, stepsize = Manopt.ConstantStepsize(M, 1.0; type = :absolute), ) sgs_ad = SubGradientMethodState( - M; - p = q0, - stopping_criterion = StopAfterIteration(200), - stepsize = Manopt.DecreasingStepsize(M; length = 1.0, type = :absolute), + M; p = q0, stopping_criterion = sc, stepsize = Manopt.DecreasingStepsize(M; length = 1.0, type = :absolute), ) - @test startswith(repr(sgs), "# Solver state for `Manopt.jl`s Subgradient Method\n") + @test startswith(Manopt.status_summary(sgs), "# Solver state for `Manopt.jl`s Subgradient Method\n") + @test startswith(repr(sgs), "SubGradientMethodState(; ") @test get_iterate(sgs) == p0 sgs.X = [1.0, 0.0] f(M, q) = distance(M, q, p) @@ -34,7 +27,10 @@ using Manifolds, ManifoldsBase, Manopt, Random, Test end return -log(M, q, p) / max(10 * eps(Float64), distance(M, p, q)) end - mp = DefaultManoptProblem(M, ManifoldSubgradientObjective(f, ∂f)) + o = ManifoldSubgradientObjective(f, ∂f) + @test startswith(repr(o), "ManifoldSubgradientObjective(") + @test startswith(Manopt.status_summary(o), "A subgradient objective") + mp = DefaultManoptProblem(M, o) X = zero_vector(M, p) Y = get_subgradient(mp, p) get_subgradient!(mp, X, p) diff --git a/test/solvers/test_truncated_cg.jl b/test/solvers/test_truncated_cg.jl index d9fc8ffed6..5779caf2a0 100644 --- 
a/test/solvers/test_truncated_cg.jl +++ b/test/solvers/test_truncated_cg.jl @@ -6,27 +6,28 @@ using Manifolds, Manopt, ManifoldsBase, Test η = zero_vector(M, p) s = TruncatedConjugateGradientState(TangentSpace(M, p); X = η) @test startswith( - repr(s), "# Solver state for `Manopt.jl`s Truncated Conjugate Gradient Descent\n" + Manopt.status_summary(s; context = :default), + "# Solver state for `Manopt.jl`s Truncated Conjugate Gradient Descent\n" ) @test get_iterate(s) == η srr = StopWhenResidualIsReducedByFactorOrPower() ssr1 = Manopt.status_summary(srr) - @test ssr1 == "Residual reduced by factor 0.1 or power 1.0:\tnot reached" - @test repr(srr) == "StopWhenResidualIsReducedByFactorOrPower(0.1, 1.0)\n $(ssr1)" + @test startswith(ssr1, "A stopping criterion used within tCG to check whether the residual is reduced by factor") + @test repr(srr) == "StopWhenResidualIsReducedByFactorOrPower(0.1, 1.0)" str = StopWhenTrustRegionIsExceeded() str1 = Manopt.status_summary(str) - @test str1 == "Trust region exceeded:\tnot reached" - @test repr(str) == "StopWhenTrustRegionIsExceeded()\n $(str1)" + @test str1 == "A stopping criterion to stop when the trust region radius (0.0) is exceeded.\n$(Manopt._MANOPT_INDENT)not reached" + @test repr(str) == "StopWhenTrustRegionIsExceeded()" @test get_reason(str) == "" # Trigger manually str.at_iteration = 1 @test length(get_reason(str)) > 0 scn = StopWhenCurvatureIsNegative() scn1 = Manopt.status_summary(scn) - @test scn1 == "Curvature is negative:\tnot reached" - @test repr(scn) == "StopWhenCurvatureIsNegative()\n $(scn1)" + @test scn1 == "A stopping criterion to stop when the is negative\n$(Manopt._MANOPT_INDENT)not reached" + @test repr(scn) == "StopWhenCurvatureIsNegative()" smi = StopWhenModelIncreased() smi1 = Manopt.status_summary(smi) - @test smi1 == "Model Increased:\tnot reached" - @test repr(smi) == "StopWhenModelIncreased()\n $(smi1)" + @test startswith(smi1, "A stopping criterion to indicate when the model increased.") + 
@test repr(smi) == "StopWhenModelIncreased()" end diff --git a/test/solvers/test_trust_regions.jl b/test/solvers/test_trust_regions.jl index c7fd91a8c9..4885e9627a 100644 --- a/test/solvers/test_trust_regions.jl +++ b/test/solvers/test_trust_regions.jl @@ -30,6 +30,7 @@ include("trust_region_model.jl") sub_state = TruncatedConjugateGradientState(TpM; X = get_gradient(M, mho, p)) trs1 = TrustRegionsState(M, sub_problem) trs2 = TrustRegionsState(M, sub_problem, sub_state) + @test_throws ErrorException TrustRegionsState(M, sub_state) trs3 = TrustRegionsState(M, sub_problem; p = p) @test Manopt.get_gradient_function(sub_objective)(M, p) == X end @@ -49,12 +50,20 @@ include("trust_region_model.jl") @test get_hessian(TpM, trmo, Y, X) == H get_hessian!(TpM, Y, trmo, Y, X) @test Y == H + @test startswith(repr(trmo), "TrustRegionModelObjective(") + @test startswith(Manopt.status_summary(trmo), "The trust region model for ") end @testset "Allocating Variant" begin s = trust_regions( M, f, rgrad, rhess, p; max_trust_region_radius = 8.0, return_state = true ) - @test startswith(repr(s), "# Solver state for `Manopt.jl`s Trust Region Method\n") + @test startswith( + Manopt.status_summary(s; context = :default), + "# Solver state for `Manopt.jl`s Trust Region Method\n" + ) + @test startswith(repr(s), "TrustRegionsState(") + # not a random one -> does not contain HZ + @test !contains(repr(s), "HZ = ") p1 = get_solver_result(s) q = copy(M, p) set_gradient!(s, M, p, zero_vector(M, p)) @@ -62,28 +71,24 @@ include("trust_region_model.jl") trust_regions!(M, f, rgrad, rhess, q; max_trust_region_radius = 8.0) @test isapprox(M, p1, q) Random.seed!(42) - p2 = trust_regions( - M, f, rgrad, rhess, p; max_trust_region_radius = 8.0, randomize = true + s2 = trust_regions( + M, f, rgrad, rhess, p; max_trust_region_radius = 8.0, randomize = true, return_state = true ) + @test startswith(repr(s2), "TrustRegionsState(") + # a random one -> does contain HZ + @test contains(repr(s2), "HZ = ") + p2 = 
get_solver_result(s2) @test f(M, p2) ≈ f(M, p1) p3 = trust_regions( - M, - f, - rgrad, - p; - max_trust_region_radius = 8.0, + M, f, rgrad, p; max_trust_region_radius = 8.0, stopping_criterion = StopAfterIteration(2000) | StopWhenGradientNormLess(1.0e-6), ) q2 = copy(M, p) trust_regions!( - M, - f, - rgrad, - q2; + M, f, rgrad, q2; max_trust_region_radius = 8.0, stopping_criterion = StopAfterIteration(2000) | StopWhenGradientNormLess(1.0e-6), - max_trust_region_radius = 8.0, ) @test isapprox(M, p3, q2; atol = 1.0e-6) @test f(M, p3) ≈ f(M, p1) diff --git a/test/solvers/test_vectorbundle_newton.jl b/test/solvers/test_vectorbundle_newton.jl index 6c6110ce05..188fd93d65 100644 --- a/test/solvers/test_vectorbundle_newton.jl +++ b/test/solvers/test_vectorbundle_newton.jl @@ -20,6 +20,8 @@ using LinearAlgebra: eigvals A::NM b::Nrhs end + # dummy to reduce output + Base.show(io::IO, ne::NewtonEquation) = print(io, "NewtonEquation($(ne.f_prime), $(ne.f_second_prime), A, b)") function NewtonEquation(M, f_pr, f_sp) A = zeros(N + 1, N + 1) @@ -30,7 +32,7 @@ using LinearAlgebra: eigvals function (ne::NewtonEquation)(M, VB, p) ne.A .= hcat(vcat(ne.f_second_prime(p) - ne.f_prime(p) * p * Matrix{Float64}(I, N, N), p'), vcat(p, 0)) ne.b .= vcat(ne.f_prime(p)', 0) - return + return ne end function solve_augmented_system(problem, newtonstate) @@ -68,7 +70,7 @@ using LinearAlgebra: eigvals y3 = copy(M, y0) # avoid working inplace of y0 - Manopt.vectorbundle_newton!( + vbns = Manopt.vectorbundle_newton!( M, TangentBundle(M), NE, y3; sub_problem = solve_augmented_system, alg_kwargs... 
) @@ -83,6 +85,11 @@ using LinearAlgebra: eigvals # test access on the VB Problem vbp = VectorBundleManoptProblem(M, TangentBundle(M), NE) @test Manopt.get_newton_equation(vbp) === NE + @test startswith(Manopt.status_summary(vbp; context = :inline), "A vector bundle problem defined on $(M)") + vbp_s = Manopt.status_summary(vbp; context = :default) + @test startswith(vbp_s, "A vector bundle problem representing a vector bundle newton equation objective") + @test contains(vbp_s, "## Manifold") + @test startswith(repr(vbp), "VectorBundleManoptProblem(") end @testset "Affine covariant stepsize" begin @@ -138,19 +145,24 @@ using LinearAlgebra: eigvals M, TangentBundle(M), NE, y0; sub_problem = solve_augmented_system, stopping_criterion = (StopAfterIteration(15) | StopWhenChangeLess(M, 1.0e-11)), retraction_method = ProjectionRetraction(), - stepsize = Manopt.AffineCovariantStepsize(M, θ_des = 0.1), + stepsize = AffineCovariantStepsize(M, θ_des = 0.1), return_state = true, ) y1 = get_iterate(st) @test any(isapprox(f(M, y1), λ; atol = 2.0 * 1.0e-2) for λ in eigvals(matrix)) - st_str = repr(st) + st_str = Manopt.status_summary(st; context = :default) @test occursin("Vector bundle Newton method", st_str) + @test startswith(repr(st), "VectorBundleNewtonState(") # we stopped since the change was small enough - @test occursin("* |Δp| < 1.0e-11: reached", st_str) - @test occursin("AffineCovariantStepsize", st_str) + @test occursin("* |Δp| < 1.0e-11:$(Manopt._MANOPT_INDENT)reached", st_str) acs = st.stepsize @test get_initial_stepsize(acs) == acs.α @test get_last_stepsize(acs) > 0.0 + @test startswith(repr(acs), "AffineCovariantStepsize(; ") @test default_stepsize(M, VectorBundleNewtonState) isa Manopt.ConstantStepsize + st_repr = repr(st) + @test occursin("retraction_method =", st_repr) + @test occursin("stopping_criterion = ", st_repr) + @test occursin(" | ", st_repr) # that :short is actually used end end diff --git a/tutorials/BoxDomain.qmd b/tutorials/BoxDomain.qmd new file 
mode 100644 index 0000000000..eb2d2da574 --- /dev/null +++ b/tutorials/BoxDomain.qmd @@ -0,0 +1,78 @@ +--- +title: "How to do optimization on box domains" +author: "Mateusz Baran" +--- + +# Optimization with box domains and products of manifolds and boxes + +A ``[`Hyperrectangle`](@extref `Manifolds.Hyperrectangle`)``{=commonmark} is, in general, not a manifold but a manifold with corners because locally at the boundary it looks like $\mathbb{R}^{n-k} \times \mathbb{R}^k_{\geq 0}$ for some $k > 0$, instead of $\mathbb{R}^n$ as required by the definition of a manifold. + +Such spaces require special handling when used as domains in optimization. +For simple methods like gradient descent, using the projected gradient and a stopping criterion such as [`StopWhenProjectedNegativeGradientNormLess`](@ref) may be sufficient; however, methods that approximate the Hessian can benefit from a more advanced approach. +The core idea is to consider a piecewise quadratic approximation of the objective along the descent direction in the tangent space at the current iterate, and to select its minimizer, the generalized Cauchy direction. +The points at which the approximation might not be differentiable correspond to hitting new boundaries along the initially selected descent direction. +Then, we can perform a standard line search from the initial iterate in the generalized Cauchy direction. + +Currently `Manopt.jl` can handle domains that are either a ``[`Hyperrectangle`](@extref `Manifolds.Hyperrectangle`)``{=commonmark} or a ``[`ProductManifold`](@extref `ManifoldsBase.ProductManifold`)``{=commonmark} containing a ``[`Hyperrectangle`](@extref `Manifolds.Hyperrectangle`)``{=commonmark} as its first factor and other manifolds as subsequent factors. + +## Example + +Consider the problem of fitting a covariance matrix with box constraints on the variances in the principal directions. 
+The objective is the negative log-probability of the data under a multivariate normal distribution with zero mean and covariance matrix given by the variable to optimize. +Although there are better ways to solve this problem, expressing it this way allows us to freely extend the objective to more complex scenarios beyond what is possible with closed-form solutions. + +First, we set up the problem by generating synthetic data. +The data is sampled from a multivariate normal distribution with a known covariance matrix. + +```{julia} +using Manopt, Manifolds, LinearAlgebra, Random, Distributions +using ForwardDiff, DifferentiationInterface, RecursiveArrayTools +Random.seed!(41) + +N = 5 # dimensionality of data +M_spd = SymmetricPositiveDefinite(N) +M_rot = Rotations(N) +V = rand(M_rot) +cov_matrix = Symmetric(V * Diagonal([0.5; 2.0; 5.0; 10.0; 20.0]) * V') +distr = MvNormal(zeros(N), cov_matrix) +data = Matrix(rand(distr, 200)') # 200 samples +``` + +The objective function is defined as follows, with the gradient calculated using automatic differentiation. + +```{julia} +function logprob_cost(::AbstractManifold, p) + D, R = p.x + logdet = sum(log, D) + invΣ = R * Diagonal(1 ./ D) * R' + ll = - 0.5 * size(data, 1) * logdet + for row in eachrow(data) + ll -= 0.5 * row' * invΣ * row + end + return -ll # We minimize negative log-likelihood +end + +function logprob_gradient(M::AbstractManifold, p) + Y = DifferentiationInterface.gradient(q -> logprob_cost(M, q), AutoForwardDiff(), p) + return riemannian_gradient(M, p, Y) +end +``` + +Finally, we can solve the optimization problem using a quasi-Newton method with box domain support. +We restrict the variances (the eigenvalues of the covariance matrix) to be between 1.0 and 100.0. +The covariance matrix is represented using its eigendecomposition $\Sigma = R D R^{\top}$, where $D$ is a diagonal matrix of variances and $R$ is an orthogonal matrix of principal directions. 
+With constraints on variances, the optimization variable belongs to $[1,100]^N \times \mathrm{SO}(N)$. + +```{julia} +M = ProductManifold(Hyperrectangle(fill(1.0, N), fill(100.0, N)), M_rot) + +p0 = ArrayPartition(fill(10.0, N), Matrix{Float64}(I(5))) +p_mle = quasi_Newton(M, logprob_cost, logprob_gradient, p0; stopping_criterion = StopAfterIteration(100) | StopWhenProjectedNegativeGradientNormLess(1e-6)) +println("Estimated variances: $(p_mle.x[1])") +cov_matrix_mle = p_mle.x[2] * Diagonal(p_mle.x[1]) * p_mle.x[2]' +println("Estimated covariance matrix:") +println(cov_matrix_mle) +nothing +``` + +We see that despite the original covariance matrix having variances ranging from 0.5 to 20.0, the estimated covariance matrix respects the box constraints of variances between 1.0 and 100.0. diff --git a/tutorials/ImplementASolver.qmd b/tutorials/ImplementASolver.qmd index d06f9d47e5..d0f52ba823 100644 --- a/tutorials/ImplementASolver.qmd +++ b/tutorials/ImplementASolver.qmd @@ -97,9 +97,7 @@ We can defined this as ```{julia} #| output: false mutable struct RandomWalkState{ - P, - R<:AbstractRetractionMethod, - S<:StoppingCriterion, + P, R<:AbstractRetractionMethod, S<:StoppingCriterion, } <: AbstractManoptSolverState p::P q::P @@ -269,9 +267,7 @@ given start point unchanged would just add a `copy(M, p)` upfront. ```{julia} function random_walk_algorithm!( - M::AbstractManifold, - mgo::ManifoldCostObjective, - p; + M::AbstractManifold, mgo::ManifoldCostObjective, p; σ = 0.1, retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)), stopping_criterion::StoppingCriterion=StopAfterIteration(200), @@ -280,8 +276,7 @@ function random_walk_algorithm!( dmgo = decorate_objective!(M, mgo; kwargs...) 
dmp = DefaultManoptProblem(M, dmgo) s = RandomWalkState(M, [1.0, 0.0, 0.0]; - σ=0.1, - retraction_method=retraction_method, stopping_criterion=stopping_criterion, + σ=0.1, retraction_method=retraction_method, stopping_criterion=stopping_criterion, ) ds = decorate_state!(s; kwargs...) solve!(dmp, ds) @@ -303,17 +298,23 @@ end ## Ease of Use II: the state summary -For the case that you set `return_state=true` the solver should return a summary of the run. When a `show` method is provided, users can easily read such summary in a terminal. +For the case that you set `return_state=true` the solver should return a summary of the run. +Internally this is mapped to the (human-readable) display of a `status_summary`. It should reflect its main parameters, if they are not too verbose and provide information about the reason it stopped and whether this indicates convergence. +Compared to Julia's own `show(io::IO, state)` method, the status summary aims to be +in prose text. - Here it would for example look like +Here it would for example look like ```{julia} #| output: false -import Base: show -function show(io::IO, rws::RandomWalkState) +import Manopt: status_summary +function status_summary(rws::RandomWalkState; context::Symbol = :default) + (context === :short) && return repr(rws) i = get_count(rws, :Iterations) + conv_inl = (i > 0) ? (indicates_convergence(rws.stop) ? " (converged" : " (stopped") * " after $i iterations)" : "" + (context === :inline) && return "A solver state for the random walk algorithm$(conv_inl)" Iter = (i > 0) ? "After $i iterations\n" : "" Conv = indicates_convergence(rws.stop) ? 
"Yes" : "No" s = """ @@ -324,10 +325,9 @@ function show(io::IO, rws::RandomWalkState) * σ : $(rws.σ) ## Stopping criterion - - $(status_summary(rws.stop)) + $(status_summary(rws.stop; context = context)) This indicates convergence: $Conv""" - return print(io, s) + return s end ``` diff --git a/tutorials/Project.toml b/tutorials/Project.toml index db96359a8d..a2b2a31981 100644 --- a/tutorials/Project.toml +++ b/tutorials/Project.toml @@ -2,8 +2,10 @@ ADTypes = "47edcb42-4c32-4615-8424-f2b9edc5f35b" BenchmarkTools = "6e4b80f9-dd63-53aa-95a3-0cdb28fa8baf" Colors = "5ae59095-9a9b-59fe-a467-6f913c188581" +DifferentiationInterface = "a0c0ee7d-e4b9-4e03-894e-1c5f64a51d63" Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f" FiniteDifferences = "26cc04aa-876d-5657-8c51-4c34ba976000" +ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210" LRUCache = "8ac3fa9e-de4c-5943-b1dc-09c6b5f20637" LinearAlgebra = "37e2e46d-f89d-539d-b4ee-838fcccc9c8e" ManifoldDiff = "af67fdf4-a580-4b9f-bbec-742ef357defd" @@ -21,8 +23,10 @@ Manopt = {path = ".."} ADTypes = "1" BenchmarkTools = "1" Colors = "0.12, 0.13" +DifferentiationInterface = "0.7" Distributions = "0.25" FiniteDifferences = "0.12" +ForwardDiff = "1" LRUCache = "1.4" ManifoldDiff = "0.4" Manifolds = "0.11"