docs/src/functions.md
# Functions and constraints
Once an expression is created it is possible to create the `Term`s defining the optimization problem.
These can consist of either [Smooth functions](@ref), [Nonsmooth functions](@ref), [Inequality constraints](@ref)
or [Equality constraints](@ref).
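As a minimal sketch (the problem data below is random and purely illustrative; `ls`, `norm`, and the constraint syntax appear in the examples later on this page, and an integer-sized `Variable` constructor is assumed alongside the array constructor shown below), each kind of term can be created directly from an expression:

```julia
using StructuredOptimization

x = Variable(5)                # a 5-dimensional optimization variable
A, y = randn(3, 5), randn(3)   # illustrative problem data

smooth_term    = ls(A*x - y)   # smooth function: least-squares penalty
nonsmooth_term = norm(x, 1)    # nonsmooth function: l1-norm
constraint     = x >= 0.0      # inequality constraint
```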
## Smooth functions
## Smoothing
Sometimes the optimization problem might involve nonsmooth terms which do not have efficiently computable proximal mappings. It is possible to *smooth* these terms by means of the *Moreau envelope*.
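For a proper closed convex function $g$ and a parameter $\gamma > 0$, the Moreau envelope is

```math
g^{\gamma} (\mathbf{x}) = \min_{\mathbf{z}} \left\{ g(\mathbf{z}) + \tfrac{1}{2 \gamma} \| \mathbf{z} - \mathbf{x} \|^2 \right\},
```

which is differentiable everywhere and has the same minimizers as $g$.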
```@docs
smooth
```
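For instance, here is a minimal sketch of its use (the data is illustrative, and `smooth` is assumed to take the term together with the envelope parameter $\gamma$, as documented above): an $l_1$ data-fit term is smoothed so that a proximal gradient algorithm can handle the remaining nonsmooth regularizer.

```julia
using StructuredOptimization

x = Variable(5)                         # illustrative sizes and data
A, y, λ = randn(3, 5), randn(3), 1e-2

# replace the nonsmooth l1 data fit with its Moreau envelope (γ = 1.0),
# leaving only the l1 regularizer as the nonsmooth term
@minimize smooth(norm(A*x - y, 1), 1.0) + λ*norm(x, 1)
```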
## Duality
In some cases it is more convenient to solve the *dual problem* instead of the primal problem. It is possible to convert a problem into its dual by means of the *convex conjugate*.
See the [Total Variation demo](https://github.com/kul-forbes/StructuredOptimization.jl/blob/master/demos/TotalVariationDenoising.ipynb) for an example of this procedure.
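As a sketch of the underlying construction: for a problem of the form $\min_{\mathbf{x}} f(A \mathbf{x}) + g(\mathbf{x})$, the dual problem built from the convex conjugates $f^*$ and $g^*$ is

```math
\max_{\mathbf{y}} \ - f^* (\mathbf{y}) - g^* (-A^* \mathbf{y}),
```

whose optimal value matches the primal one under standard qualification conditions.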
Here the squared norm $\tfrac{1}{2} \| \mathbf{A} \mathbf{x} - \mathbf{y} \|^2$ is a *smooth* function $f$, whereas the $l_1$-norm is a *nonsmooth* function $g$. This problem can be solved with only a few lines of code:
```julia
julia> using StructuredOptimization

# x, A, y and λ are assumed to be defined as in the problem above
julia> @minimize ls( A*x - y ) + λ*norm(x, 1)

julia> ~x # inspect solution
```
It is possible to access the solution by typing `~x`.
By default variables are initialized with `Array`s of zeros.
Different initializations can be set during construction `x = Variable( [1.; 0.; ...] )` or by assignment `~x .= [1.; 0.; ...]`.
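For example, a three-dimensional variable (the values here are illustrative) can be initialized in either of these ways:

```julia
using StructuredOptimization

x = Variable([1.0; 0.0; 0.0])  # initial value supplied at construction
~x .= [1.0; 0.0; 0.0]          # or assigned in place after construction
```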
## Constrained optimization
Constrained optimization is also encompassed by the [Standard problem formulation](@ref): for a nonempty set $\mathcal{S}$, the constraint $\mathbf{x} \in \mathcal{S}$ can be expressed through an indicator function, as shown below.
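In standard notation this indicator function is

```math
\delta_{\mathcal{S}} (\mathbf{x}) =
\begin{cases}
0 & \text{if } \mathbf{x} \in \mathcal{S}, \\
+\infty & \text{otherwise},
\end{cases}
```

and adding it to the cost as a nonsmooth term $g$ forces the solution to lie in $\mathcal{S}$.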
Currently `StructuredOptimization.jl` supports only *proximal gradient algorithms* (i.e., *forward-backward splitting* based), which require specific properties of the nonsmooth functions and constraints to be applicable. In particular, the nonsmooth functions must have an *efficiently computable proximal mapping*.
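Recall that the proximal mapping of a function $g$ with parameter $\gamma > 0$ is

```math
\mathrm{prox}_{\gamma g} (\mathbf{x}) = \arg\min_{\mathbf{z}} \left\{ g(\mathbf{z}) + \tfrac{1}{2 \gamma} \| \mathbf{z} - \mathbf{x} \|^2 \right\},
```

and *efficiently computable* means that this minimization admits a closed-form or otherwise cheap solution.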
If we express the nonsmooth function $g$ as the composition of
a function $\tilde{g}$ with a linear operator $A$:
```math
g(\mathbf{x}) = \tilde{g}(A \mathbf{x})
```
then the proximal mapping of $g$ is efficiently computable if either of the following holds:

1. Operator $A$ is a *tight frame*, namely it satisfies $A A^* = \mu Id$, where $\mu \geq 0$, $A^*$ is the adjoint of $A$, and $Id$ is the identity operator.

2. Function $g$ is the *separable sum* $g(\mathbf{x}) = \sum_j h_j (B_j \mathbf{x}_j)$, where $\mathbf{x}_j$ are non-overlapping slices of $\mathbf{x}$, and $B_j$ are tight frames.
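When the first rule holds with $\mu > 0$, the proximal mapping of $g = \tilde{g} \circ A$ has the standard closed form

```math
\mathrm{prox}_{\gamma g} (\mathbf{x}) = \mathbf{x} + \mu^{-1} A^* \left( \mathrm{prox}_{\mu \gamma \tilde{g}} (A \mathbf{x}) - A \mathbf{x} \right),
```

which only requires the proximal mapping of $\tilde{g}$ and one application each of $A$ and $A^*$.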
Let us analyze these rules with a series of examples.
The LASSO example above satisfies the first rule:
```julia
julia> @minimize ls( A*x - y ) + λ*norm(x, 1)
```
since the nonsmooth function $\lambda \| \cdot \|_1$ is not composed with any operator (or equivalently is composed with $Id$, which is a tight frame).
The following problem would also be accepted:
```julia
julia> @minimize ls( A*x - y ) + λ*norm(dct(x), 1)
```
since the discrete cosine transform (DCT) is orthogonal and is therefore a tight frame. On the other hand, the following problem
```julia
julia> @minimize ls( A*x - y ) + λ*norm(x, 1) st x >= 1.0
```
cannot be solved through proximal gradient algorithms, since the second rule would be violated.
Here the constraint would be converted into an indicator function and the nonsmooth function $g$ can be written as the sum:
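```math
g(\mathbf{x}) = \lambda \| \mathbf{x} \|_1 + \delta_{\mathcal{S}} (\mathbf{x}), \qquad \mathcal{S} = \{ \mathbf{x} : x_i \geq 1 \ \forall i \},
```

where both terms act on the whole of $\mathbf{x}$: the slices overlap, and the separable sum rule cannot be applied.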