
Commit 407c8ff

docs rewording and reformatting
1 parent 8b913be commit 407c8ff

File tree: 4 files changed, +51 -73 lines changed

docs/src/expressions.md

+11 -12
@@ -1,9 +1,9 @@
# Expressions

-With `StructuredOptimization.jl` you can easily create mathematical expressions.
-
-Firstly, [Variables](@ref) must be defined: various [Mappings](@ref) can then be applied
-following the application of [Functions and constraints](@ref) to create the `Term`s that define the optimization problem.
+With `StructuredOptimization.jl` you can easily create mathematical expressions.
+Firstly, [Variables](@ref) must be defined: various [Mappings](@ref) can then
+be applied following the application of [Functions and constraints](@ref) to
+create the `Term`s that define the optimization problem.

## Variables

@@ -12,9 +12,9 @@ following the application of [Functions and constraints](@ref) to create the `Te
```@docs
Variable
```
-!!! note
+!!! note

-    `StructuredOptimization.jl` supports complex variables. It is possible to create them by specifying the type
+    `StructuredOptimization.jl` supports complex variables. It is possible to create them by specifying the type
    `Variable(Complex{Float64}, 10)` or by initializing them with a complex array `Variable(randn(10)+im*randn(10))`.

### Utilities
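For illustration, the two constructors mentioned in the note, as a minimal runnable sketch (the length 10 is arbitrary):

```julia
using StructuredOptimization

# Typed construction: a zero-initialized complex variable of length 10.
x = Variable(Complex{Float64}, 10)

# Construction from an existing complex array.
z = Variable(randn(10) + im*randn(10))

~z  # access the underlying data array
```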
@@ -39,11 +39,11 @@ eltype

## Mappings

-As shown in the [Quick tutorial guide](@ref) it is possible to apply different mappings to the variables
-using a simple syntax.
+As shown in the [Quick tutorial guide](@ref) it is possible to apply different mappings to the variables
+using a simple syntax.

-Alternatively, as shown in [Multiplying expressions](@ref), it is possible to define the mappings using
-[`AbstractOperators.jl`](https://github.com/kul-forbes/ProximalAlgorithms.jl) and to apply them
+Alternatively, as shown in [Multiplying expressions](@ref), it is possible to define the mappings using
+[`AbstractOperators.jl`](https://github.com/kul-forbes/AbstractOperators.jl) and to apply them
to the variable (or expression) through multiplication.

### Basic mappings
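For illustration, the two styles side by side; `dct(x)` is the function syntax used elsewhere in these docs, while the `DCT` constructor is an assumed operator from `AbstractOperators.jl`:

```julia
using StructuredOptimization
using AbstractOperators

x = Variable(10)

# Function syntax: the mapping is applied directly to the variable.
ex1 = dct(x)

# Multiplication syntax: an AbstractOperators.jl operator times the variable
# (DCT(10) is assumed to build a 10-point discrete cosine transform).
ex2 = DCT(10)*x
```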
@@ -80,12 +80,11 @@ sigmoid

## Utilities

-It is possible to access the variables, mappings and displacement of an expression.
+It is possible to access the variables, mappings and displacement of an expression.
Notice that these commands work also for the `Term`s described in [Functions and constraints](@ref).

```@docs
variables
operator
displacement
```
-

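For illustration, a sketch of the three accessors on a simple affine expression; the comments state assumed return values:

```julia
using StructuredOptimization

x = Variable(5)
A, b = randn(3, 5), randn(3)
ex = A*x - b

variables(ex)     # the Variable(s) the expression depends on
operator(ex)      # the mapping applied to the variables
displacement(ex)  # the constant displacement (here assumed to be -b)
```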
docs/src/functions.md

+8 -8
@@ -1,8 +1,7 @@
# Functions and constraints

-Once an expression is created it is possible to create the `Term`s defining the optimization problem.
-
-These can consists of either [Smooth functions](@ref), [Nonsmooth functions](@ref), [Inequality constraints](@ref)
+Once an expression is created it is possible to create the `Term`s defining the optimization problem.
+These can consist of either [Smooth functions](@ref), [Nonsmooth functions](@ref), [Inequality constraints](@ref)
or [Equality constraints](@ref).

## Smooth functions
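For illustration, one term of each kind built from the same variable; the equality-constraint syntax on the last line is an assumption:

```julia
using StructuredOptimization

x = Variable(5)
A, y = randn(3, 5), randn(3)

t_smooth    = ls(A*x - y)       # smooth function: least squares
t_nonsmooth = 0.1*norm(x, 1)    # nonsmooth function: weighted l1 norm
t_ineq      = x >= 0.           # inequality constraint
t_eq        = norm(x, 2) == 1.  # equality constraint (assumed syntax)
```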
@@ -39,21 +38,22 @@ hingeloss

## Smoothing

-Sometimes the optimization problem might involve only non-smooth terms which do not lead to efficient proximal mappings. It is possible to *smooth* this terms by means of the *Moreau envelope*.
+Sometimes the optimization problem might involve non-smooth terms which
+do not have efficiently computable proximal mappings.
+It is possible to *smooth* these terms by means of the *Moreau envelope*.

```@docs
smooth
```

## Duality

-In some cases it is more convenient to solve the *dual problem* instead of the primal problem.
-
-It is possible to convert the primal problem into its dual form by means of the *convex conjugate*.
+In some cases it is more convenient to solve the *dual problem* instead
+of the primal problem. It is possible to convert a problem into its dual
+by means of the *convex conjugate*.

See the [Total Variation demo](https://github.com/kul-forbes/StructuredOptimization.jl/blob/master/demos/TotalVariationDenoising.ipynb) for an example of this procedure.

```@docs
conj
```
-

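For illustration, a sketch of both utilities; the second argument of `smooth` (the smoothing parameter) is assumed optional, see the docstrings above:

```julia
using StructuredOptimization

x = Variable(10)
A, y = randn(5, 10), randn(5)

# Moreau-envelope smoothing of a nonsmooth term:
t = smooth(norm(A*x - y, 1), 1.0)

# Convex conjugate of a term, used when solving the dual problem:
d = conj(norm(x, 1))
```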
docs/src/solvers.md

+9 -16
@@ -8,21 +8,17 @@

!!! note "Problem warm-starting"

-    By default *warm-starting* is always enabled.
-
-    For example, if two problems that utilize the same variables are solved consecutively,
+    By default *warm-starting* is always enabled.
+    For example, if two problems that utilize the same variables are solved consecutively,
    the second one will be automatically warm-started by the solution of the first one.
-
-    That is because the variables are always linked to their respective data vectors.
-
-    If one wants to avoid this, the optimization variables needs to be manually re-initialized
+    That is because the variables are always linked to their respective data vectors.
+    If one wants to avoid this, the optimization variables need to be manually re-initialized
    before solving the second problem, e.g. to a vector of zeros: `~x .= 0.0`.


## Specifying solver and options

As shown above it is possible to choose the type of algorithm and specify its options by creating a `Solver` object.
-
Currently, the following algorithms are supported:

* *Proximal Gradient (PG)* [[1]](http://www.mit.edu/~dimitrib/PTseng/papers/apgm.pdf), [[2]](http://epubs.siam.org/doi/abs/10.1137/080716542)
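For illustration, a sketch of resetting a variable between two consecutive solves and of passing a configured `Solver`; the `with` keyword follows the package README, while the `tol`/`maxit` option names are assumptions:

```julia
using StructuredOptimization

x = Variable(10)
A, y = randn(5, 10), randn(5)

@minimize ls(A*x - y) + norm(x, 1)    # first problem
~x .= 0.0                             # re-initialize to avoid warm-starting
@minimize ls(A*x - y) + 2*norm(x, 1)  # second problem, solved from scratch

# Selecting the algorithm and its options through a Solver object:
@minimize ls(A*x - y) + norm(x, 1) with PG(tol = 1e-6, maxit = 10_000)
```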
@@ -39,8 +35,7 @@ PANOC

## Build and solve

-The macro [`@minimize`](@ref) automatically parse and solve the problem.
-
+The macro [`@minimize`](@ref) automatically parses and solves the problem.
An alternative syntax is given by the functions [`problem`](@ref) and [`solve`](@ref).

```@docs
@@ -50,12 +45,10 @@ solve

It is important to stress that the `Solver` objects created using
the functions above ([`PG`](@ref), [`FPG`](@ref), etc.)
-specify only the type of algorithm to be used together with its options.
-
-The actual solver
-(namely the one of [`ProximalAlgorithms.jl`](https://github.com/kul-forbes/ProximalAlgorithms.jl))
-is constructed altogether with the problem formulation.
-
+specify only the type of algorithm to be used together with its options.
+The actual solver
+(namely the one from [`ProximalAlgorithms.jl`](https://github.com/kul-forbes/ProximalAlgorithms.jl))
+is constructed together with the problem formulation.
The problem parsing procedure can be separated from the solver application using the functions [`build`](@ref) and [`solve!`](@ref).

```@docs

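For illustration, a sketch of the macro-free workflow; the exact call signatures of `problem` and `solve` are assumptions to be checked against the docstrings above:

```julia
using StructuredOptimization

x = Variable(10)
A, y = randn(5, 10), randn(5)

# Parse the terms into a problem, then solve it with a chosen algorithm:
p = problem(ls(A*x - y), norm(x, 1) <= 1.0)
solve(p, PG())
```

The `build`/`solve!` pair splits this same workflow further, separating the parsing step from the solving step.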
docs/src/tutorial.md

+23 -37
@@ -18,9 +18,7 @@ The *least absolute shrinkage and selection operator* (LASSO) belongs to this cl
\underset{ \mathbf{x} }{\text{minimize}} \ \tfrac{1}{2} \| \mathbf{A} \mathbf{x} - \mathbf{y} \|^2 + \lambda \| \mathbf{x} \|_1.
```

-Here the squared norm $\tfrac{1}{2} \| \mathbf{A} \mathbf{x} - \mathbf{y} \|^2$ is a *smooth* function $f$ wherelse the $l_1$-norm is a *nonsmooth* function $g$.
-
-This problem can be solved with only few lines of code:
+Here the squared norm $\tfrac{1}{2} \| \mathbf{A} \mathbf{x} - \mathbf{y} \|^2$ is a *smooth* function $f$, whereas the $l_1$-norm is a *nonsmooth* function $g$. This problem can be solved with only a few lines of code:

```julia
julia> using StructuredOptimization
@@ -53,16 +51,12 @@ julia> ~x # inspect solution


It is possible to access the solution by typing `~x`.
-
By default variables are initialized by `Array`s of zeros.
-
Different initializations can be set during construction `x = Variable( [1.; 0.; ...] )` or by assignment `~x .= [1.; 0.; ...]`.

## Constrained optimization

-Constrained optimization is also encompassed by the [Standard problem formulation](@ref):
-
-for a nonempty set $\mathcal{S}$ the constraint of
+Constrained optimization is also encompassed by the [Standard problem formulation](@ref): for a nonempty set $\mathcal{S}$ the constraint of

```math
\begin{align*}
@@ -81,9 +75,7 @@ g(\mathbf{x}) = \delta_{\mathcal{S}} (\mathbf{x}) = \begin{cases}
```

to obtain the standard form. Constraints are treated as *nonsmooth functions*.
-
This conversion is automatically performed by `StructuredOptimization.jl`.
-
For example, the non-negative deconvolution problem:

```math
@@ -110,28 +102,24 @@ julia> @minimize ls(conv(x,h)-y) st x >= 0.
!!! note

    The convolution mapping was applied to the variable `x` using `conv`.
-
-    `StructuredOptimization.jl` provides a set of functions that can be used to apply
-    specific operators to variables and create mathematical expression.
-
-    The available functions can be found in [Mappings](@ref).
-
+    `StructuredOptimization.jl` provides a set of functions that can be
+    used to apply specific operators to variables and create mathematical
+    expressions. The available functions can be found in [Mappings](@ref).
    In general it is more convenient to use these functions instead of matrices,
-    as these functions apply efficient algorithms for the forward and adjoint mappings leading to
-    *matrix free optimization*.
+    as these functions apply efficient algorithms for the forward and adjoint
+    mappings leading to *matrix free optimization*.

## Using multiple variables

-It is possible to use multiple variables which are allowed to be matrices or even tensors.
-
-For example a non-negative matrix factorization problem:
+It is possible to use multiple variables which are allowed to be matrices or even tensors. For example, a non-negative matrix factorization problem:

```math
\begin{align*}
\underset{ \mathbf{X}_1, \mathbf{X}_2 }{\text{minimize}} \ & \tfrac{1}{2} \| \mathbf{X}_1 \mathbf{X}_2 - \mathbf{Y} \| \\
\text{subject to} \ & \mathbf{X}_1 \geq 0, \ \mathbf{X}_2 \geq 0,
\end{align*}
```
+
can be solved using the following code:

```julia
@@ -146,57 +134,55 @@ julia> @minimize ls(X1*X2-Y) st X1 >= 0., X2 >= 0.

## Limitations

-Currently `StructuredOptimization.jl` supports only *Proximal Gradient (aka Forward Backward) algorithms*, which require specific properties of the nonsmooth functions and costraint to be applicable.
-
-In particular, the nonsmooth functions must lead to an *efficiently computable proximal mapping*.
+Currently `StructuredOptimization.jl` supports only *proximal gradient algorithms* (i.e., *forward-backward splitting* based), which require specific properties of the nonsmooth functions and constraints to be applicable. In particular, the nonsmooth functions must have an *efficiently computable proximal mapping*.

If we express the nonsmooth function $g$ as the composition of
a function $\tilde{g}$ with a linear operator $A$:
+
```math
g(\mathbf{x}) =
\tilde{g}(A \mathbf{x})
```
-then a proximal mapping of $g$ is efficiently computable if it satisifies the following properties:

-1. the mapping $A$ must be a *tight frame* namely it must satisfy $A A^* = \mu Id$, where $\mu \geq 0$ and $A^*$ is the adjoint of $A$ and $Id$ is the identity operator.
+then the proximal mapping of $g$ is efficiently computable if either of the following holds:

-2. if $A$ is not a tight frame, than it must be possible write $g$ as a *separable* sum $g(\mathbf{x}) = \sum_j h_j (B_j \mathbf{x}_j)$ with $\mathbf{x}_j$ being a non-overlapping slices of $\mathbf{x}$ and $B_j$ being tight frames.
+1. Operator $A$ is a *tight frame*, namely it satisfies $A A^* = \mu Id$, where $\mu \geq 0$, $A^*$ is the adjoint of $A$, and $Id$ is the identity operator.

-Let us analyze these rules with a series of examples.
+2. Function $g$ is the *separable sum* $g(\mathbf{x}) = \sum_j h_j (B_j \mathbf{x}_j)$, where $\mathbf{x}_j$ are non-overlapping slices of $\mathbf{x}$, and $B_j$ are tight frames.

+Let us analyze these rules with a series of examples.
The LASSO example above satisfies the first rule:
+
```julia
julia> @minimize ls( A*x - y ) + λ*norm(x, 1)
-
```
-since the non-smooth function $\lambda \| \cdot \|_1$ is not composed with any operator (or equivalently is composed with $Id$ which is a tight frame).

+since the non-smooth function $\lambda \| \cdot \|_1$ is not composed with any operator (or equivalently is composed with $Id$, which is a tight frame).
Also the following problem would be accepted:
+
```julia
julia> @minimize ls( A*x - y ) + λ*norm(dct(x), 1)
-
```
-since the discrete cosine transform (DCT) is orthogonal and is therefore a tight frame.

-On the other hand, the following problem
+since the discrete cosine transform (DCT) is orthogonal and is therefore a tight frame. On the other hand, the following problem
+
```julia
julia> @minimize ls( A*x - y ) + λ*norm(x, 1) st x >= 1.0
-
```
+
cannot be solved through proximal gradient algorithms, since the second rule would be violated.
Here the constraint would be converted into an indicator function and the nonsmooth function $g$ can be written as the sum:

```math
g(\mathbf{x}) = \lambda \| \mathbf{x} \|_1 + \delta_{\mathcal{S}} (\mathbf{x})
```

-which is not separable.
+which is not separable. On the other hand this problem would be accepted:

-On the other hand this problem would be accepted:
```julia
julia> @minimize ls( A*x - y ) + λ*norm(x[1:div(n,2)], 1) st x[div(n,2)+1:n] >= 1.0
-
```
+
as now the optimization variables $\mathbf{x}$ are partitioned into non-overlapping groups.

!!! note

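For illustration, a self-contained version of the last accepted problem, with explicit (arbitrary) data sizes:

```julia
using StructuredOptimization

n, m = 10, 5
A, y = randn(m, n), randn(m)
λ = 1e-2
x = Variable(n)

# The nonsmooth term and the constraint act on non-overlapping
# slices of x, so the separable-sum rule (rule 2) applies:
@minimize ls(A*x - y) + λ*norm(x[1:div(n,2)], 1) st x[div(n,2)+1:n] >= 1.0
```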