Commit 3c705f5

Entropy signature and folder structure cleanup (#168)
* Update gitignore
* Unified signature for `entropy`
* Fix diversity test
* Reorganize tests
* Same for `probabilities`
* Add deprecation
* Default to Shannon entropy
* Resolve ambiguous method signature by being less strict
* Fix examples
* `base` belongs in the entropy, not the estimator
* Simplify constructor
* Remove redundant type parameter
* Fix tests
1 parent cce1c15 commit 3c705f5

69 files changed (+701, -531 lines)

.DS_Store (6 KB)

Binary file not shown.

.github/PULL_REQUEST_TEMPLATE/new_entropy_template.md (+1, -1)

@@ -13,7 +13,7 @@ project [devdocs](https://juliadynamics.github.io/Entropies.jl/dev/devdocs/).
 Ticking the boxes below will help us provide good feedback and speed up the review process.
 Partial PRs are welcome too, and we're happy to help if you're stuck on something.
 
-- [ ] The new entropy subtypes `Entropy` or `IndirectEntropy`.
+- [ ] The new entropy (estimator) subtypes `Entropy` (`EntropyEstimator`).
 - [ ] The new entropy has an informative docstring, which is referenced in
   `docs/src/entropies.md`.
 - [ ] Relevant sources are cited in the docstring.

docs/.DS_Store (6 KB)

Binary file not shown.

docs/src/examples.md (+8, -8)

@@ -137,9 +137,9 @@ for (i, r) in enumerate(rs)
     lyaps[i] = lyapunov(ds, N_lyap)
 
     x = trajectory(ds, N_ent) # time series
-    hperm = entropy(x, SymbolicPermutation(; m, τ))
-    hwtperm = entropy(x, SymbolicWeightedPermutation(; m, τ))
-    hampperm = entropy(x, SymbolicAmplitudeAwarePermutation(; m, τ))
+    hperm = entropy(SymbolicPermutation(; m, τ), x)
+    hwtperm = entropy(SymbolicWeightedPermutation(; m, τ), x)
+    hampperm = entropy(SymbolicAmplitudeAwarePermutation(; m, τ), x)
 
     hs_perm[i] = hperm; hs_wtperm[i] = hwtperm; hs_ampperm[i] = hampperm
 end

@@ -173,7 +173,7 @@ using DynamicalSystemsBase, CairoMakie, Distributions
 N = 500
 D = Dataset(sort([rand(𝒩) for i = 1:N]))
 x, y = columns(D)
-p = probabilities(D, NaiveKernel(1.5))
+p = probabilities(NaiveKernel(1.5), D)
 fig, ax = scatter(D[:, 1], D[:, 2], zeros(N);
     markersize=8, axis=(type = Axis3,)
 )

@@ -301,9 +301,9 @@ des = zeros(length(windows))
 pes = zeros(length(windows))
 
 m, c = 2, 6
-est_de = Dispersion(encoding = GaussianCDFEncoding(c), m = m, τ = 1)
+est_de = Dispersion(c = c, m = m, τ = 1)
 for (i, window) in enumerate(windows)
-    des[i] = entropy_normalized(Renyi(), y[window], est_de)
+    des[i] = entropy_normalized(Renyi(), est_de, y[window])
 end
 
 fig = Figure()

@@ -344,8 +344,8 @@ for N in (N1, N2)
     local w = trajectory(Systems.lorenz(), N÷10; Δt = 0.1, Ttr = 100)[:, 1] # chaotic
 
     for q in (x, y, z, w)
-        h = entropy(q, PowerSpectrum())
-        n = entropy_normalized(q, PowerSpectrum())
+        h = entropy(PowerSpectrum(), q)
+        n = entropy_normalized(PowerSpectrum(), q)
         println("entropy: $(h), normalized: $(n).")
     end
 end
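
The pattern across these hunks is uniform: the estimator moves from the last argument to the first. A minimal sketch of the new calling convention (data and parameters are illustrative; assumes `entropy(est, x)` defaults to Shannon entropy, per the commit message):

```julia
using Entropies

x = rand(10_000)                           # an arbitrary sample timeseries
est = SymbolicPermutation(; m = 3, τ = 1)

# New unified signature: estimator first, data last.
h = entropy(est, x)         # defaults to Shannon entropy
p = probabilities(est, x)   # probabilities follow the same ordering
```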

docs/src/index.md (+5, -1)

@@ -40,7 +40,11 @@ Thus, any of the implemented [probabilities estimators](@ref probabilities_estim
 
 These names are commonplace, and so in Entropies.jl we provide convenience functions like [`entropy_wavelet`](@ref). However, it should be noted that these functions really aren't anything more than 2-lines-of-code wrappers that call [`entropy`](@ref) with the appropriate [`ProbabilitiesEstimator`](@ref).
 
-There are only a few exceptions to this rule, which are quantities that are able to compute Shannon entropies via alternate means, without explicitly computing some probability distributions. These are `IndirectEntropy` instances, such as [`Kraskov`](@ref).
+In addition to `ProbabilitiesEstimators`, we also provide [`EntropyEstimator`](@ref)s,
+which compute entropies via alternate means, without explicitly computing some
+probability distribution. For example, the [`Kraskov`](@ref) estimator computes Shannon
+entropy via a nearest neighbor algorithm, while the [`Zhu`](@ref) estimator computes
+Shannon entropy using order statistics.
 
 ### Other complexity measures
 
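
To make the two estimator families concrete, here is a hedged sketch of both routes after this commit (data and parameters are illustrative; assumes `Dataset` is available as in the docs examples, and that `entropy` still accepts a precomputed `Probabilities`, as in earlier releases):

```julia
using Entropies

D = Dataset(rand(5_000, 3))

# ProbabilitiesEstimator route: an explicit distribution is computed first,
# then an entropy is evaluated on it.
p = probabilities(ValueHistogram(0.1), D)
h_hist = entropy(Renyi(), p)

# EntropyEstimator route: Shannon entropy is estimated directly from
# nearest-neighbor statistics; no explicit distribution is constructed.
h_nn = entropy(Renyi(), Kraskov(; k = 3), D)
```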

src/deprecations.jl (+8)

@@ -8,6 +8,14 @@ function probabilities(x::Vector_or_Dataset, ε::Union{Real, Vector{<:Real}})
     probabilities(x, ValueHistogram(ε))
 end
 
+function probabilities(x, est::ProbabilitiesEstimator)
+    @warn """
+    `probabilities(x, est::ProbabilitiesEstimator)`
+    is deprecated, use `probabilities(est::ProbabilitiesEstimator, x)` instead.
+    """
+    return probabilities(est, x)
+end
+
 export genentropy, permentropy
 
 function permentropy(x; τ = 1, m = 3, base = MathConstants.e)
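
For reference, the deprecated argument order keeps working: it warns once and forwards to the new method. A sketch of what a caller sees (estimator choice illustrative):

```julia
using Entropies

x = rand(1_000)
est = ValueHistogram(0.1)

p_old = probabilities(x, est)   # hits the method above: warns, then forwards
p_new = probabilities(est, x)   # the supported ordering
p_old == p_new                  # true; both end up in the same method
```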

src/entropies/convenience_definitions.jl (+9, -9)

@@ -23,26 +23,26 @@ for the weighted/amplitude-aware versions.
 """
 function entropy_permutation(x; base = 2, kwargs...)
     est = SymbolicPermutation(; kwargs...)
-    entropy(Shannon(; base), x, est)
+    entropy(Shannon(; base), est, x)
 end
 
 """
-    entropy_spatial_permutation(x, stencil, periodic = true; kwargs...)
+    entropy_spatial_permutation(x, stencil; periodic = true, kwargs...)
 
 Compute the spatial permutation entropy of `x` given the `stencil`.
 Here `x` must be a matrix or higher dimensional `Array` containing spatial data.
 This function is just a convenience call to:
 
 ```julia
 est = SpatialSymbolicPermutation(stencil, x, periodic)
-entropy(Renyi(; kwargs...), x, est)
+entropy(Renyi(; kwargs...), est, x)
 ```
 
 See [`SpatialSymbolicPermutation`](@ref) for more info, or how to encode stencils.
 """
-function entropy_spatial_permutation(x, stencil, periodic = true; kwargs...)
+function entropy_spatial_permutation(x, stencil; periodic = true, kwargs...)
     est = SpatialSymbolicPermutation(stencil, x, periodic)
-    entropy(Renyi(; kwargs...), x, est)
+    entropy(Renyi(; kwargs...), est, x)
 end

@@ -52,14 +52,14 @@ Compute the wavelet entropy. This function is just a convenience call to:
 
 ```julia
 est = WaveletOverlap(wavelet)
-entropy(Shannon(base), x, est)
+entropy(Shannon(base), est, x)
 ```
 
 See [`WaveletOverlap`](@ref) for more info.
 """
 function entropy_wavelet(x; wavelet = Wavelets.WT.Daubechies{12}(), base = 2)
     est = WaveletOverlap(wavelet)
-    entropy(Shannon(; base), x, est)
+    entropy(Shannon(; base), est, x)
 end

@@ -69,12 +69,12 @@ Compute the dispersion entropy. This function is just a convenience call to:
 
 ```julia
 est = Dispersion(kwargs...)
-entropy(Shannon(base), x, est)
+entropy(Shannon(base), est, x)
 ```
 
 See [`Dispersion`](@ref) for more info.
 """
 function entropy_dispersion(x; base = 2, kwargs...)
     est = Dispersion(kwargs...)
-    entropy(Shannon(; base), x, est)
+    entropy(Shannon(; base), est, x)
 end
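
Each wrapper remains a two-liner around `entropy` with the new argument order. A sketch of the equivalence for the permutation wrapper (parameter values illustrative; assumes keyword forwarding works as documented above):

```julia
using Entropies

x = rand(2_000)

# The convenience wrapper...
h1 = entropy_permutation(x; base = 2, m = 3, τ = 1)

# ...expands to exactly the two lines it wraps:
est = SymbolicPermutation(; m = 3, τ = 1)
h2 = entropy(Shannon(; base = 2), est, x)

h1 == h2   # true: same computation either way
```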

src/entropies/entropies.jl (+1, -1)

@@ -3,4 +3,4 @@ include("tsallis.jl")
 include("curado.jl")
 include("streched_exponential.jl")
 include("convenience_definitions.jl")
-include("direct_entropies/direct_entropies.jl")
+include("estimators/estimators.jl")

src/entropies/direct_entropies/nearest_neighbors/KozachenkoLeonenko.jl renamed to src/entropies/estimators/nearest_neighbors/KozachenkoLeonenko.jl (+14, -8)

@@ -1,32 +1,38 @@
 export KozachenkoLeonenko
 
 """
-    KozachenkoLeonenko <: IndirectEntropy
+    KozachenkoLeonenko <: EntropyEstimator
     KozachenkoLeonenko(; k::Int = 1, w::Int = 1, base = 2)
 
-An indirect entropy estimator used in [`entropy`](@ref)`(KozachenkoLeonenko(), x)` to
-estimate the Shannon entropy of `x` (a multi-dimensional `Dataset`) to the given
-`base` using nearest neighbor searches using the method from Kozachenko &
-Leonenko[^KozachenkoLeonenko1987], as described in Charzyńska and Gambin[^Charzyńska2016].
+The `KozachenkoLeonenko` estimator computes the [`Shannon`](@ref) [`entropy`](@ref) of `x`
+(a multi-dimensional `Dataset`) to the given `base`, based on nearest neighbor searches
+using the method from Kozachenko & Leonenko (1987)[^KozachenkoLeonenko1987], as described in
+Charzyńska and Gambin[^Charzyńska2016].
 
 `w` is the Theiler window, which determines if temporal neighbors are excluded
 during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
 when searching for neighbours).
 
 In contrast to [`Kraskov`](@ref), this estimator uses only the *closest* neighbor.
 
+See also: [`entropy`](@ref).
+
 [^Charzyńska2016]: Charzyńska, A., & Gambin, A. (2016). Improvement of the k-NN entropy
     estimator with applications in systems biology. Entropy, 18(1), 13.
 [^KozachenkoLeonenko1987]: Kozachenko, L. F., & Leonenko, N. N. (1987). Sample estimate of
     the entropy of a random vector. Problemy Peredachi Informatsii, 23(2), 9-16.
 """
-@Base.kwdef struct KozachenkoLeonenko{B} <: IndirectEntropy
+@Base.kwdef struct KozachenkoLeonenko{B} <: EntropyEstimator
     w::Int = 1
     base::B = 2
 end
 
-function entropy(e::KozachenkoLeonenko, x::AbstractDataset{D, T}) where {D, T}
-    (; w, base) = e
+function entropy(e::Renyi, est::KozachenkoLeonenko, x::AbstractDataset{D, T}) where {D, T}
+    e.q == 1 || throw(ArgumentError(
+        "Renyi entropy with q = $(e.q) not implemented for $(typeof(est)) estimator"
+    ))
+    (; w, base) = est
+
     N = length(x)
     ρs = maximum_neighbor_distances(x, w, 1)
     # The estimated entropy has "unit" [nats]
src/entropies/direct_entropies/nearest_neighbors/Kraskov.jl renamed to src/entropies/estimators/nearest_neighbors/Kraskov.jl (+11, -8)

@@ -1,31 +1,34 @@
 export Kraskov
 
 """
-    Kraskov <: IndirectEntropy
+    Kraskov <: EntropyEstimator
     Kraskov(; k::Int = 1, w::Int = 1, base = 2)
 
-An indirect entropy used in [`entropy`](@ref)`(Kraskov(), x)` to estimate the Shannon
-entropy of `x` (a multi-dimensional `Dataset`) to the given
-`base` using `k`-th nearest neighbor searches as in [^Kraskov2004].
+The `Kraskov` estimator computes the [`Shannon`](@ref) [`entropy`](@ref) of `x`
+(a multi-dimensional `Dataset`) to the given `base`, using the `k`-th nearest neighbor
+search method from [^Kraskov2004].
 
 `w` is the Theiler window, which determines if temporal neighbors are excluded
 during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
 when searching for neighbours).
 
-See also: [`KozachenkoLeonenko`](@ref).
+See also: [`entropy`](@ref), [`KozachenkoLeonenko`](@ref).
 
 [^Kraskov2004]:
     Kraskov, A., Stögbauer, H., & Grassberger, P. (2004).
     Estimating mutual information. Physical review E, 69(6), 066138.
 """
-Base.@kwdef struct Kraskov{B} <: IndirectEntropy
+Base.@kwdef struct Kraskov{B} <: EntropyEstimator
     k::Int = 1
     w::Int = 1
     base::B = 2
 end
 
-function entropy(e::Kraskov, x::AbstractDataset{D, T}) where {D, T}
-    (; k, w, base) = e
+function entropy(e::Renyi, est::Kraskov, x::AbstractDataset{D, T}) where {D, T}
+    e.q == 1 || throw(ArgumentError(
+        "Renyi entropy with q = $(e.q) not implemented for $(typeof(est)) estimator"
+    ))
+    (; k, w, base) = est
     N = length(x)
     ρs = maximum_neighbor_distances(x, w, k)
     # The estimated entropy has "unit" [nats]
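
`Kraskov` generalizes the closest-neighbor idea to the `k`-th neighbor. A sketch comparing the two estimators on data with a known answer (sizes and `k` illustrative; assumes `Renyi()` defaults to `q = 1`):

```julia
using Entropies

# 2D standard Gaussian: the true Shannon entropy is 1 + log(2π) ≈ 2.8379 nats.
D = Dataset(randn(20_000, 2))

h_kr = entropy(Renyi(), Kraskov(; k = 4, w = 0, base = MathConstants.e), D)
h_kl = entropy(Renyi(), KozachenkoLeonenko(; w = 0, base = MathConstants.e), D)
# Both estimates should approach 2.8379 as the sample size grows.
```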

src/entropies/direct_entropies/nearest_neighbors/Zhu.jl renamed to src/entropies/estimators/nearest_neighbors/Zhu.jl (+17, -12)

@@ -1,38 +1,43 @@
 export Zhu
 
 """
-    Zhu <: IndirectEntropy
-    Zhu(k = 1, w = 0, base = MathConstants.e)
+    Zhu <: EntropyEstimator
+    Zhu(k = 1, w = 0)
 
-The `Zhu` indirect entropy estimator (Zhu et al., 2015)[^Zhu2015] estimates the Shannon
-entropy of `x` (a multi-dimensional `Dataset`) to the given `base`, by approximating
-probabilities within hyperrectangles surrounding each point `xᵢ ∈ x` using
+The `Zhu` estimator (Zhu et al., 2015)[^Zhu2015] computes the [`Shannon`](@ref)
+[`entropy`](@ref) of `x` (a multi-dimensional `Dataset`), by
+approximating probabilities within hyperrectangles surrounding each point `xᵢ ∈ x`
 using `k` nearest neighbor searches.
 
-This estimator is an extension to [`KozachenkoLeonenko`](@ref).
-
 `w` is the Theiler window, which determines if temporal neighbors are excluded
 during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
 when searching for neighbours).
 
+This estimator is an extension to [`KozachenkoLeonenko`](@ref).
+
+See also: [`entropy`](@ref).
+
 [^Zhu2015]:
     Zhu, J., Bellanger, J. J., Shu, H., & Le Bouquin Jeannès, R. (2015). Contribution to
     transfer entropy estimation via the k-nearest-neighbors approach. Entropy, 17(6),
     4173-4201.
 """
-Base.@kwdef struct Zhu{B} <: IndirectEntropy
+Base.@kwdef struct Zhu <: EntropyEstimator
     k::Int = 1
     w::Int = 0
-    base::B = MathConstants.e
 end
 
-function entropy(e::Zhu, x::AbstractDataset{D, T}) where {D, T}
-    (; k, w, base) = e
+function entropy(e::Renyi, est::Zhu, x::AbstractDataset{D, T}) where {D, T}
+    e.q == 1 || throw(ArgumentError(
+        "Renyi entropy with q = $(e.q) not implemented for $(typeof(est)) estimator"
+    ))
+    (; k, w) = est
+
     N = length(x)
     tree = KDTree(x, Euclidean())
     nn_idxs = bulkisearch(tree, x, NeighborNumber(k), Theiler(w))
     h = digamma(N) + mean_logvolumes(x, nn_idxs, N) - digamma(k) + (D - 1) / k
-    return h / log(base, MathConstants.e)
+    return h / log(e.base, MathConstants.e)
 end
 
 function mean_logvolumes(x, nn_idxs, N::Int)
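
Note the refactoring detail here: `base` now lives on the entropy (`e.base`), not on the estimator, matching the commit message. A hedged sketch (assumes `Renyi` accepts a `base` keyword, consistent with `e.base` above):

```julia
using Entropies

D = Dataset(randn(20_000, 2))

# After this commit, `base` is passed via the entropy type, not via `Zhu`.
h = entropy(Renyi(; base = MathConstants.e), Zhu(; k = 4, w = 0), D)
```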

src/entropies/direct_entropies/nearest_neighbors/ZhuSingh.jl renamed to src/entropies/estimators/nearest_neighbors/ZhuSingh.jl (+14, -14)

@@ -1,16 +1,16 @@
 using DelayEmbeddings: minmaxima
 using SpecialFunctions: digamma
-using Entropies: Entropy, IndirectEntropy
+using Entropies: Entropy, EntropyEstimator
 using Neighborhood: KDTree, Chebyshev, bulkisearch, Theiler, NeighborNumber
 
 export ZhuSingh
 
 """
-    ZhuSingh <: IndirectEntropy
-    ZhuSingh(k = 1, w = 0, base = MathConstants.e)
+    ZhuSingh <: EntropyEstimator
+    ZhuSingh(k = 1, w = 0)
 
-The `ZhuSingh` indirect entropy estimator (Zhu et al., 2015)[^Zhu2015] estimates the Shannon
-entropy of `x` (a multi-dimensional `Dataset`) to the given `base`.
+The `ZhuSingh` estimator (Zhu et al., 2015)[^Zhu2015] computes the [`Shannon`](@ref)
+[`entropy`](@ref) of `x` (a multi-dimensional `Dataset`).
 
 Like [`Zhu`](@ref), this estimator approximates probabilities within hyperrectangles
 surrounding each point `xᵢ ∈ x` using `k` nearest neighbor searches. However,

@@ -21,6 +21,8 @@ This estimator is an extension to the entropy estimator in Singh et al. (2003).
 during neighbor searches (defaults to `0`, meaning that only the point itself is excluded
 when searching for neighbours).
 
+See also: [`entropy`](@ref).
+
 [^Zhu2015]:
     Zhu, J., Bellanger, J. J., Shu, H., & Le Bouquin Jeannès, R. (2015). Contribution to
     transfer entropy estimation via the k-nearest-neighbors approach. Entropy, 17(6),

@@ -30,24 +32,22 @@ when searching for neighbours).
     neighbor estimates of entropy. American journal of mathematical and management
     sciences, 23(3-4), 301-321.
 """
-Base.@kwdef struct ZhuSingh{B} <: IndirectEntropy
+Base.@kwdef struct ZhuSingh <: EntropyEstimator
     k::Int = 1
     w::Int = 0
-    base::B = MathConstants.e
-
-    function ZhuSingh(k::Int, w::Int, base::B) where B
-        new{B}(k, w, base)
-    end
 end
 
-function entropy(e::ZhuSingh, x::AbstractDataset{D, T}) where {D, T}
-    (; k, w, base) = e
+function entropy(e::Renyi, est::ZhuSingh, x::AbstractDataset{D, T}) where {D, T}
+    e.q == 1 || throw(ArgumentError(
+        "Renyi entropy with q = $(e.q) not implemented for $(typeof(est)) estimator"
+    ))
+    (; k, w) = est
     N = length(x)
     tree = KDTree(x, Euclidean())
     nn_idxs = bulkisearch(tree, x, NeighborNumber(k), Theiler(w))
     mean_logvol, mean_digammaξ = mean_logvolumes_and_digamma(x, nn_idxs, N, k)
     h = digamma(N) + mean_logvol - mean_digammaξ
-    return h / log(base, MathConstants.e)
+    return h / log(e.base, MathConstants.e)
 end
 
 function mean_logvolumes_and_digamma(x, nn_idxs, N::Int, k::Int)
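
As the two `entropy` bodies above show, `ZhuSingh` replaces `Zhu`'s fixed `digamma(k)` correction with a per-point `mean_digammaξ` term. A final sketch contrasting the two (illustrative sizes; same `Renyi` keyword assumption as above):

```julia
using Entropies

D = Dataset(randn(20_000, 2))

h_zhu = entropy(Renyi(; base = MathConstants.e), Zhu(; k = 4), D)
h_zs  = entropy(Renyi(; base = MathConstants.e), ZhuSingh(; k = 4), D)
# For a 2D standard Gaussian, both should approach 1 + log(2π) ≈ 2.8379 nats.
```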
