
Addressing review comments #99


Merged

Conversation

kahaaga
Member

@kahaaga kahaaga commented Sep 24, 2022

Hey, @Datseris. I've tried to address all your comments on #96 in this PR.

  • Added alphabet_length function that dispatches on estimators (throws error if alphabet length cannot be obtained)
  • Standardised API for normalized Tsallis, Renyi and Shannon entropies.
  • Added literature references justifying that normalization for symbolic entropy estimators is valid for all orders q.
  • Made distance_to_whitenoise part of public API.
  • Some minor formatting and language fixes.

Some things to potentially discuss are:

Naming (normalization interface)

  • Instead of a single entropy_normalized function, I used entropy_renyi_norm,
    entropy_tsallis_norm and entropy_shannon_norm. This mirrors the entropy_renyi and
    similar functions we already have. The alternative is to have entropy_normalized(x, est = Renyi(); kwargs...) or something along those lines. As we already discussed elsewhere, I don't think the latter approach is a good idea.
  • Do we provide normalized versions of the convenience functions too (entropy_permutation_norm), or leave it up to the user to figure out how to do so?

Normalization/alphabet length

Where to indicate which estimators implement alphabet_length? Alternatives are:

  • To throw an error in alphabet_length for estimators that don't support normalization, e.g. throw("unimplemented"). I favor explicitly throwing an informative error instead of silently
    returning nothing when an estimator doesn't implement alphabet_length, because this
    reduces the chance of user errors.
  • In the docstring for entropy_renyi_norm.
  • In each individual estimator under a ##Implements heading.
  • A combination.

For user-friendliness, I think either of the latter two is best. For code-maintenance purposes, the first alternative is best (which is what I've done in this PR).
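A minimal sketch of what such an error-throwing fallback could look like (the error type and message here are placeholders, not the final code):

alphabet_length(est::ProbabilitiesEstimator) = throw(ArgumentError(
    "`alphabet_length` is not implemented for estimator of type $(typeof(est)), " *
    "so normalized entropies cannot be computed with it."
))

# Estimators with a known alphabet then simply add a method, e.g.
alphabet_length(est::SymbolicPermutation) = factorial(est.m)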

@codecov

codecov bot commented Sep 24, 2022

Codecov Report

Merging #99 (b595031) into newapi/dispersion_estimators (6b58635) will decrease coverage by 1.43%.
The diff coverage is 51.35%.

@@                       Coverage Diff                        @@
##           newapi/dispersion_estimators      #99      +/-   ##
================================================================
- Coverage                         71.95%   70.51%   -1.44%     
================================================================
  Files                                31       33       +2     
  Lines                               756      770      +14     
================================================================
- Hits                                544      543       -1     
- Misses                              212      227      +15     
Impacted Files Coverage Δ
src/entropies/renyi.jl 48.38% <0.00%> (-1.62%) ⬇️
src/entropies/shannon.jl 0.00% <0.00%> (ø)
...tors/permutation_ordinal/SymbolicAmplitudeAware.jl 94.73% <0.00%> (-5.27%) ⬇️
...imators/permutation_ordinal/SymbolicPermutation.jl 88.37% <0.00%> (-2.11%) ⬇️
...permutation_ordinal/SymbolicWeightedPermutation.jl 94.44% <0.00%> (-5.56%) ⬇️
src/probabilities_estimators/utils.jl 0.00% <0.00%> (ø)
src/symbolization/GaussianSymbolization.jl 100.00% <ø> (ø)
src/entropies/normalized_entropy.jl 33.33% <33.33%> (ø)
.../probabilities_estimators/dispersion/dispersion.jl 72.22% <75.00%> (-8.55%) ⬇️
src/entropies/tsallis.jl 66.66% <77.77%> (+16.66%) ⬆️
... and 1 more


Member

@Datseris Datseris left a comment


Naming (normalization interface)

We don't have to do it this way, we can still have a "single function" without introducing any new stuff like Renyi. The key is that in Julia functions are their own types. Thus, we can have a single function

function normalize(entropy_function, args...; kwargs...)
end

which returns the normalized entropy. Instead of the common name normalize, we could use entropy_normalized. You define method dispatch like so:

function entropy_normalized(entropy_function::typeof(entropy_renyi), args...; kwargs...)
end

and this is a new method.

There are some advantages in this approach:

  1. We define a single API function with a single and rather concise docstring. It practically says that the entropy is normalized versus the entropy of the uniform distribution across the alphabet.
  2. If we add more entropy methods in the future, we can add them to this existing function instead.
  3. We can have cleaner code by defining this one function in its own separate file.

I am thinking that we need to define only a single function. Given the alphabet length, we can generate a Probabilities vector with all entries equal, and pass that to the entropy function directly, no matter what this entropy function is!

(throws error if alphabet length cannot be obtained)

Actually, in the docs you say it returns nothing already. So the question is, what should it do by default? At the moment it errors by default, which I'm still not sure about. I guess it's okay to leave it as is.

Comment on lines +13 to +14
julia> est = SymbolicPermutation(m = 4)
SymbolicPermutation{typeof(Entropies.isless_rand)}(1, 4, Entropies.isless_rand)
Member

Put a ; at the end of the est assignment here?

Comment on lines 24 to 26
alphabet_length(est::SymbolicPermutation)::Int = factorial(est.m)
alphabet_length(est::SymbolicWeightedPermutation)::Int = factorial(est.m)
alphabet_length(est::SymbolicAmplitudeAwarePermutation)::Int = factorial(est.m)
Member

This function should be extended in the same file the probabilities function is extended in, i.e., in the file of the estimator.

Comment on lines 20 to 22
function alphabet_length end

alphabet_length(est) =
Member

These two definitions should be combined. Also the second definition should type annotate est::ProbabilitiesEstimator.

@Datseris
Member

Here's my idea for a generic function:

function normalize(entropy_function, x, est::ProbabilitiesEstimator, args...; kwargs...)
    L = alphabet_length(est)
    # Uniform distribution over the estimator's alphabet
    np = Probabilities(ones(L))
    return entropy_function(x, est, args...; kwargs...) / entropy_function(np; kwargs...)
end
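As a quick illustration of how this sketch would be called (the data, estimator, and keyword below are just placeholders):

x = rand(1000)
est = SymbolicPermutation(m = 3)
# Normalized Rényi entropy of order q = 2; lies in [0, 1] whenever the uniform
# distribution maximizes the entropy over the estimator's alphabet.
h = normalize(entropy_renyi, x, est; q = 2)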

@kahaaga
Member Author

kahaaga commented Sep 24, 2022

Here's my idea for a generic function:

Excellent. I didn't think of that at all. I will adjust the PR accordingly and ping you when ready.

@Datseris
Member

The citation

# Normalization is well-defined for all values of `q`, e.g. Kumar, U., Kumar, V., &
# Kapur, J. N. (1986). Normalized measures of entropy. International Journal Of General
# System, 12(1), 55-69.

should be included in the docstring of this function I guess.

@kahaaga
Member Author

kahaaga commented Sep 25, 2022

Thinking a bit more about this, I'm not convinced that your pseudocode above is the best way to go. There are a few reasons for this:

  • We implicitly assume that all generalized entropies are maximized for a uniform distribution. I don't know of any proof that this is the case. For example, is the Kaniadakis entropy (Kaniadakis entropy as a new generalized entropy #92) maximized for the uniform distribution?
  • If a generalized entropy E shows up that is maximized for some other distribution, do we introduce a new normalization function specifically for E (e.g. normalized_X)?
  • For entropies that actually are maximized for uniform distributions (the ones we currently have), why take the extra step of allocating an extra probability vector and computing the entropy of those probabilities too? This essentially doubles the computational cost of computing normalized entropies, when it can be done with a single division operation (dividing by log(alphabet_length(est))), like so:
function normalize(x, entropy_function, est::ProbabilitiesEstimator, args...; kwargs...)
    # A single division by the known maximum, log(alphabet_length(est)), instead of
    # computing the entropy of a uniform distribution explicitly.
    return entropy_function(x, est, args...; kwargs...) / log(alphabet_length(est))
end

@Datseris
Member

We implicitly assume that all generalized entropies are maximized for a uniform distribution

No, we never assume this. We return the entropy normalized versus the uniform distribution. We don't make claims of what the uniform distribution does.

If a generalized entropy E shows up that is maximized for some other distribution, do we introduce a new normalization function specifically for E (e.g. normalized_X)?

No, because maximizing was never the goal here. Normalizing versus an established standard is.

This essentially doubles the computational cost of computing normalized entropies

The cost of computing the entropy includes computing the probabilities. Here we just create a small vector, which I bet would have minimal cost versus that of actually computing the probabilities... But sure, if you are analytically certain that for all entropy methods, regardless of q, we get log(alphabet_length(est)) as the result, then use that, definitely. I don't know this for sure, and I don't want to study all these Tsallis entropies now to find out :D

@kahaaga
Member Author

kahaaga commented Sep 25, 2022

No, we never assume this. We return the entropy normalized versus the uniform distribution. We don't make claims of what the uniform distribution does.

Sure, but the whole point of normalisation is to ensure that the computed value falls in the range [0, 1], so that entropy values can be compared. This happens precisely when normalising to the maximal value of the entropy. For the entropies we have now, normalising to the uniform distribution does normalize to [0, 1], but only because the uniform distribution is the one that maximizes Renyi and Tsallis entropies (according to the references in this PR).

For other entropies, this might not be the case. If there is suddenly an entropy that isn't maximised by the uniform distribution, then the normalized value is no longer guaranteed to be on [0, 1], but on [0, a], where a is determined by how that particular entropy behaves for the uniform distribution. Then entropy_normalized no longer behaves in a consistent manner.

Of course, we can just define normalization in Entropies.jl to mean normalization to the entropy of a uniform distribution (as you suggest), and leave it to the user to find out if this normalization is meaningful to them. However, I think we should strive to provide "meaningful" output out-of-the-box, which makes having a completely generic entropy_normalized function potentially problematic.

But sure, if you are analytically certain that for all entropy methods, regardless of q, we get log(alphabet_length(est)) as the result, the use that, definitely.

Based on the references I found, this is the case for Renyi and Tsallis entropies. Not sure about the Kaniadakis and other entropies, though. But this is a potential issue regardless of whether we normalize to log(alphabet_length(est)) or manually construct a flat distribution and normalize to the entropy of that.
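For reference, the Rényi case can be checked directly: for the uniform distribution $u_i = 1/N$ and any order $q \neq 1$,

$$
H_q(u) = \frac{1}{1-q}\,\log\left(\sum_{i=1}^{N} N^{-q}\right)
       = \frac{1}{1-q}\,\log N^{1-q}
       = \log N,
$$

and the Shannon limit $q \to 1$ gives $\log N$ as well, so for Rényi, dividing by the entropy of a flat distribution and dividing by $\log N$ are the same thing for every $q$.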

EDIT: we can of course just state in the docstring of entropy_normalized that this normalization only restricts the entropy to [0, 1] for the types of entropies that are maximized by the uniform distribution, then leave it to users to normalize (future) more complicated entropies themselves. I don't know if this is the way to go. What do you think?

EDIT 2: I think an informative docstring message about what the normalization actually does is the better approach for now, since I too don't want to delve too deeply into all these generalized entropies at the moment.

@Datseris
Member

What do you think?

That we are spending way too much time thinking about this :P

Let's just say, clearly, in the docstring that we divide by log(alphabet_length), and that this corresponds to the maximum value for the two entropies we already have, and be done with it.

@Datseris
Member

New entropies will get the same treatment, the same division, and the user needs to be educated about whether this makes sense for them, I think...

@kahaaga
Member Author

kahaaga commented Sep 25, 2022

That we are spending way too much time thinking about this :P

Fair enough 😆

Let's just say, clearly, in the docstring that we divide by log(alphabet_length), and that this corresponds to the maximum value for the two entropies we already have, and be done with it.

Ok, done. I'll tag you once I've implemented the new interface.

@kahaaga
Member Author

kahaaga commented Sep 26, 2022

@Datseris Changes since the last comments:

  • alphabet_length(est) throws error when the alphabet length is not defined for est.
  • Moved alphabet_length functions to the individual estimator files.
  • Added analytical tests for Tsallis entropy, based on the original paper. It turned out Tsallis himself had a more general normalization, which I use instead (it works for all q). Dividing by log(alphabet_length(est)) would therefore not be the most general normalization, so I chose to divide by the entropy of a manually constructed flat distribution instead, like you initially proposed, so that we do the same for all estimators. It is still possible to dispatch on particular estimators if we need to reduce the computational cost of the extra allocation and entropy computation.
  • Normalization interface entropy_normalized as discussed above, with signatures covering the use cases I imagine I'd want as a user.
  • Documentation for entropy_normalized includes a note on the output range for different types of entropies.
  • Utility functions maxentropy_tsallis, maxentropy_renyi, maxentropy_shannon (listed under "Utility methods" in the docs) that can be used for manual normalization.
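A rough usage sketch of this interface (the calls below combine the documented signatures with assumptions about how keywords are forwarded; they are not taken verbatim from the PR):

x = rand(1000)
est = SymbolicPermutation(m = 3)
p = probabilities(x, est)

# Normalized Rényi entropy from precomputed probabilities; keywords are assumed to be
# forwarded to the entropy function, i.e. entropy_renyi(p; q = 2) here.
entropy_normalized(entropy_renyi, p, est; q = 2)

# Maximum Rényi entropy over the estimator's alphabet, for manual normalization.
maxentropy_renyi(alphabet_length(est); base = MathConstants.e)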

Comment on lines +13 to +18
entropy_normalized(f::Function, p::Probabilities, est::ProbabilitiesEstimator, args...;
kwargs...)

Normalize the entropy, as returned by the entropy function `f` called with the given
arguments (i.e. `f(p, args...; kwargs...)`), to the entropy of a uniform distribution,
inferring [`alphabet_length`](@ref) from `est`.
Member

I'm not convinced that this method is useful... If the user has the est, they may as well use the first method.

Member Author

We can just remove it. The user can normalize by log(N) manually if they need to.

Member Author

Wait, I'm not so sure of that. If we remove this method, then:

Say I have a set of probabilities p = Probabilities([0.2, 0.2, 0.3, 0.3]) that I have already computed with a known estimator, but that I don't have the original data for. I now want to compute 1) the Renyi entropy and 2) the normalized Renyi entropy. Without this method, it is easy to compute 1), but there is no convenient way to compute 2) without re-implementing precisely what this function does myself. That is: I'd always have to start from raw data if I want to compute normalized entropy.

Member

@Datseris Datseris Sep 26, 2022

What do you mean? You do entropy_renyi(p)/log(alphabet_length(est)), since you have the estimator but not the original data.

Member Author

What do you mean, you do entropy_renyi(p)/log(length(p)). What's the problem :D

See this comment.


See also [`entropy_renyi`](@ref), [`entropy_normalized`](@ref).
"""
maxentropy_renyi(N::Int; base = MathConstants.e) = log(base, N)
Member

Why would we need such a function? I'm not convinced that providing a wrapper function over log is useful either :D

Member Author

@kahaaga kahaaga Sep 26, 2022

Because the maximum entropy in itself is a useful thing to be able to compute. For Renyi entropy, it is trivially log(N), but for other entropies, it is not trivial at all. For stretched exponential entropy, for example, which I've prepared in a non-committed PR, the maximum entropy is a function of the parameters of the method and involves a complicated expression with gamma and incomplete gamma functions.
For Tsallis entropy, to be valid for all parameter values, the maximum entropy is no longer log(N) but a more complicated expression too.
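For reference (a standard result, not quoted from this thread): evaluating the Tsallis entropy $S_q = k\,\frac{1 - \sum_i p_i^q}{q - 1}$ at the uniform distribution over $N$ states gives

$$
S_q^{\max} = k\,\frac{N^{1-q} - 1}{1 - q},
$$

which recovers $k \log N$ in the limit $q \to 1$.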

I think there should be a convenient way for the user to compute this quantity, to also be able to work with the entropies from a theoretical standpoint. All analytical tests for the stretched exponential entropy, and the example reproducing the original paper, involve using maxentropy_stretched_exponential.

The maximum entropy is a fundamental property of entropies. I don't see why we shouldn't provide this functionality. This is, after all, a package, or "library", dedicated specifically to computing entropies and related quantities. I don't think pruning is necessary just because we don't see an immediate use for the method at the moment.

When we end up with a complete library of generalized entropies (I think I've seen ca. 10 different ones so far), it would be weird not to provide functionality to get the relevant parameters/values for all of them (i.e. providing a maximum entropy for every entropy except Renyi).

Member

@Datseris Datseris Sep 26, 2022

Okay, I see.

How about a compromise between providing such methods and this library not reaching 1,000 exported names, by extending an already existing function? Specifically, maximum. So the interface would be

maximum(entropy_function, estimator)

and we add dispatch for the entropy_function.

Actually, this whole business makes me wonder whether we should make "entropy types". We can have a single entropy(...) function and "entropy types" such as Renyi() and Tsallis(). So the Renyi entropy for a given estimator would be entropy(Renyi(), x, est). This way we can actually call maximum(Renyi(), PermutationX()) instead of passing functions. The entropy types can have arguments such as Renyi(q) for the order q.
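A minimal sketch of what this could look like (the type names, fields, and signatures below are assumptions for illustration, not a decided API; entropy_renyi and alphabet_length are the existing functions):

abstract type EntropyType end

Base.@kwdef struct Renyi <: EntropyType
    q::Real = 1.0
    base::Real = MathConstants.e
end

# Single API function dispatching on the entropy type.
entropy(e::Renyi, x, est) = entropy_renyi(x, est; q = e.q, base = e.base)

# The maximum entropy over the estimator's alphabet can dispatch on the type too.
Base.maximum(e::Renyi, est) = log(e.base, alphabet_length(est))

With this, entropy(Renyi(q = 2), x, est) and maximum(Renyi(q = 2), est) read naturally.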

Member

I vote for the last suggestion and I am now more convinced of the entropy types you suggested at some point but we decided not to go with...

Member

If you also vote for that I'd like to do it after I merge this PR.

Member Author

Actually, this whole business makes me wonder whether we should make "entropy types". We can have a single entropy(...) function and "entropy types" such as Renyi() and Tsallis(). So the Renyi entropy for a given estimator would be entropy(Renyi(), x, est). This way we can actually call maximum(Renyi(), PermutationX()) instead of passing functions. The entropy types can have arguments such as Renyi(q) for the order q.

Yes, going for the dispatch version would perhaps be better once the codebase grows. I wasn't aware, to begin with, of how many types of entropies there actually are, but they are in fact plentiful.

If the rest of this PR looks ok to you, maybe we just merge this PR (in the current state) and the original one to main. Then, as you say, we can implement the multiple dispatch entropy(EntropyType(), Estimator()) version in a separate PR.

If this is merged quickly, I can put together a PR tonight for the new version, because I'd like to get this off the table as soon as possible and arrive at a stable API.

@Datseris
Member

Datseris commented Sep 26, 2022

I'd still cut down on content here. E.g., I don't think providing maxentropy_renyi, which literally is the log, makes sense. We can mention it in the docstring instead. Let me know if you want me to do the pruning here or if you want to continue working here instead.

@kahaaga
Member Author

kahaaga commented Sep 26, 2022

I'd still cut down on content here. E.g., I don't think providing maxentropy_renyi, which literally is the log, makes sense. We can mention it in the docstring instead. Let me know if you want me to do the pruning here or if you want to continue working here instead.

See comment above

@Datseris
Member

Let me do some pruning here, and if I prune too much, you let me know and we add back the things you believe are useful.

@Datseris Datseris marked this pull request as ready for review September 26, 2022 17:01
@Datseris Datseris merged commit 9391e9f into newapi/dispersion_estimators Sep 26, 2022
@Datseris
Member

@kahaaga I'd prefer to make the first draft for the new API if that's okay with you and you can review it instead. I'll have it done by tonight.

@Datseris Datseris deleted the kah/dispersion_estimators_fixes branch September 26, 2022 17:02
@Datseris
Member

@kahaaga Actually, wait, I'm confused: where did this PR just get merged into? Not into api_refactor, right? Because I was going to update that PR to the new API!

@kahaaga
Member Author

kahaaga commented Sep 26, 2022

@kahaaga Actually, wait, I'm confused: where did this PR just get merged into? Not into api_refactor, right? Because I was going to update that PR to the new API!

This was merged into the dispersion_estimators branch.

@kahaaga I'd prefer to make the first draft for the new API if that's okay with you and you can review it instead. I'll have it done by tonight.

Perfect.

@Datseris
Member

Forget everything I said; I hadn't realized that refactor_api was already merged into main. We should merge #96 into main as well, and I'll start a new PR with the new, new API.

@kahaaga
Member Author

kahaaga commented Sep 26, 2022

Forget everything I said; I hadn't realized that refactor_api was already merged into main. We should merge #96 into main as well, and I'll start a new PR with the new, new API.

Yes, precisely.

Datseris added a commit that referenced this pull request Sep 26, 2022
* Dispersion and reverse dispersion probability estimators

* Fix tests

* ReverseDispersion should be a complexity measure, plus addressing some comments

* Export reverse_dispersion

* Reverse dispersion

* Remove reference to `ReverseDispersion`

* Update docstring

* Add analytical tests

* Improve tests and add doc examples

* Fix tests

* Better docs and doctests

* Remove file

* Fix tests

* Update src/complexity_measures/reverse_dispersion_entropy.jl

Co-authored-by: George Datseris <[email protected]>

* Addressing review comments (#99)

* Address review comments

* Throw error

* Remove non-used field

* Fix tests

* Merge docs and move individual methods to their respective files

* Tsallis reduces to Shannon entropy for q -> 1.

* Normalized entropy API, including utility functions

* Analytical test cases for Tsallis

* Include `k` in `maxentropy_tsallis`

Co-authored-by: George Datseris <[email protected]>