Complexity measure API, v2 #133


Closed
kahaaga opened this issue Oct 18, 2022 · 11 comments
Labels
discussion-design Discussion or design matters

Comments


kahaaga commented Oct 18, 2022

The need for a common complexity interface

This issue supersedes #81.

I think we should have a common interface for computing various entropy-like complexity measures, such as reverse_dispersion (already implemented), sample_entropy (#71), approximate_entropy (#72), missing_dispersion_patterns (#124), and statistical_complexity (#131), which are currently open PRs.

It is fine to have the methods sample_entropy, reverse_dispersion, and statistical_complexity, but I really like the generic entropy interface, where we just dispatch on different Entropy methods and ProbabilitiesEstimators, and I would like something similar for the other complexity measures.

Deciding on an interface like this is important because:

Suggested interface

I propose the following:

abstract type ComplexityMeasure end

function complexity(c::ComplexityMeasure, x, args...; kwargs...)
# potentially also
function complexity_normalized(c::ComplexityMeasure, x, args...; kwargs...)

Every complexity measure can then just have its own complexity type that we can dispatch on, e.g.

struct StatisticalComplexity{A1, A2} <: ComplexityMeasure
    a1::A1
    a2::A2
    ...
end

function complexity(c::StatisticalComplexity, x)
    ....
end
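To make the dispatch pattern concrete, here is a minimal, self-contained sketch of how it could work in practice. ExampleMeasure, its fields, and the computation inside it are hypothetical placeholders invented for illustration, not actual package types:

```julia
# Hedged sketch of the proposed API; all names below are illustrative only.
abstract type ComplexityMeasure end

# Hypothetical measure with two tuning parameters.
struct ExampleMeasure <: ComplexityMeasure
    m::Int      # embedding-like parameter (made up for this sketch)
    r::Float64  # tolerance-like parameter (made up for this sketch)
end

# Dispatch on the measure type; the body is a stand-in computation.
function complexity(c::ExampleMeasure, x)
    return c.r * sum(abs, diff(x)) / (length(x) - c.m)
end

# A normalized variant, here simply dividing out the tolerance.
function complexity_normalized(c::ExampleMeasure, x)
    return complexity(c, x) / c.r
end
```

Each new measure then only needs a struct and a complexity method; callers never learn a new function name.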

What do you think, @ikottlarz and @Datseris ?

kahaaga added the discussion-design label Oct 18, 2022
@Datseris

How does having a complexity function simplify the multiscale PR?


kahaaga commented Oct 18, 2022

How does having a complexity function simplify the multiscale PR?

With a complexity function, the exported methods for multiscale would be:

multiscale(e::Entropy, alg::MultiScaleAlgorithm, x::AbstractVector, est::ProbabilitiesEstimator; kwargs...)
multiscale(c::ComplexityMeasure, alg::MultiScaleAlgorithm, x; kwargs...)

Without it, the exported functions would be

multiscale(e::Entropy, alg::MultiScaleAlgorithm, x::AbstractVector, est::ProbabilitiesEstimator; kwargs...)

multiscale_sampleentropy(alg::MultiScaleAlgorithm, x; kwargs...)
multiscale_approx_entropy(alg::MultiScaleAlgorithm, x; kwargs...)
multiscale_reverse_dispersion(alg::MultiScaleAlgorithm, x; kwargs...)
multiscale_statistical_complexity(alg::MultiScaleAlgorithm, x; kwargs...)
multiscale_missing_disp_patterns(alg::MultiScaleAlgorithm, x; kwargs...)
...
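The contrast can be sketched in code: with a generic complexity function, the second multiscale method above is written once for all measures. Everything in this sketch is a made-up stand-in (the Regular coarse-graining scheme, the MeanAbsDiff demo measure, and the windowed-mean downsampling), intended only to show the dispatch pattern, not the real package internals:

```julia
# Sketch only: every type and function here is a hypothetical stand-in.
abstract type ComplexityMeasure end
abstract type MultiScaleAlgorithm end

struct Regular <: MultiScaleAlgorithm end  # made-up coarse-graining scheme

# Illustrative coarse-graining: non-overlapping window means at scale s.
coarse_grain(::Regular, x, s) = [sum(x[i:i+s-1]) / s for i in 1:s:length(x)-s+1]

# One generic method covers every measure that implements `complexity`:
function multiscale(c::ComplexityMeasure, alg::MultiScaleAlgorithm, x; maxscale = 2)
    return [complexity(c, coarse_grain(alg, x, s)) for s in 1:maxscale]
end

# Tiny stand-in measure, used only to demonstrate the dispatch.
struct MeanAbsDiff <: ComplexityMeasure end
complexity(::MeanAbsDiff, x) = sum(abs, diff(x)) / (length(x) - 1)
```

For example, multiscale(MeanAbsDiff(), Regular(), collect(1.0:16.0)) evaluates the measure at each scale, with no per-measure multiscale_* function required.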


kahaaga commented Oct 18, 2022

Moreover, there are multiple variations on, for example, multiscale sample entropy. With this approach, we can capture all variations of it using parameters of struct SampleEntropy <: ComplexityMeasure end, not needing to define custom methods with custom names and different keywords for each variation of the method.
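For instance, a keyword-based struct can encode every variation through its parameters. The field names and defaults below are assumptions for illustration, not the package's actual definition:

```julia
# Illustrative sketch: fields and defaults are assumptions, not the real SampleEntropy.
abstract type ComplexityMeasure end

Base.@kwdef struct SampleEntropy <: ComplexityMeasure
    m::Int = 2        # embedding dimension
    τ::Int = 1        # embedding lag
    r::Float64 = 0.2  # tolerance radius
end

# A single complexity(c::SampleEntropy, x) method would then read c.m, c.τ, c.r,
# covering all variations without new function names or keyword sets.
```

A caller selects a variation with, e.g., SampleEntropy(m = 3), while untouched parameters keep their documented defaults.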

@Datseris

Okay, I agree with the proposed API. I don't agree with having both the complexity API and individual functions for each complexity measure, as we do with entropies_.... I'm kind of done with having all these "convenience" methods which don't really offer any convenience and just allow an alternative syntax.


kahaaga commented Oct 18, 2022

Okay I agree with the proposed API.

Ok, good!

I don't agree with having both the complexity API and individual functions for each complexity measure, as we do with entropies_.... I'm kind of done with having all these "convenience" methods which don't really offer any convenience and just allow an alternative syntax.

I get the point. However, the issues previously raised about search-engine visibility etc. still apply. We've been insisting on keeping convenience methods for e.g. entropy_permutation. But looking at the number of citations for some of these complexity measures, they are actually used more than the permutation entropy:

  • Approximate entropy (Pincus, 1991): ~6200 citations.
  • Sample entropy (Richman et al., 2000): ~7100 citations.
  • Permutation entropy (Bandt & Pompe, 2002): ~3800 citations.

I'd argue that if we want to use the visibility argument for specific entropies (probabilities), we should also apply it to complexity measures. If not, we should drop the convenience methods completely.

@Datseris

However, the issues previously raised about search engine visibility etc still apply.

I don't get your point. Search engines work with keywords, and the docstring of whatever method you define will contain the keywords "approximate entropy". Search engines won't find function names, because they don't contain spaces and hence don't qualify as the keywords the engines search for. We definitely have the keywords "permutation entropy" in the library.

If not, we should just drop the convenience methods completely.

From my perspective, the reason to keep a few of these convenience methods is (1) for educational purposes, so that users learn that the permutation entropy is just a subclass of something more general, and (2) for the future paper, so that we can highlight that, if we were using the approach of alternative software, we would offer thousands upon thousands of functions, but instead we manage the same amount of content with far fewer function definitions.


kahaaga commented Oct 18, 2022

From my perspective, the reason to keep a few of these convenience methods is (1) for educational purposes, so that users learn that the permutation entropy is just a subclass of something more general, and (2) for the future paper, so that we can highlight that, if we were using the approach of alternative software, we would offer thousands upon thousands of functions, but instead we manage the same amount of content with far fewer function definitions.

Excellent point.

Shall we keep the current entropy convenience methods, and perhaps add one or two convenience methods (for educational purposes) for complexity measures too? sample_entropy and approximate_entropy are good candidates, since they are so widely used.

With only two functions (entropy and entropy_normalized), we provide N = (2 * nP * nE * nM) + 2 * nD ways of estimating entropy, where nP is the number of ProbabilitiesEstimators, nE is the number of Entropys, nM is the number of multiscale sampling schemes, and nD is the number of direct (Shannon) entropy estimation methods. We currently have 12 probabilities estimators, 5 entropy types, 2 multiscale sampling schemes, and 2 direct entropy estimators, yielding N = (2 * 5 * 12 * 2) + 2 * 2 = 244 currently possible ways of estimating entropy (not counting parameter variations), if all methods have normalised versions.
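The counting argument above can be checked directly:

```julia
# Counting the combinations described in the comment above.
nP, nE, nM, nD = 12, 5, 2, 2   # probabilities estimators, entropy types,
                               # multiscale schemes, direct estimators
N = (2 * nP * nE * nM) + 2 * nD
# N == 244
```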

A simple (nonexhaustive) table in the documentation could replace many of the convenience methods, and would be a good way of ensuring that all common names appear somewhere in the documentation. It may also serve as a starting point for a table listing common literature methods in our paper, e.g.

| Method | Syntax | Reference |
| --- | --- | --- |
| Permutation entropy | entropy(x, SymbolicPermutation()) | Bandt & Pompe (2002) |
| Dispersion entropy | entropy(x, Dispersion()) | Rostaghi et al. (2016) |
| Composite multiscale dispersion entropy | multiscale(Dispersion(), Composite(), x) | Azami et al. (2017) |

We'll probably reach similar numbers of possible combinations for the complexity measures in the future, too. A similar table could be useful there:

| Method | Syntax | Reference |
| --- | --- | --- |
| Reverse dispersion entropy | complexity(ReverseDispersion(), x) | Gao & Wang (2002) |
| Statistical complexity | complexity(StatisticalComplexity(), x) | Rosso et al. (2016) |

@Datseris

I like this a lot! The table is great! The only change I would make is to replace SymbolicPermutation() with SymbolicPermutation(...), so that it hints that there are options.

Okay, let's keep sample_entropy and approximate_entropy as the only "convenience complexity functions".


kahaaga commented Oct 18, 2022

I like this a lot! The table is great! The only change I would make is to replace SymbolicPermutation() with SymbolicPermutation(...), so that it hints that there are options.
Okay, let's keep sample_entropy and approximate_entropy as the only "convenience complexity functions".

Great!

@ikottlarz this will affect your PR #131. Do the designs of complexity and complexity_normalized functions above sound reasonable to you?

I'll submit a PR within a few hours, where I redo the reverse dispersion method with the new API, so you can see how it'll look in the end.

@ikottlarz

@kahaaga this sounds good to me. I'm currently working on your reviews, and will reorganize my code accordingly!


kahaaga commented Oct 19, 2022

Closed by #134

3 participants