Skip to content

Entropy signature and folder structure cleanup #168

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Dec 2, 2022
Merged

Conversation

kahaaga
Copy link
Member

@kahaaga kahaaga commented Nov 9, 2022

Work in progress. Addresses #167. I will update this when #162 is merged.

Changes

  • Small bug fix for Zhu and ZhuSingh estimators: the base belongs in the entropy struct, not in the estimator.

  • IndirectEntropy is now EntropyEstimator. Documentation strings for subtypes of EntropyEstimator explicitly say what type of entropy they estimate. For example, the docstring for Kraskov says that it estimates Shannon entropy.

  • Folder structures, both for source code and tests, are changed to reflect this conceptual change.

    1. Entropy estimator B lives in folder src/entropies/estimators/..., and its tests live in test/entropies/estimators/B.jl
    2. Probability estimator A lives in folder src/probabilities_estimators/A, and its tests live in test/probabilities/estimators/A.jl.

Order of arguments

See discussion above for justification for the following changes.

Entropy API is now

  • entropy(e::Entropy, est::EntropyEstimator, x)
  • entropy(e::Entropy, est::ProbabilityEstimator, x)
  • entropy(e::Entropy, probs::Probabilities)

entropy_maximum and entropy_normalized have been adjusted accordingly.

Probabilities API is now

  • probabilities(est::ProbabilitiesEstimator, x)

TODO:

  • Update documentation
  • Update tests and make sure they pass.

@kahaaga kahaaga added the cleanup label Nov 9, 2022
@codecov
Copy link

codecov bot commented Nov 9, 2022

Codecov Report

Merging #168 (ce728ad) into main (cce1c15) will increase coverage by 0.92%.
The diff coverage is 90.21%.

@@            Coverage Diff             @@
##             main     #168      +/-   ##
==========================================
+ Coverage   84.01%   84.94%   +0.92%     
==========================================
  Files          37       37              
  Lines         951      963      +12     
==========================================
+ Hits          799      818      +19     
+ Misses        152      145       -7     
Impacted Files Coverage Δ
src/deprecations.jl 0.00% <0.00%> (ø)
.../estimators/nearest_neighbors/nearest_neighbors.jl 100.00% <ø> (ø)
...es/estimators/order_statistics/order_statistics.jl 100.00% <ø> (ø)
...ities_estimators/histograms/rectangular_binning.jl 80.89% <ø> (ø)
src/entropy.jl 69.69% <72.22%> (-2.18%) ⬇️
src/probabilities.jl 58.62% <75.00%> (ø)
src/entropies/convenience_definitions.jl 100.00% <100.00%> (ø)
...estimators/nearest_neighbors/KozachenkoLeonenko.jl 100.00% <100.00%> (ø)
.../entropies/estimators/nearest_neighbors/Kraskov.jl 100.00% <100.00%> (ø)
src/entropies/estimators/nearest_neighbors/Zhu.jl 100.00% <100.00%> (ø)
... and 21 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Datseris
Copy link
Member

Datseris commented Nov 10, 2022

In summary, entropy API is now

entropy(e::Entropy, est::EntropyEstimator, x)
entropy(e::Entropy, est::ProbabilityEstimator, x) 

Why make a change to the existing API, that was entropy(e::Entropy, x, est::ProbabilityEstimator)? What was wrong with that? Can we focus on minimizing the change here, given that this seems to be a change that is up to preference without much practical implications? Or there are practical arguments to change the whole API? entropy(e::Entropy, x, est::EntropyEstimator) would be the only change if we don't change hte probabilities stuff.

@kahaaga
Copy link
Member Author

kahaaga commented Nov 11, 2022

Why make a change to the existing API, that was entropy(e::Entropy, x, est::ProbabilityEstimator)? What was wrong with that? Can we focus on minimizing the change here, given that this seems to be a change that is up to preference without much practical implications? Or there are practical arguments to change the whole API? entropy(e::Entropy, x, est::EntropyEstimator) would be the only change if we don't change hte probabilities stuff.

Two reasons:

  1. Following Julia's style guide, which requires writing functions with argument ordering similar to Julia Base. This means types first, then input data. We make a huge effort to write good, clear and consistently formatted code. There's no reason why this shouldn't apply for entropy & friends.

  2. Having the input data as the last argument makes it easier to provide multiple datasets. Then one can do entropy(Shannon(), Kraskov(), x, y, z, ...) to compute the joint entropy between x, y, z, ... . This simplifies code upstream in CausalityTools too.

  3. The order of the arguments for CausalityTools functions in V2 will be someinfomeasure(::TypeOfMeasure, ::Estimator, input_x, input_y, ....) (because reasons 1 and 2). It would be nice if we speak the same language across packages.

@kahaaga
Copy link
Member Author

kahaaga commented Nov 11, 2022

That said, which I completely forgot, because I don't explicitly use it in CausalityTools at the moment - we should also change the order of the arguments for probabilities.

I don't see any downside of doing so, other than "it is a change". But we're releasing 2.0, so we might as well be consistent everywhere.

@Datseris
Copy link
Member

ok, are you changing the order here? There is a massive git conflict.

@kahaaga
Copy link
Member Author

kahaaga commented Nov 11, 2022

ok, are you changing the order here?

Yes, I will change the order here.

There is a massive git conflict.

Yes, I think that's because of the folder structure change. However, that can easily be fixed. I'll fix it as I do the remaining changes for the probabilities.

@kahaaga kahaaga marked this pull request as ready for review November 17, 2022 13:05
@kahaaga kahaaga requested a review from Datseris November 17, 2022 13:14
@kahaaga
Copy link
Member Author

kahaaga commented Nov 17, 2022

@Datseris The refactoring, according to our discussion above, is now ready. All tests pass, and the documentation looks good.

The git diff looks massive due to folder structure changes, but in reality, there aren't really any changes except for the ordering of method arguments. Exact overview is found in my updated initial comment.

@kahaaga kahaaga changed the title WIP: Entropy signature Entropy signature and folder structure cleanup Nov 17, 2022
@kahaaga kahaaga added this to the 2.0 milestone Nov 17, 2022
@kahaaga kahaaga mentioned this pull request Nov 30, 2022
@kahaaga kahaaga merged commit 3c705f5 into main Dec 2, 2022
@kahaaga kahaaga deleted the entropy_signature branch December 2, 2022 15:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants