Skip to content

Finish API #187

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Dec 19, 2022
Merged

Finish API #187

merged 16 commits into from
Dec 19, 2022

Conversation

Datseris
Copy link
Member

@Datseris Datseris commented Dec 17, 2022

Closes #185, closes #184

Implements the final version of the API which is:

  • algorithm deciding arguments first, input data after.
  • Also applies for encode/decode.
  • Use funtions not functors

Also re-writes FixedRectangularBinning to have welldefined outcome space always.

@Datseris Datseris marked this pull request as ready for review December 19, 2022 08:15
@Datseris
Copy link
Member Author

@kahaaga what's up with the signature entropy([e::Entropy,] est::EntropyEstimator, x), why does it exist? How can a user specify an entropy definition and an entropy estimator? The estimator by definition computes a specific version of entropy, since its an estimator. If it wasn't an estimator of an entropy directly, then surely it must be computing a probability density, and hence it isn't an entropy estimator, but a probability estimator.

Are we complicating things too much here? Why would we even care to allow this. I say we should have entropy(est::Estimator, x) and make clear that this is a completely separated method.

@Datseris
Copy link
Member Author

@kahaaga Unfortunately I have to change FixedRectangularBinning in this PR. It is a nightmare to allow it to both have "actuall fixed" size by providing εmin/max for each dimension, but also allow it to deduce the dimensions by itself and hence having arbitrary dimension. I'll make a different convenience method for that. For FixedRectangularBinning the input dimension must always be given clearly so that the struct always has well defined outcome space without requiring the input.

@Datseris Datseris changed the title Finish API (WIP) Finish API Dec 19, 2022
@kahaaga
Copy link
Member

kahaaga commented Dec 19, 2022

@kahaaga what's up with the signature entropy([e::Entropy,] est::EntropyEstimator, x), why does it exist? How can a user specify an entropy definition and an entropy estimator? The estimator by definition computes a specific version of entropy, since its an estimator. If it wasn't an estimator of an entropy directly, then surely it must be computing a probability density, and hence it isn't an entropy estimator, but a probability estimator.
Are we complicating things too much here? Why would we even care to allow this. I say we should have entropy(est::Estimator, x) and make clear that this is a completely separated method.

I think we should keep the three-argument version. This is related to the answer I linked in #184. In short:

  • A consistent three-argument signature that works for both discrete entropy (i.e. ProbabilitiesEstimators) and differential/continuous entropy (i.e. EntropyEstimators) allows seamless construction of plug-in estimators for any entropy-based information measure in CausalityTools. One method signature = one unified wrapper, not many wrappers.
  • A single estimator (i.e. from a single research paper) can be used to compute different quantities. For example, the LeonenkoProzantoSavani <: EntropyEstimator (which currently only exists on a dev branch of CausalityTools) can compute Shannon, Renyi and Tsallis entropies (see screenshot), all based on the estimation of the same underlying integral. The alternative would be to define LeonenkoProzantoSavaniShannon, LeonenkoProzantoSavaniTsallis, LeonenkoProzantoSavaniRenyi, which I think is much more messy, when the underlying estimators are identical, up to some scaling factor.

Screenshot 2022-12-19 at 10 50 48

Screenshot 2022-12-19 at 10 46 49

@kahaaga
Copy link
Member

kahaaga commented Dec 19, 2022

Follow-up to the last post: The reason I wouldn't add separate LeonenkoProzantoSavaniShannon, LeonenkoProzantoSavaniRenyi, LeonenkoProzantoSavaniTsallis estimators is the same reason I wouldn't add separate TransferOperatorTsallis, TransferOperatorRenyi, TransferOperatorShannon probability estimators.

EntropyEstimators implicitly or explicitly estimate some integral (usually a density, but not always). This integral estimate is then plugged into some formula. In the case of the LeonenkoProzantoSavani estimator, the formula for Shannon, Renyi and Tsallis entropy are different, both all contain the same integral, so the same estimator is used to compute all three quantities.

@kahaaga
Copy link
Member

kahaaga commented Dec 19, 2022

Unfortunately I have to change FixedRectangularBinning in this PR. It is a nightmare to allow it to both have "actuall fixed" size by providing εmin/max for each dimension, but also allow it to deduce the dimensions by itself and hence having arbitrary dimension. I'll make a different convenience method for that. For FixedRectangularBinning the input dimension must always be given clearly so that the struct always has well defined outcome space without requiring the input.

Ok, no problem. As long as the public API has an easy way to specify a fixed grid in multiple dimensions, the specifics of the underlying implementation doesn't really matter.

@Datseris
Copy link
Member Author

ok then I merge this. In a separate PR we should separate the documentation of the entropy with hte entropy estimators into a different documentation block. especially since you say they compute differential entropy.

@Datseris Datseris merged commit a6a5d02 into main Dec 19, 2022
@Datseris Datseris deleted the finish_api branch December 19, 2022 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make all probability estimators have an explicit outcome space? Current status of API is unclear
2 participants