-
Notifications
You must be signed in to change notification settings - Fork 14
Spatiotemporal Permutation Entropy #78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #78 +/- ##
==========================================
+ Coverage 79.23% 79.33% +0.09%
==========================================
Files 22 23 +1
Lines 631 663 +32
==========================================
+ Hits 500 526 +26
- Misses 131 137 +6
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Okay, code is working now. Only adding tests remain and comparing performance with #74 . cc @ikottlarz and @kahaaga feel free to review the API. |
Alright, the code here is a clear speedup of 10x:
with code: # %%
stencil = CartesianIndex.([(0,1), (0,1), (1,1)])
x = rand(100, 100)
est = SpatiotemporalPermutation(stencil, x, false)
p = probabilities(x, est)
using BenchmarkTools
@btime probabilities($x, $est)
# %% old code
using Entropies
x = rand(100, 100)
stencil = PermutationStencil(2, 2)
probabilities(x, stencil)
using BenchmarkTools
@btime probabilities($x, $stencil) |
Excellent! Then it's a no-brainer that we'll go for this approach. Nice job, @Datseris. I'll get to reviewing this shortly. |
I added the same tests you had, but they are fail, so maybe I did something wrong and what I thought was the same as your version (with the 2x2 boolean matrix as stencil), actually isn't. |
Do you want to debug a bit more before I look at this? If not, I can have a go at figuring out why tests fail. May it have something to do with the ordering of the points obtained when using stencils? I guess the frequencies of symbols would not be the same going down-right versus some other direction, unless the input data was perfectly symmetric. |
I'm confident the code does what I want it to do up to the point I pass it to |
One question: I am a bit confused about stencil construction and the fact that "the pixel itself is always included", when stencils are constructed to the bottom and right from the pixel.
[1 1;
1 1] But if the pixel itself is included (assuming the start pixel is in the top left corner), how do I create a stencil like the following? [0 1;
1 1] |
You would instead create |
Let me clarify what my implementation did. Say I have the following input array: 1 2 3 4;
5 6 7 8;
9 10 11 12;
13 14 15 16; If using a 3*3 stencil where all elements of the stencil are included, say (ignoring the cartesian index construction for simplicity) 1 1 1
1 1 1
1 1 1 Starting in the top left corner, then going down-right, then that would give me four sub-matrices: 1 2 3
5 6 7
9 10 11 5 6 7
9 10 11
13 14 15 2 3 4
6 7 8
10 11 12 6 7 8
10 11 12
14 15 16 From which the resulting motifs to be symbolized are (starting top left, then down and right) However, using the following 3x3 stencil on the same input 0 0 1
1 0 0
0 1 1 I'd get the following motifs: Is that what you're doing too in your implementation? |
So for the following stencil, would one have to start from, say, the top left corner when defining the stencil? Since the pixel itself is included, I guess it wouldn't be possible to construct the stencil with the center zero as the origin?
|
Yes, that's what you should get. I guess this is a good analytic test to add to the test suite. I'm waiting for the symbolize refactor so we can add a
Stencils are translationally invariant, so you can define the same stencil with as many ways as there are valid pixels. You could also start in the lower right if you wanted to. |
1 1 1
1 0 1
1 1 1 If a user decided to use the center of this square as the origin when defining the stencil, that would not be possible for this particular example, because the center would be an included pixel in the resulting stencil? I.e. with the center as the origin, 1 1 1
1 1 1
1 1 1 and not (as desired) 1 1 1
1 0 1
1 1 1 Is that right? |
Yeah, well, the user should not make this choice, and choose any of the other 8 pixels :D |
Do you prefer the alternative of forcing users to add the 0-offset index? In all of my experience in permutation entropies, it does not make sense to not include current pixel. |
Okay, then I follow! Perhaps it would be nice to provide a convenience constructor, similar to the one I had, where the user could provide a (hyper)rectangular array of 1s and 0s, that would be automatically translated to the relevant cartesian indices, filtering out the ones where there are 0s in the input array? Is it possible for the user to type out the indices manually? Yes, absolutely. But is it slightly annoying? Yes, for me at least 😁 I can make the extra constructor if you don't want to spend time on it. |
I'd like a tuple input of ranges like |
Seems like we are 100 to 1000 times faster than https://github.com/arthurpessa/ordpy which, to my astonishment, has a Chaos publication associated with it. I didn't know Chaos accepted such "trivial" papers. |
Great! I'm surprised that the speed-ups are so substantial. I think we should take this as a sign for us to, soon, actually write up a paper on this software too. There's of course some API polishing to do, decisions regarding documentation improvements/centralization, and some more obvious methods that should be included, but I think we're getting close to something which I would feel is acceptable for publication. |
Ok, if ok with you, I can add the tuple version while doing the matrix version too. If you're done with your contributions to this PR, I'll finish up the symbolization refactoring, add the convenience constructors, and make sure that everything works with the updated symbolization API. Does that sound good, @Datseris ? |
Ok, please do symbolic refactoring in a different PR though. This PR is very clean and I'd very much like to keep it this way. |
Closes #73
An alternative solution to #74 with really, really small source code and 10x better performance. It uses knowledge from Agents.jl.