Skip to content

Spatiotemporal Permutation Entropy #78

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Sep 6, 2022
Merged

Spatiotemporal Permutation Entropy #78

merged 12 commits into from
Sep 6, 2022

Conversation

Datseris
Copy link
Member

@Datseris Datseris commented Sep 4, 2022

Closes #73

An alternative solution to #74 with really, really small source code and 10x better performance. It uses knowledge from Agents.jl.

@codecov
Copy link

codecov bot commented Sep 4, 2022

Codecov Report

Merging #78 (982ec33) into main (875cfbe) will increase coverage by 0.09%.
The diff coverage is 79.41%.

❗ Current head 982ec33 differs from pull request most recent head 0bacc58. Consider uploading reports for the commit 0bacc58 to get more accurate results

@@            Coverage Diff             @@
##             main      #78      +/-   ##
==========================================
+ Coverage   79.23%   79.33%   +0.09%     
==========================================
  Files          22       23       +1     
  Lines         631      663      +32     
==========================================
+ Hits          500      526      +26     
- Misses        131      137       +6     
Impacted Files Coverage Δ
src/symbolic/symbolic.jl 0.00% <ø> (ø)
src/core.jl 57.14% <50.00%> (ø)
src/symbolic/spatial_permutation.jl 81.25% <81.25%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Datseris
Copy link
Member Author

Datseris commented Sep 4, 2022

Okay, code is working now. Only adding tests remain and comparing performance with #74 . cc @ikottlarz and @kahaaga feel free to review the API.

@Datseris
Copy link
Member Author

Datseris commented Sep 4, 2022

Alright, the code here is a clear speedup of 10x:

# PR 78
  256.600 μs (9 allocations: 230.50 KiB)
# PR 73
  2.166 ms (9839 allocations: 385.72 KiB)

with code:

# %%
stencil = CartesianIndex.([(0,1), (0,1), (1,1)])
x = rand(100, 100)
est = SpatiotemporalPermutation(stencil, x, false)
p = probabilities(x, est)
using BenchmarkTools
@btime probabilities($x, $est)

# %% old code
using Entropies
x = rand(100, 100)
stencil = PermutationStencil(2, 2)
probabilities(x, stencil)
using BenchmarkTools
@btime probabilities($x, $stencil)

@kahaaga
Copy link
Member

kahaaga commented Sep 4, 2022

Alright, the code here is a clear speedup of 10x:

Excellent! Then it's a no-brainer that we'll go for this approach. Nice job, @Datseris.

I'll get to reviewing this shortly.

@Datseris
Copy link
Member Author

Datseris commented Sep 4, 2022

I added the same tests you had, but they are fail, so maybe I did something wrong and what I thought was the same as your version (with the 2x2 boolean matrix as stencil), actually isn't.

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

I added the same tests you had, but they are fail, so maybe I did something wrong and what I thought was the same as your version (with the 2x2 boolean matrix as stencil), actually isn't.

Do you want to debug a bit more before I look at this? If not, I can have a go at figuring out why tests fail.

May it have something to do with the ordering of the points obtained when using stencils? I guess the frequencies of symbols would not be the same going down-right versus some other direction, unless the input data was perfectly symmetric.

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

Do you want to debug a bit more before I look at this? If not, I can have a go at figuring out why tests fail.

I'm confident the code does what I want it to do up to the point I pass it to Entropies.encode_motif, so please do review. I've tried both versions of the sequence of indices in the stencil but it didn't change the outcome.

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

One question:

I am a bit confused about stencil construction and the fact that "the pixel itself is always included", when stencils are constructed to the bottom and right from the pixel.

stencil = CartesianIndex.([(0,1), (0,1),(1,1)]) gives the following stencil

[1 1;
1 1]

But if the pixel itself is included (assuming the start pixel is in the top left corner), how do I create a stencil like the following?

[0 1;
1 1]

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

But if the pixel itself is included (assuming the start pixel is in the top left corner), how do I create a stencil like the following?

You would instead create stencil = CartesianIndex.([(0,-1),(-1,0)]) that goes up and left. I have not found any kind of stencil that makes conceptual sense without including the "current" pixel.

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

Let me clarify what my implementation did.

Say I have the following input array:

1  2  3  4;
5  6  7  8;
9  10 11 12;
13 14 15 16;

If using a 3*3 stencil where all elements of the stencil are included, say (ignoring the cartesian index construction for simplicity)

1 1 1
1 1 1
1 1 1

Starting in the top left corner, then going down-right, then that would give me four sub-matrices:

1  2  3
5  6  7
9  10 11
5  6  7
9  10 11
13 14 15
2  3  4
6  7  8
10 11 12
6  7  8
10 11 12
14 15 16

From which the resulting motifs to be symbolized are (starting top left, then down and right) [1, 5, 9, 2, 6, 10, 3, 7, 11], [5, 9, 13, 6, 10, 14, 7, 11, 15], [2, 6, 10, 3, 7, 11, 4, 8, 12], [6, 10, 14, 7, 11, 15, 8, 12, 16].

However, using the following 3x3 stencil on the same input

0 0 1
1 0 0
0 1 1

I'd get the following motifs:
[5, 10, 3, 11]
[9, 14, 7, 15]
[6, 11, 4, 12]
[10, 15, 8, 16].

Is that what you're doing too in your implementation?

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

You would instead create stencil = CartesianIndex.([(0,-1),(-1,0)]) that goes up and left. I have not found any kind of stencil that makes conceptual sense without including the "current" pixel.

So for the following stencil, would one have to start from, say, the top left corner when defining the stencil? Since the pixel itself is included, I guess it wouldn't be possible to construct the stencil with the center zero as the origin?

1 1 1
1 0 1
1 1 1

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

I'd get the following motifs:

Yes, that's what you should get. I guess this is a good analytic test to add to the test suite. I'm waiting for the symbolize refactor so we can add a symbolize function here and can call it later in the test suite.

So for the following stencil, would one have to start from, say, the top left corner when defining the stencil

Stencils are translationally invariant, so you can define the same stencil with as many ways as there are valid pixels. You could also start in the lower right if you wanted to.

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

Stencils are translationally invariant, so you can define the same stencil with as many ways as there are valid pixels. You could also start in the lower right if you wanted to.

1 1 1
1 0 1
1 1 1

If a user decided to use the center of this square as the origin when defining the stencil, that would not be possible for this particular example, because the center would be an included pixel in the resulting stencil?

I.e. with the center as the origin, CartesianIndex.([(-1, -1), (1, 1), (1, -1), (-1, 1), (0, -1), (-1, 0), (1, 0), (0, 1)]) would give

1 1 1
1 1 1
1 1 1

and not (as desired)

1 1 1
1 0 1
1 1 1

Is that right?

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

If a user decided to use the center of this square as the origin when defining the stencil, that would not be possible for this particular example, because the center would be an included pixel in the resulting stencil?

Yeah, well, the user should not make this choice, and choose any of the other 8 pixels :D

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

Do you prefer the alternative of forcing users to add the 0-offset index? In all of my experience in permutation entropies, it does not make sense to not include current pixel.

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

Yeah, well, the user should not make this choice, and choose any of the other 8 pixels :D

Okay, then I follow!

Perhaps it would be nice to provide a convenience constructor, similar to the one I had, where the user could provide a (hyper)rectangular array of 1s and 0s, that would be automatically translated to the relevant cartesian indices, filtering out the ones where there are 0s in the input array?

Is it possible for the user to type out the indices manually? Yes, absolutely. But is it slightly annoying? Yes, for me at least 😁

I can make the extra constructor if you don't want to spend time on it.

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

Perhaps it would be nice to provide a convenience constructor, similar to the one I had, where the user could provide a (hyper)rectangular array of 1s and 0s, that would be automatically translated to the relevant cartesian indices, filtering out the ones where there are 0s in the input array?

I'd like a tuple input of ranges like (2,2) makes a 2x2 square, seems faster for the user than to construct the matrix! You can also add your version with 0s and 1s

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

Seems like we are 100 to 1000 times faster than https://github.com/arthurpessa/ordpy which, to my astonishment, has a Chaos publication associated with it. I didn't know Chaos accepted such "trivial" papers.

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

Seems like we are 100 to 1000 times faster than https://github.com/arthurpessa/ordpy which, to my astonishment, has a Chaos publication associated with it. I didn't know Chaos accepted such "trivial" papers.

Great! I'm surprised that the speed-ups are so substantial.

I think we should take this as a sign for us to, soon, actually write up a paper on this software too.

There's of course some API polishing to do, decisions regarding documentation improvements/centralization, and some more obvious methods that should be included, but I think we're getting close to something which I would feel is acceptable for publication.

@kahaaga
Copy link
Member

kahaaga commented Sep 5, 2022

I'd like a tuple input of ranges like (2,2) makes a 2x2 square, seems faster for the user than to construct the matrix! You can also add your version with 0s and 1s

Ok, if ok with you, I can add the tuple version while doing the matrix version too.

If you're done with your contributions to this PR, I'll finish up the symbolization refactoring, add the convenience constructors, and make sure that everything works with the updated symbolization API. Does that sound good, @Datseris ?

@Datseris
Copy link
Member Author

Datseris commented Sep 5, 2022

Ok, please do symbolic refactoring in a different PR though. This PR is very clean and I'd very much like to keep it this way.

@Datseris Datseris merged commit f9ebf98 into main Sep 6, 2022
@Datseris Datseris deleted the spatialpermutation branch September 6, 2022 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Permutation Entropy for 2D and 3D Objects
2 participants