Conversation
VincentRouvreau
left a comment
To be complete, you also missed some required changes in src/python/CMakeLists.txt, as Mathieu did on the PD gradient PR, but it's ok, this file is a nightmare ;-)
License update
Co-authored-by: Vincent Rouvreau <10407034+VincentRouvreau@users.noreply.github.com>
VincentRouvreau
left a comment
In order to generate the documentation, I propose a quick 'fix'. In src/python/doc/installation.rst, in the TensorFlow section (~ l.398), add:
:doc:`RipsNet <ripsnet>` module requires `TensorFlow <https://www.tensorflow.org/install/>`_.
You will be able to see that your bibliography is correct (you can remove the links). I can show you the result on CI once it is built.
mglisse
left a comment
A first batch of comments.
(not all comments require code changes, some are questions)
src/python/doc/ripsnet.rst
tf_data_test = tf.ragged.constant([
    [list(c) for c in list(data_test[i])] for i in range(len(data_test))], ragged_rank=1)
Uh, so we build data_test using numpy (from lists), only to convert it again to lists here, and finally build a tensorflow object from those lists? Would it be possible to skip some of those conversions? We probably don't need to import numpy at all.
Yes, you're right, thanks. I updated it.
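For reference, building the ragged tensor directly from nested Python lists skips the numpy round-trip entirely; a minimal sketch, with hypothetical data values:

import tensorflow as tf

# Hypothetical test data: 3 point clouds of 3 points each in 2D.
data_test = [[[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]],
             [[0.1, 0.9], [0.9, 0.1], [0.4, 0.6]],
             [[0.2, 0.8], [0.8, 0.2], [0.3, 0.7]]]

# tf.ragged.constant accepts nested lists directly, so numpy is not needed.
tf_data_test = tf.ragged.constant(data_test, ragged_rank=1)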
Once RN is properly trained (which we skip in this documentation) it can be used to make predictions.
A possible output is:
It isn't obvious what the example is computing. Maybe adding a comment or 2 would help (or an image). Is data_test a list of 3 point sets in 2D, and is the output some kind of vectorized persistence diagram? It becomes clear once we read the detailed doc, but I think some minimal comments in the example would still make sense.
Good point, I added a sentence to describe it.
    return outputs


class TFBlock(tf.keras.layers.Layer):
This seems very general. Is it related to tf.keras.Sequential, or is there some other utility already providing this composition?
It is possible that there is an existing utility for this composition, but I couldn't find any explanation of how to make it work with ragged inputs.
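For context, TFBlock essentially chains its layers the way a tiny Sequential would; a simplified stand-in (not the PR's exact code) could look like:

import tensorflow as tf

class ComposedBlock(tf.keras.layers.Layer):
    """Applies a list of layers in sequence, like a minimal Sequential."""
    def __init__(self, layers, **kwargs):
        super().__init__(**kwargs)
        self.block_layers = layers

    def call(self, inputs):
        outputs = inputs
        for layer in self.block_layers:
            outputs = layer(outputs)
        return outputs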
| """ | ||
| super().__init__(dynamic=True, **kwargs) | ||
| self._supports_ragged_inputs = True | ||
| self.pop = perm_op |
Using the name pop is confusing (I was wondering from which list we were removing an element); could we stick to perm_op or anything that isn't already an English word with an unrelated meaning?
Yes, I've updated it to avoid confusion, thanks.
def build(self, input_shape):
    super().build(input_shape)
Isn't that exactly what happens if you don't define this function?
I am also trying to understand the difference between this and what you did for DenseRaggedBlock and RipsNet.
I'm also not exactly sure what the effect of this is; I modeled it after an example I saw somewhere. But I've changed it to match the case for RipsNet and DenseRagged. Is that fine, or what would you suggest?
    return outputs


class DenseRaggedBlock(tf.keras.layers.Layer):
This looks identical to TFBlock except for _supports_ragged_inputs? Would TFBlock(ragged=True) make sense? Or could it even be implicit, ragged iff the first layer is?
Indeed, I have changed TFBlock so that it supports ragged inputs when the first layer is an instance of DenseRagged. So I think DenseRaggedBlock is no longer needed. I have commented it out for now, but if the change is confirmed it can be deleted.
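A sketch of the merged behaviour (class name hypothetical; DenseRagged is assumed importable from gudhi.tensorflow as in this PR):

import tensorflow as tf
from gudhi.tensorflow import DenseRagged  # assumed export from this PR

class UnifiedBlockSketch(tf.keras.layers.Layer):
    """Sketch: one block class whose ragged support follows its first layer."""
    def __init__(self, layers, **kwargs):
        super().__init__(**kwargs)
        self.block_layers = layers
        # Ragged iff the first layer is the ragged-aware dense layer.
        self._supports_ragged_inputs = isinstance(layers[0], DenseRagged)

    def call(self, inputs):
        outputs = inputs
        for layer in self.block_layers:
            outputs = layer(outputs)
        return outputs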
elif self.pop == 'sum':
    pop_ragged = PermopRagged(tf.math.reduce_sum)
else:
    raise ValueError(f'Permutation invariant operation: {self.pop} is not allowed, must be "mean" or "sum".')
If perm_op is not one of those 2 strings, should we assume that it is a function, so users can pass tf.math.reduce_max if they want? Or is that useless?
Well, the only concern I have is that if we leave this entirely up to the user, their input function may be such that our theoretical guarantees for RipsNet no longer hold. I'm not sure what the best option is in this case. What do you think?
I think it's good to let the user do whatever they want (as long as it is properly documented). Our theoretical results are of the form "if ..., then ...", but nothing prevents using RipsNet with some other perm_op, and a user may be interested in doing so.
Okay, sounds good, I've removed the requirement that the permutation invariant function has to be 'mean' or 'sum', so that users can specify their own functions.
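A sketch of the relaxed dispatch (helper name hypothetical): the two documented strings remain shortcuts, and anything else is assumed to be a permutation-invariant callable supplied by the user.

import tensorflow as tf

def resolve_perm_op(perm_op):
    # 'mean' and 'sum' keep their documented meaning; any other value is
    # assumed to be a permutation-invariant function, e.g. tf.math.reduce_max.
    if perm_op == 'mean':
        return tf.math.reduce_mean
    if perm_op == 'sum':
        return tf.math.reduce_sum
    return perm_op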
outputs = tf.ragged.map_flat_values(tf.matmul, inputs, self.kernel)
if self.use_bias:
    outputs = tf.ragged.map_flat_values(tf.nn.bias_add, outputs, self.bias)
outputs = tf.ragged.map_flat_values(self.activation, outputs)
This looks like an ad hoc reimplementation of a dense layer, applied to each element. Would it make sense first to define a small network that takes an input of size 2 (if we work with 2d points) and that may be composed of several layers, and only then apply (map) it to all the points in the tensor, so there is a single call to a map function for the whole phi1? It seems conceptually simpler, but it might be slower if tensorflow doesn't realize that it is equivalent.
Sorry, I'm not exactly sure what you mean, or whether it would be faster. But if you have a concrete change in mind, please adapt it directly or let me know.
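A sketch of the suggested approach, assuming 2D points and hypothetical layer sizes: define the point-wise network phi1 once, then map it over the ragged tensor in a single call.

import tensorflow as tf

# Point-wise network applied to every point of every point cloud.
phi1 = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='gelu'),
    tf.keras.layers.Dense(8, activation='gelu'),
])

def apply_phi1(ragged_points):
    # ragged_points: RaggedTensor of shape (batch, None, 2).
    # map_flat_values applies phi1 once to the flat (total_points, 2) values.
    return tf.ragged.map_flat_values(phi1, ragged_points)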
tlacombe
left a comment
Short reading of the doc.
src/python/doc/ripsnet.rst
.. testcode::

    from gudhi.tensorflow import *
It may be better to be explicit, i.e. name the imported functions, or do import gudhi.tensorflow as gtf, just to make clear to the user which functions below indeed come from the gudhi.tensorflow package.
I agree, that makes it a bit clearer.
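For instance, either explicit form works (exported names as used elsewhere in this PR):

from gudhi.tensorflow import RipsNet, TFBlock, DenseRagged, PermopRagged
# or, keeping the provenance visible at each call site:
import gudhi.tensorflow as gtf  # then gtf.RipsNet, gtf.DenseRagged, ...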
activation_fct = 'gelu'
output_activation = 'sigmoid'
dropout = 0
kernel_regularization = 0
Perhaps it is worth commenting (not in detail) on what these hyper-parameters are; in particular, how one is supposed to choose ragged_layers_size and dense_layers_size.
Sure, I added a comment saying they should be tuned according to the specific dataset in order to reach better performance.
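Something along these lines, with purely illustrative values:

ragged_layers_size = [16, 32]   # widths of the point-wise (ragged) layers
dense_layers_size = [32, 25]    # widths of the dense layers after pooling
activation_fct = 'gelu'         # activation of the hidden layers
output_activation = 'sigmoid'   # activation of the output layer
dropout = 0                     # dropout rate; 0 disables dropout
kernel_regularization = 0       # kernel regularization strength; 0 disables it
# All of these should be tuned on the dataset at hand for better performance.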
RN.predict(tf_data_test)

Once RN is properly trained (which we skip in this documentation) it can be used to make predictions.
Would it be cumbersome to provide a sort of minimal working example to train RipsNet?
A user may not be familiar with tensorflow and have no clue on how to train the RipsNet model at this stage. Perhaps just setting the typical optimizer = ..., loss_function = ..., and doing a single step of gradient descent here, would help and not discourage the user unfamiliar with tf? (See the sketch after these suggestions.)
Another option is to write a Tutorial ( https://github.com/GUDHI/TDA-tutorial ) to reproduce, say, the synthetic experiment of the paper (multiple_circles) and to refer to it in this doc (I understand that we don't want this doc to be too long).
A final option (that requires more development) would be to provide a train method to RipsNet that does the job with some default parameters, so that one could get started by simply going for something like
RN = ripsnet.RipsNet(...)
RN.train(train_data)
RN.predict(test_data)
Of course these are just suggestions.
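For instance, the single-gradient-step option could be as small as this, assuming RipsNet behaves as a standard tf.keras.Model and training data tf_data_train / vectorizations_train are at hand (all names here are illustrative):

import tensorflow as tf

# Minimal training sketch: a standard optimizer and loss, then a short fit.
RN.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
           loss='mse')
RN.fit(tf_data_train, vectorizations_train, epochs=1)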
I think the best option is to link to the notebook containing the synthetic examples. This provides a very nice example where one can see the workflow.
I've adapted Mathieu's original tutorial on the synthetic data so that it illustrates the use (including setup and training) of a RipsNet architecture. So, as @tlacombe suggested, I think it may be nice to include and link to this tutorial somewhere.
I've opened a PR (here: GUDHI/TDA-tutorial#59) to include the tutorial notebook so that we can then link to it.
src/python/doc/ripsnet.rst
RN.predict(tf_data_test)

Once RN is properly trained (which we skip in this documentation) it can be used to make predictions.
In this example RipsNet estimates persistence vectorizations (of output size 25) of a list of point clouds (of 3 points) in 2D.
Isn't it simply "a list of 3 point clouds (of 3 points)"?
(Here, I understand it as "each point cloud is made of 3 points", but perhaps my English is just broken.)
Could we also add "yielding 3 vectorizations, hence an output with shape nb_training_data x output_units" for clarity?
Sure, I will add a sentence for clarification, thanks for the suggestion. (In this example each of the point clouds only has 3 points, so I guess you understood it as intended.)
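For clarity, the shapes in the doc example would then read:

preds = RN.predict(tf_data_test)
print(preds.shape)  # (3, 25): nb_test_point_clouds x output_units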
A pull request adding RipsNet to GUDHI: an architecture for fast and robust estimation of persistence diagram vectorizations of point clouds.
Notice that, for the moment, the RipsNet documentation is only available from installation/tensorflow. We will deal with this later, when perslay and PD gradient (other TensorFlow features) are merged.