-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
⭐️ Entity embedder interface is here #286
Conversation
Thanks @EssamWisam for this valuable contribution. Do we know why buildkite tests are failing? |
Thanks @EssamWisam for the diagnostics. That's very helpful. The good news is that a warning we added previously has correctly flagged the issue. The build kite tests include GPU tests, which is why you are not seeing the issue locally, I expect. Reproducibility using RNGs on a GPU is a can of worms, and we need to dodge that. The layers are initialised on the CPU and moved across. It's just the dropout that causes the problems. Can you please try this:
Unrelated comment: It looks like |
I did deep copying and no dropout as you said. When I use the |
Indeed, I can look into this later. |
Okay, sorry, we need something without a dropout layer at all. How about something like |
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great work. Please see what you can do with my suggestions.
Co-authored-by: Anthony Blaom, PhD <[email protected]>
Co-authored-by: Anthony Blaom, PhD <[email protected]>
Co-authored-by: Anthony Blaom, PhD <[email protected]>
…into entity-embedder
@ablaom I took actions regarding all the points. Please check if it's ready now and thank you. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great work, thanks.
One final thing I noticed is that NeuralNetworkBinaryClassifier
is supported by the wrapper, but this is missing from the docs.
Otherwise good to go
I used to think that I do indeed wonder why both exist, now that I realize that the binary classifier is exposed in MLJFlux docs. In any case, I added it in the docs here. Feel free to merge. |
This classifier was added by @tiemvanderdeure . My understanding is that it provides some binary-specific optimisations, which was worth doing as it is such a common use case. |
If a Sigmoid is used then I can imagine that the final matrix multiplication does becomes faster because the corresponding matrix now has one less column. That said, if it's true that they are mathematically equivalent, in the way I understand, then it may be better to automatically switch the Just sharing thoughts. |
Yes, I added the binary classifier because it wasn't possible to use a sigmoid finalizer or binary cross-entropy loss with the existing classifier.
I'm not against this at all. I wasn't aware that they are mathematically equivalent. I think I did some testing and got better results with the binary case (or it was just faster, I don't exactly remember). |
Add an interface for
EntityEmbedder
that can wrap any basic deep learning model in an unsupervised model.PR Checklist