0.1.0 Release #13
Open: markdouthwaite wants to merge 27 commits into master from 0.1.0
* ADDED - New Keras-compliant GMF, MLP and NMF models. These now use the Keras model API directly.
* ADDED - `BatchedDataset` class for streaming datasets as generators (limited use until `negative_sample` function is converted to a generator).
* ADDED - `reshape_recommended` utility function for reshaping outputs of a `Model.predict` call into nice neat recommendations.
* ADDED - `xanthus.models.utils.InputEmbeddingBlock` as custom layer to build each of the input blocks for the NMF architectures.
* UPDATED - `he_sampling` has been renamed `create_rankings`, and has the `unravel` parameter for passing rankings to `Model.predict` methods.
* UPDATED - The previous model API defined in the original prerelease is available under the `xanthus.models.legacy` subpackage. This will be removed in a later version.
* UPDATED - `Dataset` class now has `user_dim` and `item_dim` utility methods.
* UPDATED - `Dataset` class now has the `batched` method for generating a `BatchedDataset` version of itself.
* UPDATED - Improved type annotations throughout.
* UPDATED - Various errors and warning messages clarified slightly.
* UPDATED - The `xanthus.models.utils.batched` function has moved to `xanthus.datasets.utils.batched`.
* UPDATED - Bumped TensorFlow version to `tensorflow==2.3.0`.
* ADDED - `movielens` module to `xanthus.datasets`, moved `download` to this module. You can now download and load Movielens with `xanthus.datasets.movielens.download` and `xanthus.datasets.movielens.load` respectively.
* UPDATED - Refactored `create_datasets` to be `build` under `xanthus.datasets.build.build`.
* UPDATED - `examples/advanced_training.py` to use the latest versions of the Xanthus models.
* UPDATED - `setup.py` to bump TensorFlow to `tensorflow==2.3.0`.
…-model Updates neural models to implement the Keras Model API.
* UPDATED - `examples/metadata.py` now uses `fire`, and utilizes the 'new' Keras API.
* UPDATED - Fixed some type annotation errors in `xanthus.models.neural` and `xanthus.models.utils`.
* UPDATED - Bumped version!
…-model Update examples and type annotations.
* UPDATED - `README.md` to include new links and drop superfluous info.
Docs/update docs for release
* ADDED - New 'benchmarking' tool (`ModelManager`, `benchmark`, `save`).
* UPDATED - Applied black.
* UPDATED - Refactored various files to point at the correct import paths.
Adds new benchmarking utils, fixes import path issues.
Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.3.0 to 2.3.1.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.3.0...v2.3.1)
Signed-off-by: dependabot[bot] <[email protected]>
…2.3.1 Bump tensorflow from 2.3.0 to 2.3.1
Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.3.1 to 2.5.0.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.3.1...v2.5.0)
Signed-off-by: dependabot[bot] <[email protected]>
…2.5.0 Bump tensorflow from 2.3.1 to 2.5.0
Xanthus 0.1.0
This is the first release of Xanthus, a Neural Recommendation Model package implemented in Python on top of TensorFlow and utilising the high-level Keras API. Xanthus came into existence as an exercise in implementing and replicating the results of a relatively current ML paper and to try out some of the new features of TensorFlow 2.0 (and changes to the Keras API over the last couple of years!).
Release notes
Here's what's in the box:
Models
Three neural recommender models implemented with the Keras Model API:
* `GeneralizedMatrixFactorization` (GMF) - This model generalizes 'classic' matrix factorization (MF) as a neural model. By using the pointwise negative sampling approach outlined in literature, this model can produce higher performance than some 'classic' MF approaches for some datasets.
* `MultiLayerPerceptron` (MLP) - A model with two input embedding blocks feeding into a 'classic' Multi-Layer Perceptron (MLP) block. As demonstrated in the literature, this architecture benefits from the depth of the model over 'shallower' models such as the GMF model in some cases.
* `NeuralMatrixFactorization` (NMF) - This model combines the GMF and MLP models into a single model. Theoretically with the benefits of both!
Bonus features
Because the models are built on the Keras `Model` API, Xanthus natively supports TensorBoard for model training and monitoring -- plus custom callbacks too. Why not Slack yourself after each training epoch? What could possibly go wrong?
Data Utilities
Getting your data encoded neatly and quickly, generating useful training and evaluation datasets, and getting that data into a format that can be used by your models can be a fiddly and time-consuming process. To alleviate some of these issues and to help you get stuck into tuning your models, Xanthus provides the following utilities:
* `xanthus.datasets.Dataset` - A utility class for quickly and (relatively) efficiently building recommendation-friendly datasets, with a bunch of bundled utilities for manipulating these datasets too.
* `xanthus.datasets.DatasetEncoder` - Another utility class for encoding and decoding datasets, and to aid in preserving consistency across split datasets (i.e. train/test datasets).
* `xanthus.evaluate.split` - An implementation of the 'Recommender Split' implemented as part of the Azure ML Studio. This gives you the option of sampling hold-out interactions, selecting subsets of interactions, and ensuring consistency between the resulting train and test sets.
* `xanthus.evaluate.create_rankings` - An implementation of the common ranking evaluation protocol used for recommendation models where `n` 'positive' items (items a user has interacted with, but weren't present in the training set) are appended to `m` 'negative' items (items a user hasn't interacted with). The model can then be queried to generate a ranking for these items, with the hope that 'positive' items will appear higher in the query results.
Bonus Features
But wait, there's more! Xanthus implements some common recommendation model metric functions including:
* `xanthus.evaluate.metrics.ndcg` - An implementation of the Normalized Discounted Cumulative Gain (NDCG) metric. Yes, that is a reference to Wikipedia.
* `xanthus.evaluate.metrics.hit_ratio` - An implementation of the common 'hit ratio' metric used in many recommendation model evaluation activities (see also `xanthus.evaluate.metrics.precision_at_k`).
* `xanthus.evaluate.metrics.truncated_ndcg` - A special-case NDCG implementation that has some performance optimizations for cases when the target set consists of a single 'positive' item in a set as opposed to the more general case addressed above.
Additionally, to make using these functions easier, you can use:
* `xanthus.evaluate.metrics.score` - A utility function for executing a `map` operation over a set of recommendations, applying a provided metric function, and then returning these scores as a NumPy array. This function provides support for parallel processing too!
Finally, if you're interested in 'coverage' metrics, there's:
* `xanthus.evaluate.metrics.coverage_at_k` - Coverage metrics can be handy for understanding how diverse your model's recommendations are -- exploring product catalogues is often a major motivation for recommenders in the first place, so 'pure' accuracy and ranking metrics (as above) might not give you the full picture.
Notes
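As a concrete note on the evaluation utilities listed above, here is a minimal pure-Python sketch of the ranking protocol and the hit ratio, NDCG and coverage metrics. It is not the Xanthus implementation: the function names, signatures and the binary-relevance definitions below are assumptions made for illustration, and the actual `xanthus.evaluate` functions may behave differently.

```python
import math
import random
from typing import Hashable, Iterable, List, Sequence, Set


def build_candidates(held_out: Sequence[Hashable], interacted: Set[Hashable],
                     catalogue: Sequence[Hashable], n_negatives: int,
                     rng: random.Random) -> List[Hashable]:
    """Hypothetical helper: sample 'negative' items, then append the 'positives'."""
    pool = [item for item in catalogue
            if item not in interacted and item not in held_out]
    negatives = rng.sample(pool, min(n_negatives, len(pool)))
    return negatives + list(held_out)


def hit_ratio(ranked: Sequence[Hashable], positives: Set[Hashable], k: int = 10) -> float:
    """1.0 if any positive item appears in the top-k of the ranking, else 0.0."""
    return 1.0 if any(item in positives for item in ranked[:k]) else 0.0


def ndcg(ranked: Sequence[Hashable], positives: Set[Hashable], k: int = 10) -> float:
    """Binary-relevance NDCG@k: discounted gain divided by the best achievable gain."""
    dcg = sum(1.0 / math.log2(position + 2)
              for position, item in enumerate(ranked[:k]) if item in positives)
    ideal_hits = min(len(positives), k)
    idcg = sum(1.0 / math.log2(position + 2) for position in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def coverage_at_k(recommendations: Iterable[Sequence[Hashable]],
                  n_items: int, k: int = 10) -> float:
    """Fraction of the catalogue that appears in at least one user's top-k list."""
    seen: Set[Hashable] = set()
    for ranked in recommendations:
        seen.update(ranked[:k])
    return len(seen) / n_items
```

In this sketch a model would score the list returned by `build_candidates`, the items would be sorted by score, and the sorted list would be passed to `hit_ratio` or `ndcg`. Coverage is computed across all users' lists at once, which is why it takes a collection of rankings rather than a single one.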
Xanthus has been implemented with the aim of helping new users get a decent neural recommender model working as quickly as possible. From this point of view, it could be a good starting place for folk trying to get started with neural recommendation models.
That said, while neural models sound exciting and might attract attention, you might find that 'classic' recommendation models fit your use case better: 'lightweight' matrix factorization approaches are often simpler, faster and easier to use, so you might do well to look at those first. If you're interested, you should check out:
* `implicit` - Implicit Matrix Factorization
* `LightFM` - Hybrid Matrix Factorization
Disclaimer
The neural architectures implemented in this package are (currently) based directly upon He et al.'s work on Neural Collaborative Filtering. This team has their own repository with the code they used in their paper. It's a good paper, and I encourage you to check it out!
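To give a feel for the simplest of these architectures, here is a minimal NumPy sketch of a GMF-style forward pass as described in the Neural Collaborative Filtering paper: user and item embeddings are multiplied element-wise, weighted by an output layer, and squashed with a sigmoid. This is an illustration only, not the Xanthus `GeneralizedMatrixFactorization` model; the function name `gmf_forward` and all argument names are hypothetical, and the embeddings here are plain arrays rather than trained Keras layers.

```python
import numpy as np


def gmf_forward(user_ids: np.ndarray, item_ids: np.ndarray,
                user_embeddings: np.ndarray, item_embeddings: np.ndarray,
                output_weights: np.ndarray, output_bias: float = 0.0) -> np.ndarray:
    """Score (user, item) pairs with a GMF-style forward pass (illustrative only)."""
    # Look up one embedding vector per user and per item: shape (batch, dim).
    users = user_embeddings[user_ids]
    items = item_embeddings[item_ids]
    # The element-wise product replaces the fixed dot product of classic MF.
    interaction = users * items
    # A weighted sum over the embedding dimensions gives one logit per pair.
    logits = interaction @ output_weights + output_bias
    # The sigmoid maps each logit to an interaction probability in (0, 1).
    return 1.0 / (1.0 + np.exp(-logits))
```

With `output_weights` fixed to all ones, the logits reduce to the plain dot product of the two embeddings, which is the sense in which this model generalizes 'classic' matrix factorization.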