0.1.0 Release #13

markdouthwaite · 2020-08-15T13:26:43Z

Xanthus 0.1.0

This is the first release of Xanthus, a Neural Recommendation Model package implemented in Python on top of TensorFlow and utilising the high-level Keras API. Xanthus came into existence as an exercise in implementing and replicating the results of a relatively current ML paper and to try out some of the new features of TensorFlow 2.0 (and changes to the Keras API over the last couple of years!).

Release notes

Here's what's in the box:

Models

Three neural recommender models implemented with the Keras Model API:

GeneralizedMatrixFactorization (GMF) - This model generalizes 'classic' matrix factorization (MF) as a neural model. By using the pointwise negative sampling approach outlined in the literature, this model can outperform some 'classic' MF approaches on some datasets.
MultiLayerPerceptron (MLP) - A model with two input embedding blocks feeding into a 'classic' Multi-Layer Perceptron (MLP) block. As demonstrated in the literature, the added depth can give this architecture an edge over 'shallower' models such as GMF in some cases.
NeuralMatrixFactorization (NMF) - This model combines the GMF and MLP models into a single model. Theoretically with the benefits of both!
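
To make the GMF idea a little more concrete, here is a minimal forward-pass sketch in plain Python. It is not the Xanthus Keras implementation: the helper names (`init_embeddings`, `gmf_score`), the embedding size, and the random initialisation are all assumptions made for illustration. It only shows the core computation: an element-wise product of a user and an item embedding, followed by a weighted sum and a sigmoid.

```python
import math
import random


def init_embeddings(n_rows, dim, rng):
    """Create a small random embedding table (one vector per user or item)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(n_rows)]


def gmf_score(user_vec, item_vec, weights, bias=0.0):
    """Hypothetical GMF forward pass: sigmoid(w . (p_u * q_i) + b)."""
    # Element-wise product of the user and item embeddings.
    interaction = [p * q for p, q in zip(user_vec, item_vec)]
    # Weighted sum of the interaction vector, then a sigmoid.
    logit = sum(w * x for w, x in zip(weights, interaction)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


rng = random.Random(42)
user_embeddings = init_embeddings(n_rows=5, dim=8, rng=rng)
item_embeddings = init_embeddings(n_rows=10, dim=8, rng=rng)

# With unit weights the logit is a plain dot product, as in 'classic' MF.
output_weights = [1.0] * 8

score = gmf_score(user_embeddings[0], item_embeddings[3], output_weights)
print(f"predicted interaction probability: {score:.3f}")
```

In a trained model the embeddings and output weights would be learned from interaction data rather than drawn at random.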

Bonus features

Metadata support - The models above (plus supporting utils) are designed to make it easy to quickly introduce metadata into your recommendation models. This means Xanthus natively supports 'hybrid' recommendations (interaction data + user/item metadata). This is mentioned in He et al's work, but not implemented or assessed there. Here's an example if you're interested.
TensorBoard support - By using the Keras Model API, Xanthus natively supports TensorBoard for model training and monitoring -- plus custom callbacks too. Why not Slack yourself after each training epoch? What could possibly go wrong?

Data Utilities

Getting your data encoded neatly and quickly, generating useful training and evaluation datasets, and getting that data into a format your models can use can be a fiddly and time-consuming process. To alleviate some of these issues and to help you get stuck into tuning your models, Xanthus provides the following utilities:

xanthus.datasets.Dataset - A utility class for quickly and (relatively) efficiently building recommendation-friendly datasets, with a bunch of bundled utilities for manipulating these datasets too.
xanthus.datasets.DatasetEncoder - Another utility class for encoding and decoding datasets, and to aid in preserving consistency across split datasets (i.e. train/test datasets).
xanthus.evaluate.split - An implementation of the 'Recommender Split' found in Azure ML Studio. This gives you the option of sampling hold-out interactions, selecting subsets of interactions, and ensuring consistency between the resulting train and test sets.
xanthus.evaluate.create_rankings - An implementation of the common ranking evaluation protocol used for recommendation models, where n 'positive' items (items a user has interacted with, but that were held out of the training set) are appended to m 'negative' items (items a user hasn't interacted with). The model can then be queried to generate a ranking for these items, with the hope that 'positive' items will appear higher in the query results.
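
The ranking protocol described for `create_rankings` can be sketched roughly as follows. This is an illustration only, not the Xanthus function: `build_ranking_candidates` and its arguments are hypothetical names, and plain dicts and sets stand in for the `Dataset` objects the package works with.

```python
import random


def build_ranking_candidates(held_out, seen, all_items, n_negatives, seed=0):
    """For each user, append held-out positives to sampled negative items.

    held_out: dict of user -> list of held-out 'positive' items
    seen: dict of user -> set of items the user interacted with in training
    """
    rng = random.Random(seed)
    candidates = {}
    for user, positives in held_out.items():
        # Negatives are items the user has never interacted with.
        excluded = seen.get(user, set()) | set(positives)
        pool = sorted(item for item in all_items if item not in excluded)
        negatives = rng.sample(pool, min(n_negatives, len(pool)))
        candidates[user] = negatives + list(positives)
    return candidates


all_items = set(range(20))
seen = {"a": {0, 1, 2}, "b": {3, 4}}
held_out = {"a": [5], "b": [6]}

candidates = build_ranking_candidates(held_out, seen, all_items, n_negatives=4)
for user, items in candidates.items():
    print(user, items)
```

A model would then score every candidate item for each user, and the evaluation checks how highly the held-out positives rank among the sampled negatives.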

Bonus Features

But wait, there's more! Xanthus implements some common recommendation model metric functions including:

xanthus.evaluate.metrics.ndcg - An implementation of the Normalized Discounted Cumulative Gain (NDCG) metric. Yes, that is a reference to Wikipedia.
xanthus.evaluate.metrics.hit_ratio - An implementation of the common 'hit ratio' metric used in many recommendation model evaluation activities (see also xanthus.evaluate.metrics.precision_at_k).
xanthus.evaluate.metrics.truncated_ndcg - A special-case NDCG implementation that has some performance optimizations for cases when the target set consists of a single 'positive' item in a set as opposed to the more general case addressed above.
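
For a feel of what these metrics compute, here is a small, self-contained sketch using binary relevance. The function names and signatures below are assumptions for illustration and may not match the Xanthus ones; `single_positive_ndcg` shows the kind of shortcut that becomes available when there is exactly one 'positive' item.

```python
import math


def hit_ratio(actual, predicted, k=10):
    """1.0 if any relevant item appears in the top-k predictions, else 0.0."""
    return 1.0 if set(actual) & set(predicted[:k]) else 0.0


def ndcg(actual, predicted, k=10):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    relevant = set(actual)
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(predicted[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


def single_positive_ndcg(positive, predicted, k=10):
    """With one relevant item the ideal DCG is 1, so NDCG = 1 / log2(rank + 2)."""
    top_k = predicted[:k]
    if positive not in top_k:
        return 0.0
    return 1.0 / math.log2(top_k.index(positive) + 2)


predicted = [7, 3, 9, 1]  # items ranked by a model, best first
print(hit_ratio([9], predicted, k=3))       # 1.0
print(ndcg([9], predicted, k=3))            # 0.5
print(single_positive_ndcg(9, predicted))   # 0.5
```

In the single-positive case both NDCG variants agree, but the shortcut avoids computing the ideal DCG at all.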

Additionally, to make using these functions easier, you can use:

xanthus.evaluate.metrics.score - A utility function for executing a map operation over a set of recommendations, applying a provided metric function, and then returning these scores as a NumPy array. This function provides support for parallel processing too!
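
The 'map a metric over a set of recommendations' idea looks roughly like the sketch below. It is a stand-in rather than the Xanthus `score` function: `score_all`, `hit`, and `n_workers` are hypothetical names, the result is a plain list rather than a NumPy array, and a thread pool is assumed here purely to illustrate the parallel option.

```python
from concurrent.futures import ThreadPoolExecutor


def hit(actual, predicted):
    """Toy metric: 1.0 if any relevant item was recommended, else 0.0."""
    return 1.0 if set(actual) & set(predicted) else 0.0


def score_all(metric, actual, predicted, n_workers=1):
    """Apply metric(actual_i, predicted_i) to every pair, optionally in parallel."""
    if n_workers <= 1:
        return [metric(a, p) for a, p in zip(actual, predicted)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map preserves input order, so scores line up with users.
        return list(pool.map(metric, actual, predicted))


actual = [[1], [2], [3]]
predicted = [[1, 5, 6], [7, 8, 9], [4, 3, 0]]

scores = score_all(hit, actual, predicted, n_workers=2)
print(scores, sum(scores) / len(scores))
```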

Finally, if you're interested in 'coverage' metrics, there's:

xanthus.evaluate.metrics.coverage_at_k - Coverage metrics can be handy for understanding how diverse your model's recommendations are -- exploring product catalogues is often a major motivation for recommenders in the first place, so 'pure' accuracy and ranking metrics (as above) might not give you the full picture.
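
As a rough sketch of what a catalogue coverage metric measures (the signature below is an assumption, not necessarily the one Xanthus uses), you can count how much of the catalogue shows up in at least one user's top-k recommendations:

```python
def coverage_at_k(recommendations, catalogue, k=10):
    """Fraction of the catalogue that appears in at least one top-k list."""
    catalogue = set(catalogue)
    recommended = set()
    for ranked_items in recommendations:
        recommended.update(ranked_items[:k])
    return len(recommended & catalogue) / len(catalogue)


# One ranked list per user, best item first.
recommendations = [[1, 2, 3], [2, 3, 4], [1, 2, 5]]
catalogue = range(10)

print(coverage_at_k(recommendations, catalogue, k=2))  # 0.3
```

Here only items 1, 2 and 3 ever reach a top-2 list, so the model surfaces 30% of the ten-item catalogue no matter how accurate each individual ranking is.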

Notes

Xanthus has been implemented with the aim of helping new users get a decent neural recommender model working as quickly as possible. From this point of view, it could be a good starting place for folk trying to get started with neural recommendation models.

That said, while neural models sound exciting and might attract attention, you might find that 'classic' recommendation models fit your use case better: 'lightweight' matrix factorization approaches are often simpler, faster and easier to use, so you might do well to look at those first. If you're interested, you should check out:

implicit - Implicit Matrix Factorization
LightFM - Hybrid Matrix Factorization

Disclaimer

The neural architectures implemented in this package are (currently) based directly upon He et al's work on Neural Collaborative Filtering. The authors have their own repository with the code they used in their paper. It's a good paper, and I encourage you to check it out!

* ADDED - New Keras-compliant GMF, MLP and NMF models. These now use the Keras model API directly.
* ADDED - `BatchedDataset` class for streaming datasets as generators (limited use until `negative_sample` function is converted to a generator).
* ADDED - `reshape_recommended` utility function for reshaping outputs of a `Model.predict` call into nice neat recommendations.
* ADDED - `xanthus.models.utils.InputEmbeddingBlock` as custom layer to build each of the input blocks for the NMF architectures.
* UPDATED - `he_sampling` has been renamed `create_rankings`, and has the `unravel` parameter for passing rankings to `Model.predict` methods.
* UPDATED - The previous model API defined in the original prerelease is available under the `xanthus.models.legacy` subpackage. This will be removed in a later version.
* UPDATED - `Dataset` class now has `user_dim` and `item_dim` utility methods.
* UPDATED - `Dataset` class now has the `batched` method for generating a `BatchedDataset` version of itself.
* UPDATED - Improved type annotations throughout.
* UPDATED - Various errors and warning messages clarified slightly.
* UPDATED - The `xanthus.models.utils.batched` function has moved to `xanthus.datasets.utils.batched`.
* UPDATED - Bumped TensorFlow version to `tensorflow==2.3.0`.

* ADDED - `movielens` module to `xanthus.datasets`, moved `download` to this module. You can now download and load Movielens with `xanthus.datasets.movielens.download` and `xanthus.datasets.movielens.load` respectively.
* UPDATED - Refactored `create_datasets` to be `build` under `xanthus.datasets.build.build`.
* UPDATED - `examples/advanced_training.py` to use the latest versions of the Xanthus models.
* UPDATED - `setup.py` to bump TensorFlow to `tensorflow==2.3.0`.


Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.3.0 to 2.3.1.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.3.0...v2.3.1)

Signed-off-by: dependabot[bot]

Bumps [tensorflow](https://github.com/tensorflow/tensorflow) from 2.3.1 to 2.5.0.
- [Release notes](https://github.com/tensorflow/tensorflow/releases)
- [Changelog](https://github.com/tensorflow/tensorflow/blob/master/RELEASE.md)
- [Commits](tensorflow/tensorflow@v2.3.1...v2.5.0)

Signed-off-by: dependabot[bot]

douthwaite-io and others added 15 commits July 16, 2020 08:10

Bump versions and update setup.py.

f22ea9b

Bump versions and update setup.py.

26b0072

Update python-publish.yml

9d9b631

Bump version.

4ddbdc2

Merge branch '0.1.0' of https://github.com/markdouthwaite/xanthus int…

53dea1e

…o 0.1.0

Update README.md

1a250dd

Update README.md

8f201bc

Update issue templates

591aa3e

* UPDATED - Type annotations in xanthus.datasets.movielens.

0e52ca4

Merge pull request #11 from markdouthwaite/feature/inherit-from-keras…

22e8547

…-model Updates neural models to implement the Keras Model API.

* UPDATED - examples/advanced_training to use fire, applied black.

b41c2d5

* UPDATED - `examples/metadata.py` now uses `fire`, and utilizes the 'new' Keras API. * UPDATED - Fixed some type annotation errors in `xanthus.models.neural` and `xanthus.models.utils`. * UPDATED - Bumped version!

* UPDATED - examples/advanced_training.py to remove unnecessary Ker…

a31cc1e

…as imports.

Merge pull request #12 from markdouthwaite/feature/inherit-from-keras…

503c3a8

…-model Update examples and type annotations.

markdouthwaite added the 0.1.0 (Related to 0.1.0 release) and umbrella (This issue relates to a collection of possible issues and acts as the parent issue for these.) labels Aug 15, 2020

markdouthwaite added this to the 0.1.0 milestone Aug 15, 2020

markdouthwaite self-assigned this Aug 15, 2020

douthwaite-io and others added 11 commits August 15, 2020 14:52

* UPDATED - getting-started.ipynb to reflect 0.1.0rc1 changes.

f3a83ad

* UPDATED - `README.md` to include new links and drop superfluous info.

* UPDATED - Regenerated getting-started.md.

e4b15ce

Merge branch '0.1.0' into docs/update-docs-for-release

f771573

Merge pull request #14 from markdouthwaite/docs/update-docs-for-release

059a8d7

Docs/update docs for release

* ADDED - New 'basic' training example.

ed88387

* ADDED - New 'benchmarking' tool (`ModelManager`, `benchmark`, `save`). * UPDATED - Applied black.

* ADDED - New benchmarking scripts & notebook.

07c0f1c

* UPDATED - Refactored various files to point at the correct import paths.

Update __init__.py

5ad1ed7

Merge pull request #15 from markdouthwaite/docs/update-docs-for-release

9f93a6f

Adds new benchmarking utils, fixes import path issues.

Merge pull request #16 from markdouthwaite/dependabot/pip/tensorflow-…

4de18d6

…2.3.1 Bump tensorflow from 2.3.0 to 2.3.1

Merge pull request #17 from markdouthwaite/dependabot/pip/tensorflow-…

8d4e64b

…2.5.0 Bump tensorflow from 2.3.1 to 2.5.0
