Project directory structure refactor #5

Open
wants to merge 8 commits into base: master

Roj (Member) commented Mar 16, 2019

This PR organizes the project into the following directories:

  • data - as before
  • preprocessing - converting data files into text files for word2vec to use
  • models - word embedding models and recommendation systems
  • analysis - benchmarking of models and embedding analysis
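
To illustrate what the preprocessing step might produce, here is a minimal sketch that turns sessions into a text corpus with one session per line and track ids separated by spaces, a layout that word2vec tooling commonly reads. The names `sessions_to_lines` and `write_corpus`, and the shape of the input, are assumptions for illustration and are not taken from this PR.

```python
from typing import Iterable, Iterator, Sequence


def sessions_to_lines(sessions: Iterable[Sequence[object]]) -> Iterator[str]:
    """Yield one whitespace-separated line of track ids per session."""
    for session in sessions:
        tokens = [str(track_id) for track_id in session]
        if tokens:  # skip empty sessions, they carry no context
            yield " ".join(tokens)


def write_corpus(sessions: Iterable[Sequence[object]], out_path: str) -> int:
    """Write the lines to out_path and return how many were written."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as handle:
        for line in sessions_to_lines(sessions):
            handle.write(line + "\n")
            count += 1
    return count


# Hypothetical sessions: each inner list is a sequence of track ids.
example_sessions = [[101, 205, 33], [], [7, 101]]
corpus_lines = list(sessions_to_lines(example_sessions))
```

Keeping line generation separate from file writing makes the conversion easy to check without touching the disk.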

Roj added 8 commits December 31, 2018 17:24
PlaylistIterator now accepts a parameter to load track metadata (artist
data only*). idomaarReader caches this data once it is loaded, so
it does not have to be reloaded for each new instance.
Also fixed some bugs related to loading session data.
Before this commit the model did not actually use the metadata.
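
The caching described above could look roughly like the following sketch, which stores the metadata on the class so every instance shares one copy. `CachedMetadataReader`, its `loader` argument, and `fake_loader` are hypothetical stand-ins; the actual idomaarReader code is not shown in this PR and may work differently.

```python
from typing import Callable, Dict, List, Optional


class CachedMetadataReader:
    """Hypothetical reader that loads track metadata at most once."""

    # Class-level cache: shared by all instances, filled on first use.
    _metadata_cache: Optional[Dict[int, str]] = None

    def __init__(self, loader: Callable[[], Dict[int, str]],
                 load_metadata: bool = False) -> None:
        self.track_artists: Dict[int, str] = {}
        if load_metadata:
            if CachedMetadataReader._metadata_cache is None:
                # First instance pays the loading cost.
                CachedMetadataReader._metadata_cache = loader()
            self.track_artists = CachedMetadataReader._metadata_cache


loader_calls: List[int] = []


def fake_loader() -> Dict[int, str]:
    """Stand-in for reading the metadata file (assumed track -> artist)."""
    loader_calls.append(1)
    return {1: "artist_a", 2: "artist_b"}


first = CachedMetadataReader(fake_loader, load_metadata=True)
second = CachedMetadataReader(fake_loader, load_metadata=True)
```

In this sketch the second instance reuses the dictionary loaded by the first, so the loader runs only once.
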
The iterator now works without hard-coding dataset values or
schema. It is less efficient, as it now uses dictionaries instead of
vectors whether or not they are the better choice. However, this means
one does not have to worry about dataset quirks when parsing metadata.
The iterator also has a new registry that serves as a lookup table
for existing entities, so if a session contains a song that already
exists, it just keeps a reference to the existing entity. This also
allows metadata to be preloaded into songs and artists.
An important change is that all elements are constructed and
persisted in cascade, even if you do not use them (users,
for example). It might be a good idea to keep a blacklist or
whitelist of entity types to save later. For now it's enough.
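
A minimal sketch of the registry idea, together with the whitelist suggested above, might look like this. `Entity`, `EntityRegistry`, `get_or_create`, and `to_persist` are illustrative names assumed here, not the identifiers used in the commits.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set, Tuple


@dataclass
class Entity:
    kind: str          # e.g. "track", "artist", "user"
    entity_id: int
    properties: Dict[str, Any] = field(default_factory=dict)


class EntityRegistry:
    """Lookup table that returns the existing entity for a (kind, id) pair."""

    def __init__(self, persist_kinds: Optional[Set[str]] = None) -> None:
        self._entities: Dict[Tuple[str, int], Entity] = {}
        # Whitelist of kinds to persist; None means persist everything.
        self._persist_kinds = persist_kinds

    def get_or_create(self, kind: str, entity_id: int,
                      **properties: Any) -> Entity:
        key = (kind, entity_id)
        entity = self._entities.get(key)
        if entity is None:
            entity = Entity(kind, entity_id)
            self._entities[key] = entity
        entity.properties.update(properties)  # merge any new metadata
        return entity

    def to_persist(self) -> List[Entity]:
        return [e for e in self._entities.values()
                if self._persist_kinds is None
                or e.kind in self._persist_kinds]


registry = EntityRegistry(persist_kinds={"track", "artist"})
registry.get_or_create("track", 42, artist_id=7)        # preloaded metadata
track_in_session = registry.get_or_create("track", 42)  # same object reused
registry.get_or_create("user", 1)                       # built, not persisted
```

Because sessions hold references to registry entries, metadata preloaded on a track is visible wherever that track appears, and the whitelist keeps unused kinds such as users out of the persisted output.
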
(it may not be up-to-date)