GitHub - vijeshm/Movie-Recommendation-Algorithm: This code handles the recommendation module for the project "Movie Manager" by Sandeep Raju

Recommendation systems provide ratings or suggestions for a user. They are usually built around a model that defines the data structures and algorithms used to perform these tasks. At the back end of any recommendation system is a database of the objects to be recommended. Examples of recommendation systems include Pandora Radio and Netflix.

Our Movie Recommendation Algorithm uses a dynamic database of movies in the form of a graph data structure. The software also maintains a log file that contains the sequence of movies that the user has watched over time. The algorithm takes this sequence as input, modifies the graph based on the sequence, and outputs a list of recommended movies in order of priority.

Now that we've covered how the algorithm works at an abstract level, we'll discuss its specifics. Every movie can be viewed as an object with attributes such as movie name, year of release, director, etc. Two movies are said to be related if they share any of these attributes. This scenario is best represented using a graph. Every movie is represented by a node in the graph. Conversely, every node is identified by a movie name. These nodes have various properties such as Director, Actors, Genre, Rating, etc. We also maintain an attribute called 'weight' on each node, which signifies the strength of the recommendation for that movie.

Two nodes are connected by an edge if they share any of these attributes. To keep the algorithm efficient, each edge is associated with the list of attributes that the two movies have in common. For example, if X and Y are movies directed by the same person, then the edge between X and Y is associated with the keyword 'Director'. If X and Y are movies released in the same year and have a common actor, the edge between them is associated with the keywords 'Year' and 'Actors'. This defines the structure of the dynamic database.
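The edge structure described above can be sketched in Python as follows. This is only an illustration, not the code in runMe.py: the sample movies, the attribute values, and the names `MOVIES` and `build_graph` are all hypothetical, and each attribute is assumed to be stored as a set of values.

```python
from itertools import combinations

# Hypothetical sample data. Attribute names follow the README; values are invented.
MOVIES = {
    "X": {"Director": {"D1"}, "Actors": {"A1", "A2"}, "Year": {2001}},
    "Y": {"Director": {"D1"}, "Actors": {"A3"}, "Year": {2005}},
    "Z": {"Director": {"D2"}, "Actors": {"A2"}, "Year": {2001}},
}


def build_graph(movies):
    """Return {frozenset({a, b}): [shared attribute names]} for related movies."""
    edges = {}
    for a, b in combinations(sorted(movies), 2):
        # An attribute is "shared" when both movies have at least one value in common.
        shared = sorted(
            attr
            for attr in movies[a]
            if attr in movies[b] and movies[a][attr] & movies[b][attr]
        )
        if shared:  # no shared attribute means no edge
            edges[frozenset((a, b))] = shared
    return edges


graph = build_graph(MOVIES)
```

With this sample data, X and Y are linked by 'Director', X and Z by 'Actors' and 'Year', and Y and Z are not linked at all.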

As mentioned earlier, the algorithm uses the sequence of movies watched by the user to generate a recommendation list. Before we delve into how the algorithm generates this list, it is important to observe how users select the movies that they wish to watch next. Different users give different levels of importance to Actors, Directors, Genre, etc. For example, one user may tend to watch movies directed by a single person. In contrast, another user may watch movies featuring a particular actor, irrespective of the director. We account for this difference in priorities using a normalized weighting mechanism, i.e., we allow users to set the priorities that they give to the various attributes. We also provide a default set of priorities that people normally tend to have.
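One way to read "normalized" here is that the priorities are scaled so they sum to 1. The sketch below assumes that interpretation; the default values and the names `DEFAULT_PRIORITIES` and `normalize` are hypothetical and are not taken from the repository.

```python
# Hypothetical default priorities; the actual defaults are not stated in this README.
DEFAULT_PRIORITIES = {"Director": 3, "Actors": 4, "Genre": 2, "Year": 1}


def normalize(priorities):
    """Scale non-negative priorities so that they sum to 1."""
    if any(value < 0 for value in priorities.values()):
        raise ValueError("priorities must be non-negative")
    total = sum(priorities.values())
    if total == 0:
        raise ValueError("at least one priority must be positive")
    return {attr: value / total for attr, value in priorities.items()}


user_priorities = normalize(DEFAULT_PRIORITIES)
```

Normalizing keeps the scores of users with differently scaled preference lists comparable, since only the relative importance of each attribute matters.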

Once the user's attribute priorities have been set, the algorithm generates the recommendation list by traversing the sequence of movies that have been watched. Note that the recommendation is based only on the movies that the user has watched. The intuition behind the algorithm is as follows: when a user has watched a movie, the next movie that they watch will be one of its neighbors in the graph. Among these neighbors, the user's preference will be proportional to the values in their attribute-priority list.

We extend this intuition over the sequence of movies that the user has watched. For each movie in the sequence, we increment the weight of each of its neighbors in proportion to the user's predefined attribute-priority list. For example, if a user has watched a movie X, and Y is one of its neighbors that shares a few attributes with X, we increment the weight of Y in proportion to the user-priority values of the shared attributes. We perform this operation over all the neighbors of all the movies in the movies-watched sequence.

To generate the final recommendation list, we sort the nodes by weight in decreasing order.
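The weight update and the final sort can be sketched as below. This is a minimal illustration under stated assumptions, not the implementation in runMe.py: the adjacency layout, the sample graph, the priority values, and the function name `recommend` are hypothetical. The increment is assumed to be the sum of the priorities of the shared attributes, and already-watched movies are dropped from the output, which the text above does not specify.

```python
from collections import defaultdict

# Hypothetical adjacency form: movie -> {neighbor: [shared attribute names]}.
GRAPH = {
    "X": {"Y": ["Director"], "Z": ["Actors", "Year"]},
    "Y": {"X": ["Director"]},
    "Z": {"X": ["Actors", "Year"], "W": ["Genre"]},
    "W": {"Z": ["Genre"]},
}
# Hypothetical normalized priorities (they sum to 1).
PRIORITIES = {"Director": 0.5, "Actors": 0.25, "Genre": 0.125, "Year": 0.125}


def recommend(graph, priorities, watched):
    """Return [(movie, weight)] sorted by decreasing weight."""
    weights = defaultdict(float)
    for movie in watched:
        for neighbor, shared in graph.get(movie, {}).items():
            # Assumption: the increment is the sum of the shared attributes' priorities.
            weights[neighbor] += sum(priorities.get(attr, 0.0) for attr in shared)
    seen = set(watched)  # assumption: do not recommend what was already watched
    ranked = sorted(
        (m for m in weights if m not in seen),
        key=lambda m: (-weights[m], m),  # ties broken by name for a stable order
    )
    return [(m, weights[m]) for m in ranked]


recommendations = recommend(GRAPH, PRIORITIES, ["X", "Z"])
```

Here Y is ranked above W because it shares a director with a watched movie, and 'Director' carries the largest priority in this sample list.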

Key Points to note:
1. The algorithm and data structures used here are built for scalability. New movies can be dynamically added and linked to the existing ones. Adding new attributes is a fairly simple task as well.
2. The user's attribute-priority list can be a default one, or can be set by the user. It can be static or dynamic. Dynamic implementations open up the possibility of adopting machine learning algorithms for better recommendations. In a dynamic technique, the attribute-priority list itself is modified based on the order in which the user watches the movies. The priorities tend to converge as the same user watches more and more movies.
3. If the input sequence includes information about when each movie was watched, then more weight could be given to movies that were watched recently than to ones watched long ago.
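The recency idea in point 3 could be sketched as follows. The README does not say how recency should be weighted, so the exponential decay, the 30-day half-life, and the names `recency_factor` and `recency_weighted_scores` are assumptions made purely for illustration.

```python
def recency_factor(age_days, half_life_days=30.0):
    """Assumed decay: a movie's influence halves every `half_life_days`."""
    return 0.5 ** (age_days / half_life_days)


def recency_weighted_scores(graph, priorities, watched_with_age, half_life_days=30.0):
    """Accumulate neighbor weights, scaling each watched movie by its recency.

    `graph` maps movie -> {neighbor: [shared attribute names]} and
    `watched_with_age` is a list of (movie, days since it was watched).
    """
    scores = {}
    for movie, age_days in watched_with_age:
        factor = recency_factor(age_days, half_life_days)
        for neighbor, shared in graph.get(movie, {}).items():
            gain = sum(priorities.get(attr, 0.0) for attr in shared)
            scores[neighbor] = scores.get(neighbor, 0.0) + factor * gain
    return scores


# Hypothetical example: X was watched today, Z was watched 30 days ago.
sample_graph = {"X": {"Y": ["Director"]}, "Z": {"Y": ["Director"]}}
sample_scores = recency_weighted_scores(
    sample_graph, {"Director": 1.0}, [("X", 0), ("Z", 30)]
)
```

In this example the movie watched today contributes its full priority to Y, while the one watched a half-life ago contributes half as much.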