IRPNet
IRPNet is a model that classifies text reviews as positive or negative. It was developed as a bachelor's thesis project.
IRPNet stands for "Is Review Positive?"-determining Network.
The repository contains two folders. One holds the data processing and model training code; the other provides a user interface for interacting with the trained model:
- data processing:
  - Parsing.ipynb - parsing KinoPoisk via the API to extract labeled reviews;
  - Embedding.ipynb - training a Word2Vec model on all extracted KinoPoisk reviews;
  - Preprocessing.ipynb - building a high-quality training dataset for the neural network;
  - Training.ipynb - training the neural network on a GPU in Google Colab;
  - utils.py - helpers such as a timer and progress bar printing;
- interface:
  - main.py - CLI implementation;
  - model.py - classification model implementation;
  - utils.py - helpers such as colored CLI output;
  - parameters/ - neural network parameters (embedding, token dictionary, weights and biases).
Metrics calculated for a dataset of more than 200K KinoPoisk reviews:
Metric | Value |
---|---|
Accuracy | 0.9134621659162305 |
Precision | 0.9914457491022066 |
Recall | 0.909656810116406 |
F1-score | 0.9487919240454142 |
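As a quick consistency check, the F1-score in the table is the harmonic mean of the precision and recall listed there. A minimal sketch in plain Python (the function name `f1_from` is illustrative and not part of the repository):

```python
def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the metrics table
precision = 0.9914457491022066
recall = 0.909656810116406

f1 = f1_from(precision, recall)
print(f"F1-score: {f1:.4f}")
```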
- Kinopoisk Api Unofficial [2.0.1] - parsing reviews;
- Pandas [1.4.2] - working with data;
- Matplotlib [3.4.3] - visualization;
- PyTorch [1.11.0] - neural network;
- Gensim [4.2.0] - Word2Vec model;
- Jupyter Notebook - interactive environment for Python;
- Google Colab - GPU training.
First of all, you need to install PyTorch:
$ pip install torch
Next, download the module, which consists of model.py and the folder with model parameters (parameters/), and copy these files into your project.
After that you can use IRPNet:
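from model import Model

model = Model()

# Sample reviews in Russian; each comment gives an English translation and the reported score
example_reviews = [
    'Все плохо, не советую',    # "Everything is bad, I don't recommend it": Negative 99.392%
    'Все отлично, рекомендую',  # "Everything is great, I recommend it": Positive 99.996%
    'В целом пойдет'            # "Overall, it will do": Positive 64.878%
]

for review in example_reviews:
    pos, neg = model.process_review(review)
    print(review + ':')
    print(' Positive:', pos)
    print(' Negative:', neg)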
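If a single label is more convenient than two scores, the pair returned by `process_review` can be reduced to one. A minimal sketch, assuming (based on the usage example) that the first returned value is the positive score and the second is the negative one. `to_label` is a hypothetical helper, not part of model.py, and the scores below are hard-coded stand-ins rather than model output:

```python
def to_label(pos: float, neg: float) -> str:
    """Pick the class with the higher score (hypothetical helper)."""
    return 'Positive' if pos >= neg else 'Negative'


# Stand-in scores for illustration only; in a real project they would come from
#     pos, neg = model.process_review(review)
print(to_label(0.9, 0.1))
print(to_label(0.2, 0.8))
```

Because the helper only compares the two values, it works the same way whether the scores are fractions or percentages.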