`🇺🇸` 🇷🇺
IRPNet

IRPNet is a model that classifies text reviews as positive or negative. It was completed as a bachelor's diploma project.

IRPNet is an abbreviation for "Is Review Positive?"-determining Network.

Repository structure

The repository contains two folders. One is for working with data and training the model; the other is a user interface for interacting with the trained model:

data processing:
- Parsing.ipynb - extracting labeled reviews from KinoPoisk through its API;
- Embedding.ipynb - training a Word2Vec model on all extracted KinoPoisk reviews;
- Preprocessing.ipynb - building a high-quality training dataset for the neural network;
- Training.ipynb - training the neural network on a GPU in Google Colab;
- utils.py - helpers such as a timer and progress bar printing;
interface:
- main.py - CLI implementation;
- model.py - classification model implementation;
- utils.py - helpers such as colored output in the CLI;
- parameters/ - parameters of the neural network (embedding, dictionary of tokens, weights and biases).

Model architecture

Quality metrics

Metrics calculated for a dataset of more than 200K KinoPoisk reviews:

Metric	Value
Accuracy	0.9134621659162305
Precision	0.9914457491022066
Recall	0.909656810116406
F1-score	0.9487919240454142
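
As a quick consistency check, the F1-score above can be reproduced from the reported precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: the function name `f1_score` is an assumption and is not part of this repository.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the metrics table above.
precision = 0.9914457491022066
recall = 0.909656810116406

f1 = f1_score(precision, recall)
print(f"F1-score: {f1:.4f}")
```

The high precision combined with a lower recall means the model rarely labels a negative review as positive, but it misses some positive reviews.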

Technologies

Kinopoisk Api Unofficial [2.0.1] - parsing reviews;
Pandas [1.4.2] - working with data;
Matplotlib [3.4.3] - visualization;
PyTorch [1.11.0] - neural network;
Gensim [4.2.0] - Word2Vec model;
Jupyter Notebook - interactive environment for Python;
Google Colab - GPU training.

Usage

First, install PyTorch:

$ pip install torch

Next, download model.py and the folder with the model parameters, and place them in your project.

After that you can use IRPNet:

from model import Model


model = Model()

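example_reviews = [
    'Все плохо, не советую',   # "Everything is bad, I don't recommend it": Negative 99.392%
    'Все отлично, рекомендую', # "Everything is great, I recommend it": Positive 99.996%
    'В целом пойдет'           # "Overall, it will do": Positive 64.878%
]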

for review in example_reviews:
    pos, neg = model.process_review(review)
    print(review + ':')
    print(' Positive:', pos)
    print(' Negative:', neg)
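
The loop above prints both scores. When only a single label is needed, the two values can be compared directly. The following is a minimal sketch: the helper name `label_review` is hypothetical, and it assumes only that `process_review` returns a `(positive, negative)` pair of comparable numeric scores, as the loop above suggests. To stay runnable without the trained parameters, it uses a stand-in function with made-up scores instead of the real model.

```python
from typing import Callable, Tuple


def label_review(review: str,
                 process_review: Callable[[str], Tuple[float, float]]) -> str:
    """Return 'Positive' or 'Negative' depending on which score is larger."""
    pos, neg = process_review(review)
    return 'Positive' if pos >= neg else 'Negative'


def fake_process_review(review: str) -> Tuple[float, float]:
    """Stand-in for model.process_review; the scores here are made up."""
    return (0.2, 0.8) if 'bad' in review else (0.9, 0.1)


print(label_review('bad movie', fake_process_review))
print(label_review('great movie', fake_process_review))
```

With the real model, pass `model.process_review` as the second argument in place of the stand-in.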
