ReviewRadar is a web application that classifies reviews as real or fake based on a user-defined strictness level. It uses crawl4ai to scrape reviews, an OpenAI LLM extraction strategy to structure them as JSON, and a pretrained SVC classifier (88% accuracy) for classification.
Users can submit URLs to review pages or individual reviews for immediate classification. An open API is also available for integrating the fake review detection service into other applications.
- Key Differences Between Fake and Real Reviews
- Features
- API Endpoint
- Model Approach
- How It Works
- Usage Example
- Installation & Setup
- Contributing
- License
- Contact
- Overly Positive Language (Fake): Exaggerated praise with phrases like "Love this!", "Amazing!", and "10 Stars", often without specifics.
- Repetitive Phrasing (Fake): Common phrases like "I love" and "the only problem is" repeated across reviews.
- Personal Pronouns (Fake): Frequent use of "I", "we", or "my" to create a forced personal connection.
- Contradictory/Incomplete Sentences (Fake): Some reviews seem cut off or make contradictory statements.
- Time Mentions (Fake): Fake reviews often mention usage duration to imply long-term experience.
- Specific & Critical Feedback (Real): Real reviews focus on specific product features, highlighting both pros and cons.
- Practicality Focus (Real): Emphasis on product functionality and day-to-day use.
- Balanced Opinions (Real): Mix of positive and negative points.
- Short & Objective (Real): Concise and direct feedback, avoiding unnecessary embellishment.
- Review Page Analysis: Submit URLs containing reviews to be scraped and classified as real or fake based on selected strictness levels.
- Individual Review Testing: Input single reviews and receive immediate authenticity feedback.
- Data Visualization: View insightful visualizations like pie charts and histograms to analyze the distribution of real and fake reviews.
- Open API: Access an API endpoint to integrate fake review detection into your own applications.
```
POST /api/openapi-verify-review
```

Developers can verify reviews through a simple JSON interface. The input includes the review text and a threshold level (`high`, `medium`, or `low`), and the output is a classification of the review as real or fake (a sample call follows the request/response examples below).

Request:

```json
{
  "review": "Great product, highly recommend!",
  "threshold": "high"
}
```

Response:

```json
{
  "analyzed_reviews": {
    "review_text": "Great product, highly recommend!",
    "is_fake": false,
    "confidence": 0.92
  }
}
```
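For example, the endpoint can be called with Python's `requests` library. This is a minimal sketch: it assumes the backend is running locally on its default port (8000) and that no authentication is required.

```python
import requests

# Assumed local deployment; adjust the host/port to match your setup.
API_URL = "http://localhost:8000/api/openapi-verify-review"

payload = {
    "review": "Great product, highly recommend!",
    "threshold": "high",  # one of "high", "medium", or "low"
}

response = requests.post(API_URL, json=payload)
response.raise_for_status()

result = response.json()["analyzed_reviews"]
print(f"Fake: {result['is_fake']} (confidence: {result['confidence']:.2f})")
```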
Objective: Classify reviews as fake or genuine using text-based features.

- Support Vector Classifier (SVC): Chosen for its effectiveness in text classification tasks.
- Data Preparation:
  - Loaded a dataset of reviews and preprocessed the text (removing punctuation, filtering stop words).
  - Split the dataset into training and testing sets.
- Text Processing:
  - Prepared reviews for vectorization using a text processing function.
  - Applied CountVectorizer to convert the text into a bag-of-words model.
- Model Training:
  - Created a pipeline combining:
    - CountVectorizer: Converts text into numerical vectors.
    - TF-IDF Transformer: Scales vectors by term frequency-inverse document frequency.
    - SVC Classifier: Trained on the vectorized data.
  - Trained the pipeline on the preprocessed training data.
- Model Saving:
  - Saved the trained model using joblib for later predictions.
- Achieves 88% accuracy in classifying reviews as real or fake.
- Allows users to analyze new reviews by loading the saved model for inference (see the sketch below).
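The training and saving steps above roughly correspond to the following scikit-learn sketch. It is illustrative only: the dataset path, column names, and SVC settings are assumptions, not the project's actual training script.

```python
import joblib
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical dataset: a CSV with a review text column and a real/fake label.
df = pd.read_csv("reviews.csv")
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42
)

# CountVectorizer -> TF-IDF -> SVC, mirroring the pipeline described above.
pipeline = Pipeline([
    ("bow", CountVectorizer(stop_words="english")),  # bag-of-words vectors
    ("tfidf", TfidfTransformer()),                   # rescale by TF-IDF
    ("clf", SVC(probability=True)),                  # the classifier itself
])

pipeline.fit(X_train, y_train)
print(f"Test accuracy: {pipeline.score(X_test, y_test):.2f}")

# Persist the trained pipeline for later inference.
joblib.dump(pipeline, "review_classifier.joblib")

# Later, load the saved model and classify a new review.
model = joblib.load("review_classifier.joblib")
print(model.predict(["Great product, highly recommend!"]))
```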
- User Submission: Submit a URL containing reviews or an individual review.
- Scraping and Parsing: Scrapes reviews from the URL using crawl4ai and processes the content through an OpenAI LLM to generate structured JSON data (see the sketch after this list).
- Review Classification: Passes the structured data through the pretrained SVC classifier to identify fake reviews.
- Visualization: Displays the results with visual insights through pie charts and histograms.
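The scraping-and-parsing step can be sketched with crawl4ai roughly as follows. This is a hedged illustration rather than the project's actual code: the crawl4ai interface differs between versions, and the schema, instruction text, and model choice here are assumptions.

```python
import asyncio
import json
import os

from crawl4ai import AsyncWebCrawler
from crawl4ai.extraction_strategy import LLMExtractionStrategy
from pydantic import BaseModel


class Review(BaseModel):
    # Hypothetical schema for a single extracted review.
    review_text: str
    rating: float | None = None


async def scrape_reviews(url: str) -> list[dict]:
    # Older-style LLMExtractionStrategy arguments; newer crawl4ai releases
    # configure the LLM differently, so adjust to your installed version.
    strategy = LLMExtractionStrategy(
        provider="openai/gpt-4o-mini",            # assumed model choice
        api_token=os.environ["OPENAI_API_KEY"],
        schema=Review.model_json_schema(),
        extraction_type="schema",
        instruction="Extract every customer review on the page as JSON.",
    )
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url=url, extraction_strategy=strategy)
    return json.loads(result.extracted_content)


if __name__ == "__main__":
    reviews = asyncio.run(scrape_reviews("https://example.com/product/reviews"))
    print(f"Scraped {len(reviews)} reviews")
```

The list of structured reviews produced this way can then be passed to the saved classifier from the Model Approach section.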
Submit a link to a review page to receive a visual breakdown of real vs. fake reviews in pie chart and histogram formats based on the chosen strictness level (`high`, `medium`, or `low`). Alternatively, manually input a review to check its authenticity.
To reproduce ReviewRadar on your local system, follow the steps below for both the frontend and backend setups.

Frontend:

- Clone the Repository:

  ```bash
  git clone https://github.com/muzzlol/review-radar.git
  ```

- Navigate to the Frontend Directory:

  ```bash
  cd review-radar/frontend
  ```

- Install Dependencies:

  ```bash
  npm install
  ```

- Start the Frontend Application:

  ```bash
  npm start
  ```

  The application will run on `http://localhost:3000` by default.

Backend:

- Navigate to the Backend Directory:

  ```bash
  cd ../backend
  ```

- Set Up Environment Variables:
  - Create a `.env` file in the `backend` directory based on the `.env.example` provided.
  - Add the necessary environment variables, such as your OpenAI API key:

    ```bash
    OPENAI_API_KEY=your_openai_api_key
    ```

- Choose Your Installation Method:
  - Using `pip`:
    - Create a Virtual Environment:

      ```bash
      python3.12 -m venv venv
      ```

    - Activate the Virtual Environment:
      - On macOS/Linux: `source venv/bin/activate`
      - On Windows: `venv\Scripts\activate`
    - Install Dependencies:

      ```bash
      pip install --upgrade pip
      pip install -r requirements.txt
      ```

  - Using `Poetry`:
    - Install Poetry (if not already installed):

      ```bash
      curl -sSL https://install.python-poetry.org | python3 -
      ```

    - Set Python Version:

      ```bash
      poetry env use python3.12
      ```

    - Check Python Version & Environment Path:

      ```bash
      poetry env info
      ```

      Take note of the `Path` field in the output; you can use this path as the Python interpreter in VS Code or your preferred editor's settings.

    - Activate the Virtual Environment:

      ```bash
      poetry shell
      ```

    - Install Dependencies:

      ```bash
      poetry install
      ```

- Run the Backend Server:

  ```bash
  poetry run uvicorn main:app --reload
  ```

  The backend server will run on `http://localhost:8000` by default.
Contributions are welcome! Please open an issue or submit a pull request for any changes or improvements. Future work includes upgrading the model to a deep learning approach that better captures the sequential relationships between words and their positions in the review text.
MIT License. See `LICENSE` for more information.