Reelphish

Phishing is the dominant initial access vector across ransomware, business email compromise, and credential-harvesting campaigns. When a suspicious .eml file lands in a SOC ticket — from a user report, an email gateway alert, or an IR artifact dump — triage is often manual: copying headers into a text editor, inspecting URLs one at a time, and cross-referencing threat intel by hand.

Reelphish automates that triage. It parses .eml files, extracts every observable indicator, scores URLs against a weighted ruleset, and optionally enriches results through VirusTotal — producing a structured risk report in seconds.

How It Works

Analysis runs in three sequential stages:

1. Parse — The .eml file is ingested via Python's email.parser.BytesParser, which handles MIME multipart structures (text/plain, text/html, multipart/mixed). Headers are extracted by field. URLs are pulled from both the plain text and HTML body parts using regex, deduplicated, and passed downstream. Attachment metadata (filename, MIME type, byte size) is captured without writing content to disk.

2. Score — Each URL is evaluated against a pattern ruleset. Scoring is additive: signals stack rather than override. A single .tk domain hosting a login keyword over HTTP accumulates enough signal to flag without any external threat intel.

3. Enrich (optional) — When a VirusTotal API key is present, URLs are submitted to the VT v3 API. Engine verdict counts (malicious / suspicious / harmless / undetected) are appended to the per-URL result. Rate limiting for the free tier is enforced internally.
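The parse stage can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: `extract_urls` and `URL_RE` are hypothetical names, and the regex is a deliberately simple assumption that will miss some URL forms.

```python
import re
from email import policy
from email.parser import BytesParser

# Simplified URL pattern (an assumption, not the tool's real regex).
URL_RE = re.compile(r"https?://[^\s\"'<>)]+", re.IGNORECASE)


def extract_urls(raw: bytes) -> list[str]:
    """Collect URLs from text/plain and text/html body parts, deduplicated in order."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    found = []
    for part in msg.walk():
        if part.get_content_type() not in ("text/plain", "text/html"):
            continue
        if part.get_content_disposition() == "attachment":
            continue  # body parts only; attachment content is never scanned
        try:
            text = part.get_content()
        except (LookupError, UnicodeError):
            # Unknown or broken charset: fall back to a lossy UTF-8 decode.
            payload = part.get_payload(decode=True) or b""
            text = payload.decode("utf-8", errors="replace")
        found.extend(URL_RE.findall(text))
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(found))
```

Deduplicating before scoring means a link repeated in the plain text and HTML parts is scored once.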

IOC Extraction

Header Signals

Field	What's Checked
`From`	Sender address extracted; domain isolated for mismatch comparison
`Reply-To`	Domain compared against `From`; divergence flags possible sender spoofing
`Subject`	Captured for report context

Reply-To mismatch is one of the more reliable spoofing tells. An attacker who forges a trusted From address cannot read replies sent to it, so they point Reply-To at a mailbox they actually control, and that divergence surfaces in the headers. The tool compares the domain portion of From against Reply-To and emits a warning when they differ.
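A minimal sketch of that comparison, assuming the check is a case-insensitive domain equality test. The helper names `domain_of` and `reply_to_mismatch` are hypothetical, and the behavior for a missing Reply-To (no warning) is an assumption chosen to match the sample JSON report, where an empty `reply_to` yields `reply_to_mismatch: false`.

```python
from email.utils import parseaddr


def domain_of(header_value: str) -> str:
    """Return the lowercased domain of an address header, or '' if none is found."""
    _, addr = parseaddr(header_value or "")
    if "@" not in addr:
        return ""
    return addr.rpartition("@")[2].lower()


def reply_to_mismatch(from_header: str, reply_to_header: str) -> bool:
    """True when both headers carry a domain and the domains differ."""
    from_domain = domain_of(from_header)
    reply_domain = domain_of(reply_to_header)
    if not from_domain or not reply_domain:
        return False  # nothing to compare, so no warning
    return from_domain != reply_domain


print(reply_to_mismatch(
    "Amazon Support <support@amazon.com>",
    "refund-claims@amaz0n-support.ru",
))  # True
```

An exact domain comparison will also flag legitimate senders that route replies through a different domain, which is why the result is best treated as a warning rather than a verdict.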

URL Scoring

Each URL is scored independently. The final score determines whether the URL is flagged and contributes to the overall risk level.

Signal	Score	Rationale
Suspicious TLD (`.tk`, `.ml`, `.ga`, `.cf`, `.gq`, `.xyz`, `.top`)	+2	Free registrar TLDs disproportionately used for throwaway phishing infrastructure
IP address as host	+3	Legitimate services rarely link to raw IPs; common in kit-based phishing with no domain attribution
Phishing keyword in domain (`verify`, `secure`, `login`, `account`, `confirm`, etc.)	+1 each	Attackers name domains to impersonate legitimacy; keywords accumulate
HTTP instead of HTTPS	+1	Absence of TLS on a credential-harvesting page is a signal in context
VirusTotal malicious detection	+5	Independent corroboration from VirusTotal's 70+ scanning engines; this signal alone reaches the HIGH threshold

Threshold: Score ≥ 3 → SUSPICIOUS
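The offline part of the table can be sketched as a small additive scorer. This is an illustration under stated assumptions, not the contents of `url_analyzer.py`: `score_url`, `is_ip`, and the constant names are hypothetical, the keyword list contains only the five keywords named above (the full list is not shown), and the VirusTotal +5 signal is left out because it needs a network call.

```python
import ipaddress
from urllib.parse import urlsplit

SUSPICIOUS_TLDS = (".tk", ".ml", ".ga", ".cf", ".gq", ".xyz", ".top")
# Only the keywords named in the table; the real list may be longer.
KEYWORDS = ("verify", "secure", "login", "account", "confirm")
THRESHOLD = 3


def is_ip(host: str) -> bool:
    """True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def score_url(url: str) -> tuple[int, list[str]]:
    """Return (score, reasons); signals add up rather than override each other."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    score, reasons = 0, []

    if is_ip(host):
        score += 3
        reasons.append("IP address as host")
    else:
        tld = next((t for t in SUSPICIOUS_TLDS if host.endswith(t)), None)
        if tld:
            score += 2
            reasons.append(f"Suspicious TLD: {tld}")
        for keyword in KEYWORDS:
            if keyword in host:  # keywords are matched against the host, not the path
                score += 1
                reasons.append(f"Suspicious keyword in domain: {keyword}")

    if parts.scheme == "http":
        score += 1
        reasons.append("Uses Http instead of Https")
    return score, reasons


score, reasons = score_url("http://microsoft-secure-login.tk/verify")
print(score, score >= THRESHOLD)  # 5 True
```

The example URL scores 2 (`.tk`) + 1 (`secure`) + 1 (`login`) + 1 (HTTP) = 5; `verify` appears only in the path, so it does not count.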

Risk rollup:

Level	Condition
`HIGH`	Max URL score ≥ 5, or more than 3 suspicious URLs
`MEDIUM`	Max URL score ≥ 3, or more than 1 suspicious URL
`LOW`	Below threshold
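Read top to bottom, the rollup table amounts to the following check. This is a sketch with a hypothetical function name (`risk_level`), assuming the conditions are evaluated in the order listed and that an email with no URLs is rated LOW.

```python
def risk_level(url_scores: list[int], threshold: int = 3) -> str:
    """Roll per-URL scores up into an overall risk level."""
    suspicious = sum(1 for s in url_scores if s >= threshold)
    top = max(url_scores, default=0)  # no URLs at all counts as score 0

    if top >= 5 or suspicious > 3:
        return "HIGH"
    if top >= 3 or suspicious > 1:
        return "MEDIUM"
    return "LOW"


print(risk_level([5, 5]))  # HIGH
```

With the suspicious threshold at 3, any suspicious URL already satisfies the first MEDIUM condition, so the count-based MEDIUM clause only matters if the threshold is configured differently.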

Attachments

Filename, MIME type, and byte size are extracted and surfaced in the report. No content is saved to disk or parsed beyond MIME metadata.
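One way to collect that metadata without touching the filesystem is sketched below. The function name `attachment_metadata` and the dictionary keys are hypothetical, and the sketch assumes attachments are identified by a `Content-Disposition: attachment` header; the payload is decoded in memory only to measure its length.

```python
from email import policy
from email.parser import BytesParser


def attachment_metadata(raw: bytes) -> list[dict]:
    """Return filename, MIME type, and decoded byte size for each attachment."""
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    found = []
    for part in msg.walk():
        if part.is_multipart() or part.get_content_disposition() != "attachment":
            continue
        # Decoded in memory to count bytes; never written out or inspected.
        payload = part.get_payload(decode=True) or b""
        found.append({
            "filename": part.get_filename() or "",
            "content_type": part.get_content_type(),
            "size": len(payload),
        })
    return found
```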

Detection in Action

Scenario: Microsoft account security alert impersonation. A multipart HTML email with a lookalike sender domain and a .tk-hosted payload link embedded in both the plain text and HTML parts.

The email:

From: Microsoft Support <security@microsoft-account-verify.xyz>
To: user@example.com
Subject: Action Required: Unusual Sign-In Activity
Date: Tue, 04 Feb 2026 14:22:00 +0000

The HTML part renders a convincing branded notification — Microsoft logo, a sign-in alert table, and a styled "Secure My Account" CTA button. The sender domain (microsoft-account-verify.xyz) passes a casual visual scan. Every URL in the email, including the logo image request, resolves to the same attacker-controlled host.

Terminal report (color-coded in terminal — red for suspicious/high risk, green for clean/low risk):

+-----------------------------------------------------------------------------+
|                       PHISHING EMAIL ANALYSIS REPORT                        |
|                    Generated: 2026-02-22T09:14:37.882041                    |
+-----------------------------------------------------------------------------+

-------------------------------- EMAIL HEADERS --------------------------------
+---------------------------------------------------------------------+
| From    | Microsoft Support <security@microsoft-account-verify.xyz> |
| To      | user@example.com                                          |
| Subject | Action Required: Unusual Sign-In Activity                 |
| Date    | Tue, 04 Feb 2026 14:22:00 +0000                           |
+---------------------------------------------------------------------+

--------------------------- URL ANALYSIS  (2 found) ---------------------------
+--------------------------------------------------------------------+
| URL                                       |  Score  |    Status    |
|-------------------------------------------+---------+--------------|
| http://microsoft-secure-login.tk/verify   |    5    |  SUSPICIOUS  |
| http://microsoft-secure-login.tk/logo.png |    5    |  SUSPICIOUS  |
+--------------------------------------------------------------------+
  http://microsoft-secure-login.tk/verify
    • Suspicious TLD: .tk
    • Suspicious keyword in domain: secure
    • Suspicious keyword in domain: login
    • Uses Http instead of Https

  http://microsoft-secure-login.tk/logo.png
    • Suspicious TLD: .tk
    • Suspicious keyword in domain: secure
    • Suspicious keyword in domain: login
    • Uses Http instead of Https

+-----------------------------------------------------------------------------+
|                             OVERALL RISK: HIGH                              |
+-----------------------------------------------------------------------------+

Both URLs hit the same infrastructure. The logo request is worth noting: attackers frequently serve tracking pixels and remote images from their phishing domains, so opening the email in a client that loads remote content sends a request that confirms the address is live.

JSON report (--format json):

{
  "timestamp": "2026-02-22T09:14:37.882041",
  "email": {
    "subject": "Action Required: Unusual Sign-In Activity",
    "sender": "Microsoft Support <security@microsoft-account-verify.xyz>",
    "recipient": "user@example.com",
    "date": "Tue, 04 Feb 2026 14:22:00 +0000",
    "attachments": [],
    "reply_to": "",
    "reply_to_mismatch": false
  },
  "url_analyses": {
    "total_urls": 2,
    "results": [
      {
        "url": "http://microsoft-secure-login.tk/verify",
        "is_suspicious": true,
        "suspicion_score": 5,
        "reasons": [
          "Suspicious TLD: .tk",
          "Suspicious keyword in domain: secure",
          "Suspicious keyword in domain: login",
          "Uses Http instead of Https"
        ],
        "virustotal_result": null
      },
      {
        "url": "http://microsoft-secure-login.tk/logo.png",
        "is_suspicious": true,
        "suspicion_score": 5,
        "reasons": [
          "Suspicious TLD: .tk",
          "Suspicious keyword in domain: secure",
          "Suspicious keyword in domain: login",
          "Uses Http instead of Https"
        ],
        "virustotal_result": null
      }
    ]
  },
  "risk_level": "HIGH"
}

JSON output is suitable for piping into a SIEM, case management system, or enrichment pipeline.
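As an example of downstream use, a consumer might pull the overall risk level and the flagged URLs out of a report like the one above. The field names come from that sample; the function name `summarize` is hypothetical.

```python
import json


def summarize(report_json: str) -> tuple[str, list[str]]:
    """Return (risk_level, suspicious URLs) from a JSON report string."""
    report = json.loads(report_json)
    flagged = [
        result["url"]
        for result in report["url_analyses"]["results"]
        if result["is_suspicious"]
    ]
    return report["risk_level"], flagged
```

The returned list can then be forwarded to a blocklist, a ticket comment, or a further enrichment step.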

Reply-To Spoofing Detection

A separate detection path covers sender spoofing via header manipulation. When the From and Reply-To addresses use different domains, the report flags it:

From: Amazon Support <support@amazon.com>
Reply-To: refund-claims@amaz0n-support.ru

⚠  Reply-To mismatch detected — possible sender spoofing

This pattern appears in BEC and refund scams where the attacker impersonates a trusted brand in the From field but needs victim replies to reach infrastructure they control. The From domain passes a visual check; the Reply-To exposes the operation.

Setup

Requirements: Python 3.10+

git clone https://github.com/xsubv/reelphish.git
cd reelphish

python -m venv venv
source venv/bin/activate        # Linux/macOS
# venv\Scripts\activate         # Windows

pip install -r requirements.txt

VirusTotal integration (optional):

cp .env.example .env
# Edit .env: VIRUSTOTAL_API_KEY=your_key_here

Free API keys are available at virustotal.com/gui/my-apikey. The free tier supports 4 requests/minute, so rate limiting is handled automatically via a 15-second inter-request delay (60 s / 4 requests).
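A fixed inter-request delay of this kind can be sketched as a small throttle. This is an assumption about the general approach, not the tool's implementation: the `Throttle` class is hypothetical, and the clock and sleep functions are injectable only so the behavior can be tested without real waiting.

```python
import time


class Throttle:
    """Enforce a minimum interval between consecutive calls to wait()."""

    def __init__(self, min_interval: float = 15.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> None:
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `wait()` before each VirusTotal request keeps the rate at or below 4 requests per minute; the first call returns immediately.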

Usage

# Analyze a single email
python main.py suspicious.eml

# JSON output for downstream processing
python main.py suspicious.eml --format json

# Batch scan a directory
python main.py --directory /path/to/eml/exports/
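For batch mode, file discovery might look like the sketch below. It assumes a non-recursive scan for files with an `.eml` extension (case-insensitive); whether the real `--directory` option recurses into subfolders is not stated here, and `find_eml_files` is a hypothetical name.

```python
from pathlib import Path


def find_eml_files(directory: str) -> list[Path]:
    """List .eml files directly inside a directory, sorted for stable output."""
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(directory)
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".eml"
    )
```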

Architecture

phishing_analyzer/
├── email_parser.py      # MIME parsing, header/URL/attachment extraction
├── url_analyzer.py      # Pattern scoring and VirusTotal enrichment
├── report_generator.py  # Text and JSON report formatting, risk rollup
└── config.py            # Suspicious indicator lists and score thresholds

Security Design

Passive analysis — URLs are scored by inspection, not visited. No outbound requests are made unless VirusTotal is explicitly configured via environment variable.
No attachment execution — Content is not extracted, saved, or parsed beyond MIME metadata (filename, content type, size).
Untrusted input handling — Email files are parsed via BytesParser with charset fallback and error handling for malformed or encoding-broken content.
Credential isolation — API keys are loaded from .env via python-dotenv; the file is excluded from version control.
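The "no outbound requests unless configured" rule can be expressed as a single gate on the environment. This is a sketch under assumptions: `virustotal_key` is a hypothetical helper, and it presumes the `.env` file has already been loaded into the process environment (for example by python-dotenv).

```python
import os


def virustotal_key(env=None):
    """Return the configured API key, or None when enrichment should stay off."""
    env = os.environ if env is None else env
    key = (env.get("VIRUSTOTAL_API_KEY") or "").strip()
    return key or None


# Enrichment, and therefore any network traffic, happens only when a key exists.
if virustotal_key() is None:
    print("VirusTotal enrichment disabled; running offline scoring only")
```

Keeping the decision in one place makes it easy to confirm that every network call sits behind the same check.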

Testing

pytest tests/ -v

The test suite covers nine .eml scenarios: classic phishing, HTML-based phishing, spear phishing with Reply-To mismatch, attachment-based phishing, and legitimate baselines for false-positive validation. CI runs on Python 3.10, 3.11, and 3.12 via GitHub Actions.
