Security Scanner & Threat Audit Tool for Veeam Backup & Replication Restore Points using the Veeam Data Integration API.
Imagine you’re responsible for your company’s security. You have backups. You trust them. But what if malware or suspicious files are hiding right now or were silently active weeks ago? Wouldn’t it be great to look inside your backups, like a time machine, and search for threats?
That’s exactly what Retro Hunter does! It connects directly to your Veeam backups and scans files from past restore points for signs of malware, hacked binaries, and other suspicious behavior. And the best part: it comes with a beautiful, easy-to-use dashboard that shows you everything.
Retro Hunter is a lightweight Python-based toolkit that scans Veeam Backup & Replication restore points for malware, suspicious binaries, and unusual patterns. It allows you to investigate historical restore points and perform regular scans over time. By integrating with threat intelligence sources like MalwareBazaar, it continuously checks known file hashes and monitors the scanning results.
- Search for Veeam backups from a specific host or Veeam Backup Repository
- Mounts the manually selected backup via the Veeam Data Integration API or mounts the latest restore point from all supported platforms from a Veeam Backup Repository
- Scans mounted Veeam restore points
- Saves useful metadata from the presented files
- Parallelized scanning using Python’s multiprocessing for fast performance
- Tracks file changes across restore points
- Detects known malware using hash lookups from MalwareBazaar (offline)
- Identifies out-of-place LOLBAS (Living Off the Land Binaries and Scripts)
- Optionally applies YARA rules to selected file types during scan
- Optionally scans the Windows Eventlog for specific event-ids (Security/PowerShell/Sysmon) 🔴 Enhanced in v2.4
- Optionally scans specific Windows Registry Hives 🔴 Expanded in v3.0.1
- NAS AV Scan with all features from this script & YARA scan capabilities. Scan results visible in Streamlit dashboard 🔴 New in v2.3
- Uses PostgreSQL as the database.
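To illustrate the "out-of-place LOLBAS" idea from the feature list, here is a minimal sketch. It assumes a small, hypothetical allow-list (`EXPECTED_DIRS`) and Windows-style paths relative to the volume root; the real scanner may use a different data source and different matching logic.

```python
from pathlib import PureWindowsPath

# Hypothetical allow-list: binary name -> directories where it normally lives.
# A real scan would rely on a much larger list (e.g. from the LOLBAS project).
EXPECTED_DIRS = {
    "certutil.exe": {r"windows\system32", r"windows\syswow64"},
    "mshta.exe": {r"windows\system32", r"windows\syswow64"},
    "rundll32.exe": {r"windows\system32", r"windows\syswow64"},
}


def is_out_of_place(path: str) -> bool:
    """Return True if a tracked binary sits outside its expected directory.

    `path` is a Windows-style path, with or without a drive letter.
    """
    p = PureWindowsPath(path)
    expected = EXPECTED_DIRS.get(p.name.lower())
    if expected is None:
        return False  # not a tracked binary, nothing to report
    # Build the parent directory without the drive/root part, lower-cased.
    parent = "\\".join(part.lower() for part in p.parent.parts if part != p.anchor)
    return parent not in expected
```

A copy of certutil.exe under `C:\Users\Public` would be reported, while the copy in `C:\Windows\System32` would not.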
Run the setup script, and start scanning! Execute setup.sh with the path to your malwarebazaar.csv file to initialize the environment and databases. Once done, you can use retro-hunter.py to analyze mounted restore points for malware, LOLBAS, and YARA hits, index all executables and scripts, scan the Windows Registry, or scan specific Windows Eventlogs. Finally, open the dashboard at https:// to explore the results.
Important: You must add the Linux server on which this script is executed to the backup infrastructure.
Tested on Ubuntu 24.04
Version: 3.0.1
Requires: Veeam Backup & Replication & Linux & Python 3.1+
Author: Stephan "Steve" Herzig
The setup process is simplified with the setup.sh script. You only need to download the malwarebazaar.csv file (see the Technical Details of the Scripts) and run the script on your Linux host. During setup, you will be asked for the Veeam Backup & Replication Server hostname, the REST API user, and a password. The password is encrypted with Fernet before it is stored. The script now also asks for the SMB password for the NAS Scanner. 🔴 New in v2.3
./setup.sh /path/to/malwarebazaar.csv /path/to/retro-hunter-dir
A Docker setup is provided to run Nginx, the web frontend, the backend (REST API), and the PostgreSQL database as containers.
Some preparations are required for this script to run.
- You must add the Linux server on which this script is executed to the backup infrastructure.
- Create a separate user for REST API access. The user must be assigned the “Veeam Backup Administrator” role.
The following Python modules are not part of the standard library and must be installed separately using pip. A requirements.txt contains the necessary modules.
- python3-pip
- python3-venv
- python3-evtx
- python3-yara
- python3-colorama
- python3-requests
- python3-dateutil
- python3-magic
- python3-pefile
- python3-psycopg2
- python3-dotenv
- python3-cryptography (for cryptography.fernet.Fernet)
- and more for the Docker containers (requirements.txt)
Save additional YARA rule files (file extensions .yar and .yara) in the yara_rules directory inside the script folder. A sample rule is already stored there.
The webpage includes a YARA rule generator that creates rules based on stored executables with high entropy and PE metadata.
The malwarebazaar table contains the SHA256 values of the malware files. Download the complete data dump and unzip the CSV file as malwarebazaar.csv to the folder where the setup.sh script is stored. The setup.sh script will import the values into the database.
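The offline hash lookup can be pictured with the following sketch. It is not the import logic of setup.sh: it assumes only that comment lines in the CSV start with `#` and that each data row contains one 64-character hexadecimal SHA256 field, and it keeps the hashes in memory instead of a database table.

```python
import csv
import hashlib
import re
from typing import Iterable, Set

# A SHA256 digest is exactly 64 hexadecimal characters.
SHA256_RE = re.compile(r"^[0-9a-fA-F]{64}$")


def load_known_hashes(lines: Iterable[str]) -> Set[str]:
    """Collect the first SHA256-looking field of every non-comment CSV row."""
    hashes = set()
    data_lines = (l for l in lines if l.strip() and not l.startswith("#"))
    for row in csv.reader(data_lines, skipinitialspace=True):
        for field in row:
            value = field.strip()
            if SHA256_RE.match(value):
                hashes.add(value.lower())
                break  # one hash per row is enough
    return hashes


def sha256_of_bytes(data: bytes) -> str:
    """Return the hex SHA256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_known_malware(data: bytes, known: Set[str]) -> bool:
    """Offline lookup: no network access, only a set membership test."""
    return sha256_of_bytes(data) in known
```

With a file object, `load_known_hashes(open("malwarebazaar.csv", encoding="utf-8"))` would build the lookup set once, and every scanned file is then checked against it.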
The following parameters must be passed to the script
| Parameter | Description |
|---|---|
| --host2scan | Hostname for which the backups must be presented. |
| --repo2scan | Repository name for which the hosts and restore points are retrieved. Can be combined with --all. |
| --all | (optional) Scans the latest restore point of all valid hosts in the specified repository. Recommended to use with --iscsi for better performance. Supported platforms are VMware and Hyper-V. |
| --scan | Triggers the malware and threat detection scan by executing the scanner.py script after a restore point has been mounted. |
| --store | Collects the metadata for all relevant binary files by executing the store.py script after a restore point has been mounted. |
| --maxhosts | (optional) The maximum number of hosts to be scanned in parallel when using --all. (Default 1) |
| --workers | (optional) The number of workers to use for the scanning process. (Default 4) |
| --iscsi | (optional) Present the backups using iSCSI. Only NTFS, ext4, and xfs filesystems can be scanned. |
| --yaramode | (optional) YARA scan mode: off (default), all, suspicious (scans only files that show indicators of compromise), content (targets common document/text files to detect sensitive data patterns, e.g. PII, credentials), highentropy |
| --evtscan | (optional) Enables scanning of Windows Event Logs |
| --evtlogs | (optional) Comma-separated list of EVTX log files to scan |
| --days | (optional) Limit EVTX parsing to events within N days before the restore point timestamp |
| --regscan | (optional) Enables scanning of the Windows Registry 🔴 ENHANCED |
| --dryrun | (optional) Just shows the available restore points. **🔴 NEW V3.0** |
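For readers who want to wrap or test the command line, the documented flags and defaults can be mirrored in a small argparse sketch. This is an illustration built only from the parameter descriptions above, not the parser used in retro-hunter.py; the help texts are shortened.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented retro-hunter.py flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="retro-hunter.py")
    parser.add_argument("--host2scan", help="hostname whose backups are presented")
    parser.add_argument("--repo2scan", help="repository to take hosts from")
    parser.add_argument("--all", action="store_true",
                        help="scan latest restore point of all valid hosts")
    parser.add_argument("--scan", action="store_true", help="run scanner.py")
    parser.add_argument("--store", action="store_true", help="run store.py")
    parser.add_argument("--maxhosts", type=int, default=1)
    parser.add_argument("--workers", type=int, default=4)
    parser.add_argument("--iscsi", action="store_true")
    parser.add_argument("--yaramode", default="off",
                        choices=["off", "all", "suspicious", "content", "highentropy"])
    parser.add_argument("--evtscan", action="store_true")
    parser.add_argument("--evtlogs", help="comma-separated EVTX file names")
    parser.add_argument("--days", type=int, help="EVTX look-back window in days")
    parser.add_argument("--regscan", action="store_true")
    parser.add_argument("--dryrun", action="store_true")
    return parser


# Parse an explicit argument list instead of sys.argv to keep the sketch testable.
args = build_parser().parse_args(
    ["--repo2scan", "Repository 01", "--yaramode", "suspicious", "--iscsi", "--scan"]
)
```

Unspecified options fall back to the documented defaults, for example `args.workers` is 4 and `args.maxhosts` is 1.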
Some examples of how the script can be executed.
Scan host win-server-01. Restore points are presented using iSCSI.
sudo ./retro-hunter.py --host2scan win-server-01 --iscsi --scan
Scan the latest restore point of all supported hosts from Veeam Repository "Repository 01". Triggers a YARA scan when a suspicious file is found. Restore points are presented using iSCSI.
sudo ./retro-hunter.py --repo2scan "Repository 01" --yaramode suspicious --iscsi --scan
Specific folders are excluded from the scan process. You can adjust the list using the existing DEFAULT_EXCLUDES variable.
The retro-hunter.py script does not use all of the available scanner.py parameters. This can be adjusted if necessary.
- Mount Path to the mounted backup filesystem
Optional Parameters
| Parameter | Description |
|---|---|
| --Filetypes | (optional) Comma-separated list of file extensions (e.g., .exe,.dll) |
| --Maxsize | (optional) Maximum file size in MB |
| --Exclude | (optional) Experimental. Comma-separated list of directories to exclude (partial paths) |
| --CSV | (optional) Save results to this CSV file |
| --Verbose | (optional) Print all scanned files, not just matches |
| --Logfile | (optional) Path to logfile for matches (might get removed) |
| --yara | (optional) YARA scan mode (off, all, suspicious, content) |
This script scans the mounted file system and collects detailed metadata for selected files (DEFAULT_BINARY_EXTS list or --filetypes is used). It calculates a SHA-256 hash for each file and stores all data in a local PostgreSQL database. If the database does not exist, it is automatically created during the first run. The metadata stored includes file name, path, size, timestamps, extension, file type, and whether the file is executable. Each entry is tagged with a hostname, restore point ID, and timestamp, making later comparisons across backups possible. The script supports parallel processing and can speed up scanning using multiple CPU cores. Filters can be applied to limit which files are scanned: only specific file types (like .exe or .dll), a maximum file size, and folders to exclude. The script extracts key information for each file that matches the filters and calculates its SHA-256 hash. All collected data is inserted into the database, unless an entry with the same hostname and hash already exists.
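The walk, filter, hash, and de-duplicate flow described above can be sketched as follows. It is a simplified, single-process illustration: the function names and record fields are hypothetical, and an in-memory set of (hostname, hash) pairs stands in for the PostgreSQL uniqueness check.

```python
import hashlib
import os
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple


def sha256_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_metadata(mount: str, hostname: str, filetypes: Set[str],
                     maxsize_mb: Optional[float] = None,
                     exclude: Iterable[str] = ()) -> Iterator[Dict]:
    """Yield one metadata record per file that passes all filters."""
    skip = set(exclude)
    for root, dirs, files in os.walk(mount):
        # Prune excluded folder names in place so os.walk does not descend.
        dirs[:] = [d for d in dirs if d not in skip]
        for name in files:
            ext = os.path.splitext(name)[1].lower()
            if ext not in filetypes:
                continue
            path = os.path.join(root, name)
            info = os.stat(path)
            if maxsize_mb is not None and info.st_size > maxsize_mb * 1024 * 1024:
                continue
            yield {"hostname": hostname, "path": path, "name": name,
                   "extension": ext, "size": info.st_size,
                   "mtime": info.st_mtime, "sha256": sha256_file(path)}


def keep_new(records: Iterable[Dict], seen: Set[Tuple[str, str]]) -> List[Dict]:
    """Drop records whose (hostname, sha256) pair was already stored."""
    fresh = []
    for record in records:
        key = (record["hostname"], record["sha256"])
        if key not in seen:
            seen.add(key)
            fresh.append(record)
    return fresh
```

Because the hash is part of the de-duplication key, an unchanged file that appears in several restore points of the same host is stored only once in this sketch.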
The script detects files based on their extensions:
- Executables: .exe, .dll, .bin, .sh, etc.
- Scripts: .py, .js, .ps1, .bat
- Images: .jpg, .png, .gif, etc.
- Documents: .pdf, .docx, .txt, etc.
- Archives: .zip, .tar, .7z, etc.
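Extension-based detection of this kind boils down to a lookup table. The sketch below uses only the extensions named in the list above; the category labels and the `classify` helper are illustrative names, and the real script covers more extensions.

```python
import os

# Only the extensions listed in this README; the real lists are longer.
CATEGORIES = {
    "executable": {".exe", ".dll", ".bin", ".sh"},
    "script": {".py", ".js", ".ps1", ".bat"},
    "image": {".jpg", ".png", ".gif"},
    "document": {".pdf", ".docx", ".txt"},
    "archive": {".zip", ".tar", ".7z"},
}


def classify(filename: str) -> str:
    """Map a file name to a category by its (case-insensitive) extension."""
    ext = os.path.splitext(filename)[1].lower()
    for category, extensions in CATEGORIES.items():
        if ext in extensions:
            return category
    return "other"
```

Note that an extension says nothing about the actual content, which is why the high-entropy analysis additionally looks at file type signatures.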
Entropy is calculated using the Shannon entropy formula, based on the distribution of byte values in the file. The script reads the full content of each file and computes how evenly the byte values (0–255) are distributed, which reflects the randomness of the data. Highly random content is typical for encrypted, packed, or obfuscated files—techniques often used by malware to avoid detection. When such high-entropy files are found in directories like AppData, ProgramData, Temp, Public, Downloads, or Recycle.Bin, they are flagged as potentially suspicious. These locations are frequently abused by attackers to stage or hide payloads, since they are writable and often excluded from routine checks. By correlating entropy levels with file system paths, the dashboard view aims to surface files that deserve further analysis, even if they have not been flagged by signature-based scanners.
| Entropy Range | Description |
|---|---|
| 0.0 – 3.5 | Very low entropy – likely plain text, config files, or uncompressed data. |
| 3.5 – 6.5 | Medium entropy – common for standard executables, DLLs, scripts, etc. |
| 6.5 – 7.5 | Elevated entropy – may indicate mild compression or some obfuscation. |
| > 7.5 | High entropy – potentially packed, encrypted, or malicious (e.g., malware). |
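The Shannon entropy of a file is H = -sum(p_i * log2(p_i)) over the byte values that occur, which gives a value between 0 and 8 bits per byte. The sketch below computes it and maps the result to the ranges in the table. The band boundaries are treated as "lower bound inclusive", the directory list comes from the paragraph above, and the function names are illustrative rather than taken from store.py.

```python
import math
from collections import Counter

# Directory names mentioned above as typical staging locations.
SUSPICIOUS_DIRS = {"appdata", "programdata", "temp", "public", "downloads", "recycle.bin"}


def shannon_entropy(data: bytes) -> float:
    """Entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # p * log2(1/p) is the same as -p * log2(p) and avoids a negative zero.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_band(value: float) -> str:
    """Map an entropy value to the ranges of the table above."""
    if value < 3.5:
        return "very low"
    if value < 6.5:
        return "medium"
    if value < 7.5:
        return "elevated"
    return "high"


def is_suspicious(path: str, entropy: float, threshold: float = 7.5) -> bool:
    """High entropy combined with a directory that attackers often abuse."""
    parts = path.lower().replace("\\", "/").split("/")[:-1]  # drop file name
    in_watched_dir = any(part.lstrip("$") in SUSPICIOUS_DIRS for part in parts)
    return entropy >= threshold and in_watched_dir
```

For example, a file containing every byte value exactly once has an entropy of 8.0, while a file consisting of a single repeated byte has an entropy of 0.0.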
For executable files with high entropy (>= 7.5), store.py automatically extracts additional metadata, including file type signatures (Magic) and Portable Executable (PE) attributes to support deeper malware analysis. It helps detect packed malware, malware droppers with recent compilation dates, and potentially unwanted programs. The results are displayed in the dashboard. (Entropy >= 7.9 and PE Timestamp > 2024-06-15)
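To show what "PE timestamp" refers to, the following sketch reads the TimeDateStamp field of the COFF header with the standard library only. store.py is described as extracting PE attributes (python3-pefile is among the listed modules); the manual parsing and the `needs_review` helper here are a stand-in for illustration, using the dashboard thresholds quoted above.

```python
import struct
from datetime import datetime, timezone
from typing import Optional


def pe_compile_time(data: bytes) -> Optional[datetime]:
    """Return the COFF TimeDateStamp of a PE image, or None if not a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None  # no DOS header
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 12:
        return None  # truncated file
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return None
    # Signature (4) + Machine (2) + NumberOfSections (2) precede the timestamp.
    (stamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


def needs_review(entropy: float, compiled: Optional[datetime],
                 min_entropy: float = 7.9,
                 cutoff: datetime = datetime(2024, 6, 15, tzinfo=timezone.utc)) -> bool:
    """Dashboard-style filter: very high entropy and a recent compile date."""
    return entropy >= min_entropy and compiled is not None and compiled > cutoff
```

Keep in mind that the compile timestamp is set by the build toolchain and can be forged, so it is an indicator rather than proof.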
| Parameter | Description |
|---|---|
| --mount | (mandatory) Root directory to scan |
| --hostname | (mandatory) The name of the host to which this data belongs |
| --restorepoint-id | (mandatory) The Veeam restore point ID |
| --rp-timestamp | (mandatory) Timestamp of the restore point |
| --Filetypes | (optional) Comma-separated file extensions to scan |
| --workers | (optional) Number of parallel worker processes to use (default: half of CPU cores) |
| --maxsize | (optional) Max file size in MB to include (e.g., skip huge ISO files) |
| --exclude | (optional) Comma-separated list of folder names to skip |
| --db | (optional) SQLite DB path (default is file_index.db) |
The event-parser.py script extracts and analyzes security-related events from Windows event logs. It focuses on a defined set of Event IDs (Security & PowerShell Event Log) known to indicate potential threats, policy changes, or suspicious PowerShell usage.
- Windows Security Event IDs (high severity): 4618, 4649, 4719, 4765, 4766, 4794, 4897, 4964, 5124, 1102. The list is based on official Microsoft recommendations (Events to Monitor).
- PowerShell Event IDs: 800, 4104
- Sysmon Event IDs: 1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 22, 23, 25. The script was extended to also support Sysmon events, allowing host-based telemetry such as process, network, file, and registry activity to be collected when Sysmon is available.
To parse multiple log sources, the script can be invoked with:
--evtlogs Security.evtx,Microsoft-Windows-Sysmon%4Operational.evtx,Microsoft-Windows-PowerShell%4Operational.evtx
These events are parsed, stored in the database, and visualized in the dashboard with severity classifications (High / Medium to High).
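The selection logic (watched Event IDs plus the optional --days window before the restore point) can be sketched like this. The Event IDs are the ones listed above; the event dictionary layout (`channel`, `event_id`, `time`) and the function name are assumptions for illustration, and the EVTX parsing itself is left out.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Optional

# Event IDs as listed in this README, grouped by log source.
WATCHED = {
    "Security": {4618, 4649, 4719, 4765, 4766, 4794, 4897, 4964, 5124, 1102},
    "PowerShell": {800, 4104},
    "Sysmon": {1, 2, 3, 5, 6, 7, 8, 10, 11, 12, 13, 15, 22, 23, 25},
}


def select_events(events: Iterable[Dict], rp_timestamp: datetime,
                  days: Optional[int] = None) -> List[Dict]:
    """Keep watched events, optionally only those within `days` before the restore point."""
    earliest = rp_timestamp - timedelta(days=days) if days is not None else None
    selected = []
    for event in events:
        if event["event_id"] not in WATCHED.get(event["channel"], set()):
            continue  # not an Event ID of interest for this log
        if earliest is not None and event["time"] < earliest:
            continue  # older than the requested window
        selected.append(event)
    return selected
```

Limiting the window keeps repeated scans of consecutive restore points from re-processing the same old events.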
The registry-scan.py script scans offline Windows Registry hives for security-relevant registry keys and values. It focuses on known persistence mechanisms, execution hijacking techniques, forensic artefacts, and configuration changes commonly used by malware, attackers, or administrative tools.
The script parses the SYSTEM, SOFTWARE, and per-user NTUSER.DAT hives and matches keys against a predefined set of suspicious patterns (ASEPs, services, drivers, Winlogon, Run keys, Defender configuration, user activity, network artefacts).
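Matching key paths against a pattern set can be pictured as below. The three regular expressions cover well-known persistence locations (Run/RunOnce, Winlogon, services) and are only examples; the labels, the `match_key` helper, and the patterns themselves are not taken from registry-scan.py, and reading the hive files is omitted.

```python
import re
from typing import List

# Example patterns for well-known persistence locations (illustrative subset).
SUSPICIOUS_PATTERNS = {
    "run_key": re.compile(r"\\Microsoft\\Windows\\CurrentVersion\\Run(Once)?$", re.I),
    "winlogon": re.compile(r"\\Microsoft\\Windows NT\\CurrentVersion\\Winlogon$", re.I),
    "service": re.compile(r"\\ControlSet\d{3}\\Services\\[^\\]+$", re.I),
}


def match_key(key_path: str) -> List[str]:
    """Return the labels of all patterns that match a registry key path."""
    return [label for label, pattern in SUSPICIOUS_PATTERNS.items()
            if pattern.search(key_path)]
```

A key such as `SOFTWARE\Microsoft\Windows\CurrentVersion\Run` yields the label `run_key`, while an unrelated vendor key yields an empty list.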
Same script functionality and parameters as in this script, except that the scan-engines.json also includes the path to YARA (needs to be installed first). The nas-scanner.py script must be run separately from the retro-hunter.py script. All findings are stored in the PostgreSQL database and visualized in the Scans tab of the Streamlit dashboard.
Simply run get-malware-csv.py
This script removes outdated entries from key PostgreSQL tables in your Retro Hunter environment:
- files
- scan_findings
- win_events
- registry **🔴 NEW**
- nas_scan_findings (to be added)
| Option | Description |
|---|---|
| --days N | Delete entries older than N days (based on restore point timestamp fields) |
| --dry-run | Only show what would be deleted, nothing is removed |
| --clean-only | Only delete entries that are not flagged as malware/YARA/LOLBAS hits |
| --host HOSTNAME | Restrict cleanup to a specific hostname |
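How these options could combine into a single statement per table is sketched below. The table names come from the list above, but the column names (`rp_timestamp`, `hostname`, `flagged`) and the `build_cleanup` helper are hypothetical; the actual schema and queries in db-cleaner.py may differ. The `%s` placeholders follow the psycopg2 parameter style.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

# Tables named in this README; used as an allow-list for the table name.
TABLES = ("files", "scan_findings", "win_events", "registry")


def build_cleanup(table: str, days: int, now: datetime,
                  host: Optional[str] = None, clean_only: bool = False,
                  dry_run: bool = False) -> Tuple[str, List]:
    """Build a parameterized cleanup statement (column names are assumptions)."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table}")
    cutoff = now - timedelta(days=days)
    where = ["rp_timestamp < %s"]
    params: List = [cutoff]
    if host:
        where.append("hostname = %s")
        params.append(host)
    if clean_only:
        where.append("flagged = FALSE")  # keep malware/YARA/LOLBAS hits
    # A dry run only counts the rows that would be removed.
    action = "SELECT COUNT(*)" if dry_run else "DELETE"
    return f"{action} FROM {table} WHERE " + " AND ".join(where), params
```

The values are passed as parameters rather than formatted into the SQL string, and the table name is checked against the allow-list because identifiers cannot be parameterized.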
Dry-run: preview deletions older than 60 days
./db-cleaner.py --days 60 --dry-run
Fully delete old non-malicious data
./db-cleaner.py --days 90 --clean-only
Cleanup only for a specific host
./db-cleaner.py --days 30 --host WIN-VM1
- Many ideas. Stay tuned
- Mark the scanned restore point as infected in Veeam Backup & Replication.
- And a few other nice things that I'm currently researching.
- How about an ISO file with the OS and the setup script? (There is one ready to be used)
- The scripts have been created and tested on Ubuntu 24.04 and Veeam Backup & Replication 12.3.1 and 12.3.2 and 13.0.1. The retro-hunter.py script uses REST API revision 1.3-rev0.
- Only NTFS filesystems can be scanned when presenting the restore points using iSCSI
- When mounting NTFS disks, it’s important to know that Ubuntu (from version 24.04 and newer) uses the built-in ntfs3 kernel driver, which provides better performance and more stable access. In contrast, Rocky Linux and other RHEL-based systems usually rely on the older ntfs-3g driver through FUSE, which is slower because it runs in user space. This means that the way NTFS is handled can vary depending on the system. It is technically possible to upgrade Rocky Linux to a newer Kernel (5.15 or higher) to support the native ntfs3 driver. Mounting NTFS volumes works well when using the -t ntfs parameter, especially with iSCSI attached disks. FUSE is not working and there are currently no efforts to conduct further research in this area.
- 3.0.1 (February 28 2026)
- Security hardening
- Performance & stability improvements in all main scripts (retro-hunter, scanner, regscan, evtscan)
- Other preparations for official "launch"
- 3.0 (January 2026)
- Remove Streamlit
- Frontend container for the Web Portal, incl. user management (admin & viewer role)
- Backend container for database queries
- Nginx container for frontend access
- Setup.sh updated
- Update.sh to update the Malwarebazaar table & the code
- 2.4 (January 5 2026)
- Event Parser: Added support for parsing Sysmon event logs
- Streamlit Dashboard fixes and Sysmon event listings
- 2.3 (November 18 2025)
- Windows Registry Scanner enhancements
- NAS AV Scanner incl. results in the Scans tab
- 2.2 (August 13 2025)
- Streamlit Dashboard cleanup. No error messages when a table does not exist (script did not run)
- Streamlit Dashboard now has tabs for a better UI experience (tested on Streamlit 1.48.0)
- DB clean-up script (db-cleaner.py)
- registry-scan.py script improvements. Added additional keys
- event-parser.py script now scans for extended event ids (EXTENDED_EVENT_IDS - Optional)
- registry-analyzer.py script for analyzing Windows Registry entries.
- 2.1 (August 5 2025)
- Windows Registry Scanner
- 2.0 (July 18 2025)
- PostgreSQL as a database.
- 1.3 (July 14 2025)
- The new YARA highentropy mode in scanner.py starts a scan using the stored YARA rules
- The store.py script extracts file type signatures (Magic) and Portable Executable metadata for high-entropy executable files
- Store.py script optimizations
- Streamlit dashboard update to show the High-Entropy Executable files
- YARA rule generator in the Streamlit dashboard on High-Entropy executable files
- 1.2 (July 8 2025)
- The scanner.py now also saves the SHA256 value for found LOLBAS files
- Streamlit Dashboard date filter is now applied to all tables showing the restore point date
- Streamlit Dashboard bugfixes (empty tables)
- The store.py script now performs an entropy analysis (Shannon formula)
- setup.sh script bugfixes
- 1.1 (June 26 2025)
- repo2scan now supports Scale-Out Backup Repositories
- Store specific Windows Event Log entries (Security & PowerShell Event log first)
- Dashboard Update
- 1.0 (June 2025)
- Initial version
This script is not officially supported by Veeam Software. Use it at your own risk.
Made with ❤️, fueled by 🍺, and powered by the Veeam Data Integration API. Inspired by real-world needs, supported by a bit of artificial intelligence for hardening the scripts.
