Egyptian Customs Scraper Toolkit

A public, portfolio-friendly Python scraping project for collecting and structuring publicly accessible Egyptian Customs tariff and legislation pages from customs.gov.eg.

This repository is intentionally code-first: it includes scraper source, helper utilities, lightweight checks, and a tiny sanitized sample output. Full scraped datasets, generated JSON exports, logs, archives, browser caches, and local settings are excluded.

Features

Scrape Egyptian Customs tariff / HS code pages by chapter.
Scrape legislation and circular listings from public Egyptian Customs pages.
Extract and enrich tariff details with Playwright-powered browser automation.
Provide helper scripts for AJAX inspection, pagination checks, ID extraction, and PDF/HTML matching workflows.
Include lightweight test and debugging scripts for validating scraper structure and page behavior.
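The ID extraction helper mentioned above could look roughly like the sketch below. It is not the repository's implementation: the class name LinkIdExtractor, the function extract_ids, and the query parameter name "id" are all hypothetical, and the real pages may expose identifiers differently. It uses only the standard library and collects one query parameter from every link on a page.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlparse


class LinkIdExtractor(HTMLParser):
    """Collect the value of one query parameter from every <a href> on a page."""

    def __init__(self, param="id"):
        super().__init__()
        self.param = param  # hypothetical parameter name, adjust to the real page
        self.ids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Pull the requested parameter out of the link's query string.
        for value in parse_qs(urlparse(href).query).get(self.param, []):
            if value not in self.ids:  # keep first-seen order, drop repeats
                self.ids.append(value)


def extract_ids(html, param="id"):
    """Return the unique parameter values found in the links of an HTML string."""
    parser = LinkIdExtractor(param)
    parser.feed(html)
    parser.close()
    return parser.ids
```

Calling extract_ids(page_html) on a downloaded listing page would return the identifiers in the order they first appear, which keeps later detail requests deterministic.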

Project Structure

src/           Scraper and helper source files
tests/         Lightweight checks and test scripts
sample_data/   Tiny sanitized example output
web/           Reserved for optional public-safe demos

Setup

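Create and activate a virtual environment (the activation command below is for Windows PowerShell; on macOS or Linux, run source .venv/bin/activate instead):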

python -m venv .venv
.\.venv\Scripts\Activate.ps1

Install Python dependencies:

pip install -r requirements.txt

Install Playwright browser binaries:

playwright install

Example Usage

Run the scrapers from the repository root:

python .\src\scrape_all_chapters.py
python .\src\scrape_customs.py
python .\src\scrape_legislations.py

Some scripts may create local output such as JSON files, per-chapter data folders, or logs. These outputs are ignored by Git and are not part of the public repository.

Output

Generated scraper outputs are local-only by default. The repository includes only sample_data/sample_output.json, a tiny sanitized sample that demonstrates the expected shape of output records without publishing the full scraped dataset.
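To check that locally generated output has the same shape as the sample, a small comparison like the following may help. This is a sketch under assumptions: it treats the file as a JSON array of objects (or a single object), which the README does not specify, and the helper names load_records and shape_mismatches are illustrative rather than part of the repository.

```python
import json
from pathlib import Path


def load_records(path):
    """Load a JSON file and return its records as a list."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumption: the file holds either one object or a list of objects.
    return data if isinstance(data, list) else [data]


def shape_mismatches(records):
    """Return (index, missing_keys, extra_keys) for records whose keys differ from the first."""
    if not records:
        return []
    expected = set(records[0])
    problems = []
    for index, record in enumerate(records[1:], start=1):
        keys = set(record)
        if keys != expected:
            problems.append((index, sorted(expected - keys), sorted(keys - expected)))
    return problems
```

For example, shape_mismatches(load_records("sample_data/sample_output.json")) returns an empty list when every record shares the first record's keys.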

Source Attribution

The scraper references publicly accessible pages from the Egyptian Customs website:

https://customs.gov.eg

Legal Disclaimer

This is an unofficial educational and research project intended as a data-engineering and web-scraping portfolio showcase. It is not an official data source, government service, or commercial customs platform.

The scraper utilities interact with publicly accessible pages of the Egyptian Customs website: https://customs.gov.eg

This repository intentionally excludes:

full scraped datasets
generated exports
logs
archives
cached data

Only source code, helper utilities, lightweight tests, and minimal example outputs are included.

Users are solely responsible for ensuring compliance with all applicable laws, website terms of use, robots.txt policies, rate limits, and data usage requirements. Please use respectful request rates and avoid disrupting public services or infrastructure.
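One way to honor robots.txt rules and keep request rates respectful is sketched below. It is an illustration, not code from this repository: build_robots and Throttle are hypothetical names, the user agent string and the delay value are placeholders, and the robots.txt text is assumed to have been downloaded already. It relies only on the standard library's urllib.robotparser.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._last = None

    def wait(self):
        """Sleep if the previous request was too recent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

A scraper loop would call robots.can_fetch("my-bot", url) before each request and throttle.wait() immediately before sending it, skipping any URL the rules disallow.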

Non-Affiliation Statement

This project is not affiliated with, endorsed by, sponsored by, or officially connected to the Egyptian Customs Authority or any government entity.

No government logos, official branding, or complete scraped databases are included in this repository.

License

Released under the MIT License. See the LICENSE file for details.
