Releases: williajm/forgery
v0.4.0
Highlights
Package Registry Providers
Cross-ecosystem fake data for seeding PyPI, npm, Maven, Cargo, and RubyGems test databases. 22 method pairs (44 Python-visible methods).
- Cross-ecosystem primitives: commit_sha() / short_commit_sha(), semver() / semver_prerelease(), calver(), spdx_license() (50 common IDs), git_username() (GitHub/GitLab/Bitbucket rules).
- Ecosystem-specific versions: pypi_version() (PEP 440 with pre/post/dev releases), maven_version() (with qualifiers like -SNAPSHOT, .RELEASE, .Final, -RC1).
- Version constraints: pypi_version_specifier() (PEP 440), npm_version_range(), cargo_version_req(), maven_version_range(), gem_version_requirement().
- Package identity: pypi_package_name() (PEP 503 normalised), npm_package_name() (plain or @scope/pkg), cargo_package_name(), gem_name(), maven_group_id(), maven_artifact_id(), maven_coordinate() (GAV).
- Full requirement line: pypi_requirement() (e.g. requests>=2.0.0,<3.0.0).
Nine of the batch methods accept unique=True for duplicate-free output; see the README's Package Registry Data section for full details.
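When seeding a test database it can help to sanity-check values against the formats named above. The sketch below is stdlib-only and does not call forgery; the helper names (normalise_pypi_name, is_semver) are hypothetical, the sample inputs are hand-written rather than generated, and the SemVer pattern is a simplified one that does not enforce every rule of the full spec (for example, it allows leading zeros in numeric pre-release identifiers).

```python
import re

# Simplified SemVer 2.0.0 pattern: MAJOR.MINOR.PATCH with optional
# "-prerelease" and "+build" parts. Leading zeros are rejected only in
# the three core numbers (an assumption of this sketch, not the full spec).
SEMVER_RE = re.compile(
    r"^(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
    r"(?:\+([0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?$"
)


def normalise_pypi_name(name: str) -> str:
    """Apply PEP 503 normalisation: runs of '-', '_' and '.' become one '-', then lowercase."""
    return re.sub(r"[-_.]+", "-", name).lower()


def is_pep503_normalised(name: str) -> bool:
    """True if the name is already in its PEP 503 normalised form."""
    return name == normalise_pypi_name(name)


def is_semver(version: str) -> bool:
    """True if the string matches the simplified SemVer pattern above."""
    return SEMVER_RE.match(version) is not None
```

A check like is_pep503_normalised(name) could be run over a batch of generated package names before inserting them into a fixture table.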
Parallel Generation
Opt-in multi-threaded batch generation via Rayon. Enable it with set_parallel(enabled, num_threads=None); roughly 3.3× speedup at 100K+ items. Output is deterministic for a given seed and thread count.
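One common way a parallel generator ends up deterministic only for a fixed seed and thread count is to give each worker its own RNG derived from the seed and the worker index. The sketch below illustrates that idea with the Python stdlib; it is an assumption about the general technique, not forgery's Rayon implementation, and generate and _chunk are hypothetical names.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def _chunk(seed: int, worker: int, count: int) -> list[int]:
    # Hypothetical scheme: each worker gets an independent RNG whose seed
    # is derived from the global seed and the worker index.
    rng = random.Random(seed * 1_000_003 + worker)
    return [rng.randrange(1_000_000) for _ in range(count)]


def generate(seed: int, n: int, num_threads: int) -> list[int]:
    """Generate n values split across num_threads workers, in a stable order."""
    # Split n as evenly as possible; the first `extra` workers take one more item.
    base, extra = divmod(n, num_threads)
    sizes = [base + (1 if i < extra else 0) for i in range(num_threads)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map yields results in submission order, so the concatenated
        # output does not depend on which worker finishes first.
        parts = pool.map(_chunk, [seed] * num_threads, range(num_threads), sizes)
        return [value for part in parts for value in part]
```

Under this scheme, changing the thread count changes the chunk boundaries and therefore the output, which is why a reproducible run needs both values pinned.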
Streaming file writer
records_to_file(path, n, schema, ...) writes records in chunks, keeping peak memory bounded by chunk_size regardless of n. The format is auto-detected from the file extension: CSV, NDJSON, SQL, or Parquet. Includes estimate_memory() and an optional progress callback.
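The chunking idea can be sketched in plain Python: buffer at most chunk_size serialised records, flush them, and report progress after each flush. This is an illustration of the technique only; write_ndjson_chunked, fake_records, and the progress(done, total) callback shape are hypothetical and are not forgery's API.

```python
import json
from typing import Callable, Iterator, Optional


def fake_records(n: int) -> Iterator[dict]:
    # Stand-in record source for this sketch; yields one record at a time.
    for i in range(n):
        yield {"id": i, "name": f"user{i}"}


def write_ndjson_chunked(
    path: str,
    n: int,
    chunk_size: int = 10_000,
    progress: Optional[Callable[[int, int], None]] = None,
) -> int:
    """Write n records as NDJSON, holding at most chunk_size lines in memory."""
    written = 0
    chunk: list[str] = []

    def flush(fh) -> None:
        nonlocal written
        fh.write("\n".join(chunk) + "\n")
        written += len(chunk)
        chunk.clear()
        if progress is not None:
            progress(written, n)

    with open(path, "w", encoding="utf-8") as fh:
        for record in fake_records(n):
            chunk.append(json.dumps(record))
            if len(chunk) == chunk_size:
                flush(fh)
        if chunk:  # final partial chunk
            flush(fh)
    return written
```

Because the buffer is cleared after every flush, peak memory scales with chunk_size rather than with n.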
Serialised output formats
records_csv(), records_json(), records_ndjson(), records_parquet(), and records_sql() serialise directly in Rust, skipping Python object materialisation.
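For comparison, the pure-Python path that these functions avoid looks roughly like the sketch below: every record is first materialised as a dict and then serialised. The rows are hand-written sample data and to_csv / to_ndjson are hypothetical helpers, not part of forgery.

```python
import csv
import io
import json

# Hand-written sample rows; in the pure-Python path every record
# exists as a dict before it is serialised.
rows = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace, Jr."}]


def to_csv(records: list[dict]) -> str:
    """Serialise dicts to CSV with a header row taken from the first record."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_ndjson(records: list[dict]) -> str:
    """Serialise dicts to newline-delimited JSON, one object per line."""
    return "".join(json.dumps(record) + "\n" for record in records)
```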
Install
pip install forgery==0.4.0