Add CI/CD pipeline with GitHub Actions #1
- introduce `AllWordsTranslatedError` and reuse its message
- handle the custom type in `translate` and `--count` code paths
- update the CLI test double to raise the new error class
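A minimal sketch of how a dedicated exception class with a reusable message might look. Only the name `AllWordsTranslatedError` comes from the commit; the message text, `get_pending_words()`, `describe_pending()`, and the row layout are hypothetical and used purely for illustration.

```python
from typing import Optional


class AllWordsTranslatedError(Exception):
    """Raised when no word in the vocabulary is left to translate."""

    # Hypothetical wording; the project's actual message may differ.
    DEFAULT_MESSAGE = "All the words in the vocabulary list already have translations."

    def __init__(self, message: Optional[str] = None) -> None:
        super().__init__(message or self.DEFAULT_MESSAGE)


def get_pending_words(rows):
    """Return the words without a translation, or raise if there are none."""
    pending = [row["word"] for row in rows if not row.get("translation")]
    if not pending:
        raise AllWordsTranslatedError()
    return pending


def describe_pending(rows):
    """Both code paths can catch the same type and reuse its message."""
    try:
        return f"{len(get_pending_words(rows))} word(s) to translate"
    except AllWordsTranslatedError as error:
        return str(error)
```

Catching one specific type, rather than a generic `Exception`, lets callers tell "nothing left to do" apart from real failures.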
Introduce `get_pair_file_paths()` and `backup_language_pair_files()` helper functions to simplify file path resolution and backup operations for language pairs.
Introduce `calculate_vocabulary_stats()` to compute total, translated, and pending word counts for a vocabulary file.
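The counting described above could be sketched roughly as follows. This is an assumption-laden illustration, not the project's code: the real function's signature is not shown in the commit, so this version takes an open text stream, and it assumes a `translation` column as referenced elsewhere in this PR.

```python
import csv
import io
from typing import TextIO


def calculate_vocabulary_stats(stream: TextIO) -> dict[str, int]:
    """Count total, translated, and pending rows in a vocabulary CSV stream."""
    total = 0
    translated = 0
    for row in csv.DictReader(stream):
        total += 1
        # A row counts as translated when its translation cell is non-empty.
        if (row.get("translation") or "").strip():
            translated += 1
    return {"total": total, "translated": translated, "pending": total - translated}


# Usage with an in-memory file standing in for a vocabulary file on disk.
sample = io.StringIO("word,translation\nchat,cat\nchien,\n")
stats = calculate_vocabulary_stats(sample)
```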
Introduce `rename_language_pair()` in config_handler to enable renaming existing language pairs while preserving default pair associations. Include tests for successful rename, validation of existing pairs, and prevention of duplicate pairs.
Introduce `AliasedGroup` class to enable command aliases (e.g., 'pair' → 'pairs') while hiding aliases from help output.
Add `pairs` command group with subcommands: list, add, default, set-default, remove, rename, inspect. Remove deprecated commands (setup, default, show, config default, config remove). Extract `create_language_pair_interactively()` helper from old setup command. Add helper functions for pair selection, parsing, and management. Update all related tests to reflect new command structure.
Update README.md to reflect new `pairs` command group and enhanced command options. Add documentation for `pairs list`, `pairs set-default`, `pairs remove`, `pairs rename`, and `pairs inspect` commands.
Use single quotes for test string literal in test_backup_occurs_before_chat_request to maintain consistency with project coding style.
The test was incorrectly expecting that canceling the pairs add operation would save the entered values. This contradicts the actual behavior where cancellation prevents any state changes. Add initial state setup and verify that the original state is preserved when the user cancels adding a new language pair.
* Add `pyproject.toml` with modern build system configuration
* Use setuptools>=61.0 as build backend
* Migrate all metadata from `setup.py`
* Add dependency version constraints to prevent vulnerable updates:
  * `click ~= 8.3` (compatible release: `>= 8.3, < 9.0`)
  * `openai >= 0.28, <1.0` (preserve <1.0 constraint)
  * `tiktoken ~= 0.12` (compatible release: `>= 0.12, < 1.0`)
* Add dev dependencies under `[project.optional-dependencies]`
* Include comprehensive PyPI classifiers and project URLs
* Remove deprecated packaging files:
  * Delete `setup.py`
  * Delete `requirements.txt`
  * Delete `requirements-dev.txt`
* Add `uv.lock` to `.gitignore` (package, not application)
This modernizes the build system per PEP 517/518 and simplifies
dependency management. Version constraints use compatible release
(`~=`) to allow security patches while preventing breaking changes.
All tests pass (111 passed).
* Update `gpt_integration.py`
  * Change pricing from per-1k to per-1M tokens
  * Add support for `gpt-4.1`, `gpt-4o`, `o1`, `o3`, `gpt-5` models
  * Add fallback to `cl100k_base` encoding for token counting
  * Change `estimate_prompt_cost()` to accept a `model` parameter and return a single value
* Update `cli.py`
  * Fix `compute_prompt_estimate()` to use `gpt-4.1` instead of `gpt-3.5-turbo`
  * Improve output colors (yellow for values, blue for model)
  * Update wording for better clarity
* Update test suite
  * Fix test mocks to return a single cost string
  * Update assertions to match new output format
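To make the per-1M pricing arithmetic concrete, here is a rough sketch. The model names and prices in the table are placeholders, not the project's real pricing, and the signature (token count plus `model`) is an assumption; the commit only states that the function accepts a `model` parameter and returns a single value.

```python
from decimal import Decimal

# Placeholder prices in USD per 1M input tokens (hypothetical values).
PRICE_PER_MILLION_TOKENS = {
    "example-large": Decimal("2.00"),
    "example-small": Decimal("0.50"),
}


def estimate_prompt_cost(token_count: int, model: str) -> str:
    """Return the estimated prompt cost as a single formatted string."""
    price = PRICE_PER_MILLION_TOKENS[model]
    # Per-1M pricing: cost = tokens * price / 1_000_000.
    cost = price * token_count / Decimal(1_000_000)
    return f"{cost:.6f}"


cost = estimate_prompt_cost(1500, "example-large")  # 1500 * 2.00 / 1_000_000
```

`Decimal` is used here to avoid floating-point noise in very small amounts; a float-based version would work similarly.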
* Use whitelist approach with character set validation (simple, Pythonic, secure)
* Only allow letters, numbers, underscores, and hyphens
* Replace regex with `set(string.ascii_letters + string.digits + '_-')`
* User-friendly error messages (no scary security jargon)
* Validate language names at all entry points (CLI, config, utils)
* Add comprehensive security tests in `test_path_traversal_prevention.py`

This prevents attackers from reading/writing files outside the intended data directory by using malicious language pair names like `../../../etc/passwd`. The whitelist blocks ALL dangerous characters that could be used for path traversal: `.`, `/`, `\`, and any special characters.

Fixes: PERSO-186
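The whitelist check described above can be sketched as follows. The character set expression is taken from the commit message; the function name `validate_language_name()` and the error wording are assumptions made for this example.

```python
import string

# Character set quoted in the commit: ASCII letters, digits, underscore, hyphen.
ALLOWED_CHARACTERS = set(string.ascii_letters + string.digits + "_-")


def validate_language_name(name: str) -> str:
    """Return the name unchanged if every character is whitelisted."""
    # Hypothetical helper: rejects empty names and any character outside the set,
    # which covers '.', '/', and '\\' used in path traversal attempts.
    if not name or not set(name) <= ALLOWED_CHARACTERS:
        raise ValueError(
            f"Invalid language name {name!r}: "
            "use only letters, numbers, underscores, and hyphens."
        )
    return name
```

A set-subset test keeps the rule readable: anything not explicitly allowed is rejected, so there is no list of "bad" characters to keep up to date.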
* Add `is_same_language_pair()` to detect matching languages
* Add `get_pair_mode()` to return "definition" or "translation" mode
* Use case-insensitive comparison via `casefold()`

These utilities enable the tool to determine whether a language pair should operate in translation mode (different languages) or definition mode (same language).
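A minimal sketch of the two helpers, assuming each takes the two language names as strings. The function names and the `casefold()` comparison come from the commit; the parameter names are hypothetical.

```python
def is_same_language_pair(language_to_learn: str, mother_tongue: str) -> bool:
    """Compare the two language names case-insensitively."""
    return language_to_learn.casefold() == mother_tongue.casefold()


def get_pair_mode(language_to_learn: str, mother_tongue: str) -> str:
    """Return "definition" for same-language pairs, "translation" otherwise."""
    if is_same_language_pair(language_to_learn, mother_tongue):
        return "definition"
    return "translation"
```

`casefold()` is more aggressive than `lower()` for some scripts (for example, German "ß" folds to "ss"), which makes it a reasonable default for user-typed names.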
* Update prompt system to support definition mode
* Update Anki deck naming for definition mode
* Test `is_same_language_pair()` with exact matches and case variations
* Test `is_same_language_pair()` correctly identifies different languages
* Test `get_pair_mode()` returns "definition" for same-language pairs
* Test `get_pair_mode()` returns "translation" for different-language pairs
* Verify case-insensitive matching works correctly

All tests pass, confirming the mode detection logic works as expected.
Add `test_gpt_integration.py` with 5 tests for `format_prompt()` covering translation mode, definition mode, default behavior, multiple words handling, and system message consistency. Extend `test_csv_handler.py` with 2 tests for same-language pairs verifying definition mode deck naming and case-insensitive behavior. All 34 tests pass.
Create `test_same_language_integration.py` with 6 comprehensive integration tests:

* Test full workflow with same-language pairs generates definitions
* Verify Anki output uses "definitions" deck name for same-language pairs
* Test case-insensitive language matching (English:ENGLISH)
* Verify different-language pairs still work with translation mode
* Confirm Anki output uses "vocabulary" deck name for translation mode
* Test prompt generation correctly detects and applies mode

All 129 tests pass.
Add documentation for the new same-language pair feature:

* Add feature bullet in Features section explaining definition mode
* Add subsection "Definition mode for same-language pairs" with usage example
* Update Table of Contents to include new subsection
* Explain how same-language pairs work (concise definitions, target language examples, Anki deck naming)
Add `ensure_csv_has_fieldnames()` call to `get_words_to_translate()` to prevent `KeyError` when CSV files have incorrect headers (e.g., 'translate' instead of 'translation'). Ensures all vocabulary files have correct fieldnames before reading.
Remove "concise" and "(2-3 words)" constraints from definition mode prompt:

* Change "Provide concise definitions" to "Provide definitions"
* Remove "(2-3 words)" length specification
* Update format example to use "definition" instead of "concise definition"

This allows the LLM to generate more detailed, comprehensive definitions when appropriate.
* Add atomic write pattern to prevent config file corruption
* Add CSV sanitization to prevent formula injection attacks
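One common way to implement an atomic write is to write to a temporary file in the same directory and then swap it into place with `os.replace()`. The sketch below assumes a JSON config file; `write_json_atomically()` is a hypothetical name, and the project's actual implementation may differ.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, data: dict) -> None:
    """Write JSON so readers see either the old file or the complete new one."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file lives next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # atomic swap into place
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

If the process dies mid-write, only the temporary file is affected and the existing config stays intact.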
* Add directory path validation to prevent system directory access
* Add word input validation to prevent CSV corruption and injection
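For the CSV injection side of these hardening commits, a widely used mitigation is to neutralize cells that start with a formula trigger character by prefixing them with a single quote. The sketch below assumes that approach; `sanitize_csv_cell()` and `prepare_word()` are hypothetical names, and the exact rules the project applies are not shown in the commits.

```python
# Leading characters that spreadsheet applications may interpret as a formula.
FORMULA_PREFIXES = ("=", "+", "-", "@", "\t", "\r")


def sanitize_csv_cell(value: str) -> str:
    """Prefix risky cells with a single quote so they are treated as text."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def prepare_word(word: str) -> str:
    """Validate a user-supplied word, then sanitize it for CSV output."""
    cleaned = word.strip()
    if not cleaned:
        raise ValueError("The word cannot be empty.")
    # Assumed rule: reject line breaks and NUL, which could break CSV rows.
    if any(ch in "\r\n\x00" for ch in cleaned):
        raise ValueError("The word cannot contain line breaks or control characters.")
    return sanitize_csv_cell(cleaned)
```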
* Reformat code with `ruff format`
* Apply automatic fixes with `ruff check --fix`
8bbecb6 to 472f613
The test was missing the `fake_home` fixture, causing it to write to the real user config file at `~/.config/vocabmaster/config.json` instead of an isolated test location.

* Add `fake_home` fixture parameter to test signature
* Add explicit config file path monkeypatch for clarity
* Tests are now properly isolated and don't touch user config

The bug was introduced in 8652183 when the path traversal tests were added.
* Remove `uv.lock` from `.gitignore`
* Add `uv.lock` file to repository

The migration commit (69a0ca2) incorrectly added `uv.lock` to `.gitignore` with the reasoning "(package, not application)". However, per uv's official documentation, the lockfile should be committed for all projects, both libraries and applications, to ensure reproducible environments across development and deployment.

References:

* https://docs.astral.sh/uv/concepts/projects/layout/#the-lockfile
* https://docs.astral.sh/uv/guides/projects/#uvlock
* Add `pytest-cov` for code coverage reporting in CI
* Add `ruff` for fast, comprehensive linting and formatting
* Configure ruff with 100-character line length
* Enable isort-style import sorting (I001)

These dependencies support the new CI/CD pipeline workflows.
* **Lint job**: Run `ruff` linter and formatter with GitHub annotations
* **Test job**: Matrix testing across Python 3.10-3.12 and Ubuntu/macOS/Windows
* **Security job**: Run `pip-audit` for vulnerability scanning
* **Build job**: Build distribution packages with metadata validation
* **Install test job**: Verify package installation across all platforms
* **All checks pass job**: Aggregate status check using alls-green

Key features:

* Concurrency control to cancel outdated runs
* Optimized caching strategy for `uv` dependencies
* Coverage reporting to Codecov
* JUnit XML test results and HTML coverage reports
* SLSA Level 3 build provenance attestation
* Retention policies for artifacts (30-90 days)

The workflow runs on push to `main`, pull requests, and manual dispatch.
* **Validate tag job**: Verify semver format and match with `pyproject.toml`
* **Test job**: Run full test suite before releasing
* **Build job**: Create distribution packages with hash generation
* **Generate provenance job**: SLSA Level 3 supply chain security
* **Publish PyPI job**: Automated publishing with trusted publishing (OIDC)
* **Create release job**: GitHub release with auto-generated changelog
* **Announce job**: Success summary with links
* **Rollback job**: Failure handling and recovery instructions

Key features:

* Semver validation (v*.*.*) with `pyproject.toml` version matching
* Manual dispatch support for flexible releases
* Grouped changelog by feature/fix/other
* Prerelease detection for alpha/beta/rc tags
* SLSA provenance attached to releases for verification
* Comprehensive error handling and rollback guidance

Triggered on tags matching `v*.*.*` or manual workflow dispatch.
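The tag validation step could be approximated in Python as shown below. This is only a sketch: the workflow's actual logic is not shown here, the accepted prerelease syntax (`-alpha`, `-beta`, `-rc` with an optional numeric suffix) is an assumption, and `check_release_tag()` is a hypothetical helper that compares the tag literally against the version string read from `pyproject.toml`.

```python
import re

# Assumed tag shape: vMAJOR.MINOR.PATCH with an optional -alpha/-beta/-rc[.N] suffix.
TAG_PATTERN = re.compile(r"v(\d+\.\d+\.\d+)(-(alpha|beta|rc)(\.\d+)?)?")


def check_release_tag(tag: str, project_version: str) -> bool:
    """Validate the tag, check it against the project version, report prerelease status."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        raise ValueError(f"{tag!r} is not a valid release tag (expected vMAJOR.MINOR.PATCH).")
    tag_version = tag[1:]  # drop the leading "v"
    if tag_version != project_version:
        raise ValueError(
            f"Tag version {tag_version!r} does not match project version {project_version!r}."
        )
    # True when an alpha/beta/rc suffix is present.
    return match.group(2) is not None
```

Failing fast on a mismatched tag keeps a mistyped version from reaching the publish step.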
* Check for outdated dependencies using `uv pip list --outdated`
* Create or update GitHub issue when outdated packages are found
* Run every Monday at 9:00 UTC
* Manual dispatch available
* Smart issue management (avoid duplicates)

The workflow automatically creates issues labeled `dependencies` and `automated` for tracking updates.
* Overview of all workflows (CI, release, dependencies)
* Setup instructions for PyPI trusted publishing
* Manual trigger examples using `gh` CLI
* Release process documentation
* Status badge snippets for README
* Local testing commands
* Troubleshooting guide

This documentation helps maintainers understand and operate the CI/CD pipeline effectively.
Add security workflow with three scanning jobs:

* **Dependency scan**: Check for vulnerable dependencies with `pip-audit` and `safety`
* **Secret scan**: Detect leaked credentials using Gitleaks
* **CodeQL analysis**: Semantic security vulnerability detection

Runs on push/PR to main, daily at 2:00 UTC, and supports manual dispatch. Standalone approach provides better security coverage than the integrated job in `ci.yml`.
Remove the integrated security audit job from `ci.yml` since security checks are now handled by the standalone `security.yml` workflow. This separates security concerns from the main CI pipeline.

* Remove `security` job (dependency scan with `pip-audit`)
* Update `build` and `all-checks-pass` jobs to remove `security` from `needs` arrays
Update `.github/workflows/README.md` to document the new standalone security workflow:

* Add section 4 describing `security.yml` workflow and its three scanning jobs
* Add `security.yml` to manual workflow triggers
* Update troubleshooting section to reference the standalone security workflow
* Replace "Security Audit Issues" with "Security Scan Issues"
472f613 to f9117cb
This pull request sets up GitHub code scanning for this repository. Once the scans have completed and the checks have passed, the analysis results for this pull request branch will appear on this overview. Once you merge this pull request, the 'Security' tab will show more code scanning analysis results (for example, for the default branch). Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results. For more information about GitHub code scanning, check out the documentation.
Summary
This PR introduces a comprehensive CI/CD pipeline using GitHub Actions.
Changes
1. Development Dependencies (`pyproject.toml`)
* `pytest-cov ~= 6.0` for code coverage
* `ruff ~= 0.14` for linting and formatting
2. CI Workflow (`.github/workflows/ci.yml`)
Features:
3. Release Workflow (`.github/workflows/release.yml`)
Features:
`pyproject.toml` version
4. Dependency Check Workflow (`.github/workflows/dependencies.yml`)
5. Documentation (`.github/workflows/README.md`)
Key Features
Security
Developer Experience
Automation
Prerequisites
Before merging, configure PyPI trusted publishing:
`release.yml`
`pypi`
Testing
All workflows have been validated:
The CI will run automatically on this PR to validate everything works!
Author: AI 🤖