Release-prep v0.2.1: denial-reason hyphen fix + CI hardening #1

Merged
Jaypatel1511 merged 3 commits into main from fix/denial-reason-hyphen-and-ci
May 29, 2026
@Jaypatel1511

Owner

Summary

Release-prep for v0.2.1 on top of the verified denial-reason hyphen fix and CI hardening. Opening this PR exercises test.yml on a real Actions runner across Python 3.9–3.12; until now the workflow had only been validated locally.

The fix being released

denial_reasons_by_race() returned an empty DataFrame on every live CFPB dataset. The Data Browser CSV names enumerated fields with hyphens (denial_reason-1), but _clean() only lowercased and stripped column names, so the underscore name the analysis expected never matched and the function failed silently. _clean() now also replaces hyphens with underscores, so live and synthetic data take the same path. A regression test (test_denial_reasons_by_race_handles_cfpb_hyphenated_columns) reproduces the v0.2.0 bug.
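
The normalization described above can be sketched without pandas. This is a minimal illustration, not the package's actual _clean(): the helper names `clean_column_name` and `clean_column_names` are hypothetical, and the real function operates on a DataFrame rather than a list of strings.

```python
def clean_column_name(name: str) -> str:
    """Normalize one raw column name (hypothetical helper).

    Strip whitespace, lowercase, then replace hyphens with underscores,
    mirroring the order described for the fixed _clean().
    """
    return name.strip().lower().replace("-", "_")


def clean_column_names(names):
    """Apply the same normalization to every column name."""
    return [clean_column_name(n) for n in names]


# Raw header in the hyphenated style the Data Browser CSV uses.
raw_columns = [" Denial_Reason-1 ", "applicant_race-1", "action_taken"]
cleaned = clean_column_names(raw_columns)
print(cleaned)  # ['denial_reason_1', 'applicant_race_1', 'action_taken']
```

Without the final `replace`, the first two names would keep their hyphens and a lookup for `denial_reason_1` would find nothing, which is the silent-empty behavior reported for v0.2.0.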

Audit outcome

Hostile audit passed: no code defects, denial-reason fix live-verified, pipeline hardened. This PR applies four release-prep edits surfaced by that audit.

The four edits

  1. Version bump: pyproject.toml 0.2.0 → 0.2.1 (single source; __version__ still derives via importlib.metadata). CHANGELOG [Unreleased] promoted to [0.2.1] - 2026-05-29.
  2. Least-privilege CI tokens: top-level permissions: contents: read added to both workflows. The publish job keeps its own id-token: write block untouched (it does not inherit the top-level default).
  3. Test hardening (audit L4): removed the synthetic-only "Unknown absent" assertion (on real CFPB data "Unknown" legitimately appears) and replaced it with a positive check that result labels intersect DENIAL_REASONS.values(). The not result.empty bug-catcher is retained.
  4. README test count (audit L1): 28 → 35, verified by pytest.
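
The positive label check from the test-hardening edit can be sketched as follows. This is an illustration under assumptions: the `DENIAL_REASONS` dictionary here is a small made-up subset (the package's real mapping may differ), and `labels_are_mapped` is a hypothetical helper, not a function from the test suite.

```python
# Illustrative subset of a code -> label mapping (assumption: the
# package's real DENIAL_REASONS may contain different entries).
DENIAL_REASONS = {
    1: "Debt-to-income ratio",
    3: "Credit history",
    4: "Collateral",
}


def labels_are_mapped(result_labels, known_labels):
    """Return True when at least one result label is a known label."""
    return bool(set(result_labels) & set(known_labels))


known = DENIAL_REASONS.values()

# Synthetic data: every label is mapped.
synthetic_labels = ["Credit history", "Collateral"]
# Live-like data: "Unknown" can legitimately appear next to mapped labels.
live_like_labels = ["Credit history", "Unknown"]
# Mapping failure: nothing was translated to a known label.
only_unknown = ["Unknown"]

assert labels_are_mapped(synthetic_labels, known)
assert labels_are_mapped(live_like_labels, known)
assert not labels_are_mapped(only_unknown, known)
```

An "Unknown absent" assertion would reject the second case even though it is valid on real data; the intersection check accepts it and still fails when no label was mapped at all.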

Local verification

  • Full suite: 35 passed, 0 skipped (fresh venv, --import-mode=importlib).
  • Wheel built + installed in fresh venv: importlib.metadata.version, __version__, and wheel METADATA all read 0.2.1.
  • Both workflows: top-level contents: read present; publish still id-token: write only; all action uses: remain full 40-char SHAs; both YAMLs parse.
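
For readers unfamiliar with the single-source version pattern checked above, a minimal sketch follows. The fallback for an uninstalled distribution and the helper name `read_version` are assumptions added for illustration; the PR only states that __version__ derives from importlib.metadata.version("hmda-analyzer").

```python
from importlib.metadata import PackageNotFoundError, version


def read_version(dist_name: str, default: str = "0.0.0+unknown") -> str:
    """Look up an installed distribution's version from its metadata.

    The fallback is an assumption for this sketch: it keeps the import
    from failing when the code runs from a source tree that was never
    installed.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return default


# In a package's __init__.py this would typically read:
#     __version__ = read_version("hmda-analyzer")
print(read_version("hmda-analyzer"))
```

Because the value comes from installed metadata, it only matches pyproject.toml after the package is built and installed, which is why the verification above checks a wheel in a fresh venv.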

Scope / non-goals

A PR to main triggers test.yml only. release.yml is tag-triggered (v[0-9]*) and cannot publish from this PR. No merge, tag, or publish here — those are the owner's gated steps.

- Delete setup.py; add [tool.setuptools.packages.find] to pyproject.toml
  so both hmda_analyzer (shim) and hmdaanalyzer (+ all subpackages) ship
  in the wheel without a separate setup.py.
- hmdaanalyzer/__init__.py: replace hardcoded __version__ with
  importlib.metadata.version("hmda-analyzer") so pyproject.toml is the
  single source of truth. hmda_analyzer shim re-exports unchanged.
- Add [tool.pytest.ini_options] with --import-mode=importlib.
- Add [project.optional-dependencies] dev extras (pytest, build, twine).
- Fix pyproject.toml license field to SPDX string (requires setuptools>=77).
- Add .github/workflows/test.yml: push/PR matrix across Python 3.9-3.12.
- Add .github/workflows/release.yml: tag-triggered pipeline with
  verify-version (tomllib guard), build, test-wheel (site-packages
  assertion + importlib mode), publish via OIDC trusted publishing.
  All five actions SHA-pinned.
- Add CONTRIBUTING.md: release runbook and anti-patterns ported from
  fair-lending-screener, adapted for single-source version.
The CFPB Data Browser CSV names enumerated fields with hyphens
(denial_reason-1, applicant_race-1, etc.), but _clean() only lowercased
and stripped column names. The underscore form that denial_reasons_by_race
expected never matched, so every live-data call silently returned an
empty DataFrame. The existing synthetic test was falsely green because
load_sample() emitted the underscore form directly.

- _clean(): replace '-' with '_' after lowercase+strip, so live and
  synthetic data take the same path.
- load_sample(): emit raw 'denial_reason-1' (hyphenated) so the synthetic
  fixture exercises the real normalization path. Observable output
  unchanged (column is still denial_reason_1 after _clean).
- tests/test_disparity.py:
  - New regression test test_denial_reasons_by_race_handles_cfpb_hyphenated_columns
    builds a raw frame with the hyphenated CFPB column, runs through _clean,
    and asserts non-empty result with mapped labels. Confirmed FAILED pre-fix
    with 'denial_reasons_by_race returned empty for CFPB-style hyphenated input'.
  - Strengthened test_denial_reasons_by_race: was isinstance-only (falsely
    green); now asserts non-empty, documented columns, no 'Unknown' labels.

- .github/workflows/release.yml: fix root-layout sys.path trap. The
  'Confirm installed version' and 'Assert site-packages' steps use
  `python -c`, which puts cwd into sys.path[0]. When run from the
  checked-out repo root, that shadowed the wheel-installed packages with
  the source tree, defeating the test. Both steps now set `working-directory: /tmp`,
  and the site-packages assertion now covers both hmdaanalyzer and
  hmda_analyzer. The pytest step still runs from the repo root because the
  console-script entry point sets sys.path[0] to /tmp/wheel_test_venv/bin
  (verified empirically), and --import-mode=importlib prevents pytest from
  prepending rootdir.
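
The kind of check the 'Assert site-packages' step performs can be sketched as below. This is a hypothetical illustration, not the workflow's actual script: the helper names and the example checkout path are assumptions, and only /tmp/wheel_test_venv comes from the commit message.

```python
from pathlib import PurePosixPath


def is_in_site_packages(module_file: str) -> bool:
    """True when the module file lives under a site-packages directory."""
    return "site-packages" in PurePosixPath(module_file).parts


def is_shadowed_by_checkout(module_file: str, repo_root: str) -> bool:
    """True when the module was imported from the source checkout."""
    return PurePosixPath(repo_root) in PurePosixPath(module_file).parents


# Hypothetical paths for illustration.
repo_root = "/home/runner/work/repo"
from_checkout = "/home/runner/work/repo/hmdaanalyzer/__init__.py"
from_wheel = (
    "/tmp/wheel_test_venv/lib/python3.12/site-packages/"
    "hmdaanalyzer/__init__.py"
)

# Running `python -c` from the repo root imports the checkout copy.
assert is_shadowed_by_checkout(from_checkout, repo_root)
assert not is_in_site_packages(from_checkout)

# Running from /tmp resolves to the wheel-installed copy instead.
assert is_in_site_packages(from_wheel)
assert not is_shadowed_by_checkout(from_wheel, repo_root)
```

In a real step the first argument would be the imported module's `__file__`, checked once for each of the two packages named above.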

- CHANGELOG.md: Unreleased section documenting hyphen fix, fixture
  fidelity change, new release/test CI, and version-source consolidation.
…rdening

- pyproject.toml: version 0.2.0 -> 0.2.1 (single source; __version__ still
  derives via importlib.metadata, unchanged).
- CHANGELOG.md: promote [Unreleased] -> [0.2.1] - 2026-05-29; keep a fresh
  empty [Unreleased] heading above it.
- release.yml / test.yml: add top-level `permissions: contents: read`
  (least-privilege GITHUB_TOKEN). Publish job keeps its own id-token: write
  block untouched; action SHAs unchanged.
- tests/test_disparity.py: replace the synthetic-only "Unknown absent"
  assertion with a positive check that labels intersect DENIAL_REASONS.values();
  keep `not result.empty` as the bug-catcher. Same change in the hyphen
  regression test.
- README.md: test count 28 -> 35 (verified by pytest).
@Jaypatel1511 Jaypatel1511 merged commit f31e6cd into main May 29, 2026
4 checks passed
@Jaypatel1511 Jaypatel1511 deleted the fix/denial-reason-hyphen-and-ci branch May 29, 2026 21:45