diff --git a/docs/design/azure-devops-decisions.md b/docs/design/azure-devops-decisions.md new file mode 100644 index 0000000..fe30706 --- /dev/null +++ b/docs/design/azure-devops-decisions.md @@ -0,0 +1,212 @@ +# Azure DevOps Integration — Phase 0 Decision Record + +**Date:** 2026-05-05 +**Status:** Accepted +**Scope:** Decisions blocking Phase 1 (refactor + config schema + stub adapter) of the Azure DevOps PM integration. Companion to [azure-devops-integration-plan.md](./azure-devops-integration-plan.md). + +This document codifies four decisions reached during planning. It exists so Phase 1–9 contributors do not re-litigate the same trade-offs and so the plan's `§2 Open decisions` table has a durable rationale trail. + +Decisions deferred to later phases (HTTP client, sprint enumeration, CMMI mapping coverage, cache layout, native commit-link enrichment, identity sync depth) are not in scope here — they will be recorded in their own ADRs as those phases begin. + +--- + +## Decision 1 — Authentication method: PAT only for v1 + +### Context + +Azure DevOps Services supports three auth modes for API clients: + +1. Personal Access Tokens (HTTP Basic with empty username, base64-encoded `:{PAT}`). +2. Microsoft Entra ID (formerly AAD) OAuth 2.0 — device flow or auth-code flow. +3. Azure AD service principal (client_id + client_secret + tenant_id). + +The `AzureDevOpsConfig` dataclass shape and the loader's env-var resolution behavior depend on which modes are supported. + +### Decision + +**v1 ships PAT-only.** Single config field `personal_access_token: str` resolved from `${AZURE_DEVOPS_PAT}` environment variable. + +OAuth and Azure AD service principal are **deferred** to v1.1. When added, they will be selected by an additive `auth_method: Literal["pat", "oauth", "service_principal"] = "pat"` discriminator on the existing dataclass — no breaking change to v1 configs. + +### Rationale + +- gitflow-analytics is a single-user CLI tool today.
OAuth's primary benefit (avoid handing a long-lived shared credential to every team member) does not apply. +- Every existing PM integration in this codebase uses an API token (JIRA `api_token`, Confluence `api_token`). PAT is consistent. +- OAuth (device flow) adds ~200–400 LOC: token cache file, refresh-token handling, and either a localhost callback listener or device-code polling. Out of proportion for v1. +- The default `"pat"` discriminator value makes v1.1's OAuth opt-in fully backward-compatible. + +### Consequences + +- Users in security-conscious orgs that prohibit long-lived PATs cannot use v1. They will need to wait for v1.1's OAuth path or rotate PATs manually. +- `AZURE_DEVOPS_PAT` becomes a documented required environment variable. The setup wizard prompts for it; the loader raises `EnvironmentVariableError("AZURE_DEVOPS_PAT", "AzureDevOps", config_path)` if missing — pattern parallels JIRA's `JIRA_API_TOKEN` handling at `loader_sections.py:113`. +- Required PAT scopes (documented in `docs/configuration/azure-devops.md`): + - `vso.work` — read work items, queries, area/iteration paths (required) + - `vso.identity` — read user identities (recommended; needed for `get_users`) + - `vso.graph` — Graph API user enumeration (optional; degrades gracefully if absent) + +### Schema impact (Phase 1) + +```python +@dataclass +class AzureDevOpsConfig: + enabled: bool = True + organization_url: str = "" + personal_access_token: str = "" # resolved from ${AZURE_DEVOPS_PAT} + # ... (other fields per plan §3.3) +``` + +No `auth_method` field in v1 — its absence is the v1 implicit default. v1.1 adds it without migrating existing configs. + +--- + +## Decision 2 — On-premises Azure DevOps Server: rejected at config load + +### Context + +Azure DevOps Services (cloud) and Azure DevOps Server (on-premises, formerly TFS) share most of their REST API surface but diverge on: + +- URL pattern: `https://dev.azure.com/{org}` or `https://{org}.visualstudio.com` (cloud) vs. 
`https://{server}/{collection}/...` (on-prem). +- API version compatibility: Server lags Services by 6–12 months; `api-version=7.1` requires ADO Server 2022+. +- Authentication: Server commonly uses NTLM/Kerberos in addition to PAT, especially behind reverse proxies. +- TLS: self-signed certs are common in enterprise on-prem deployments. + +Silent acceptance of on-prem URLs would let users configure ADO Server, hit auth or API-version errors mid-fetch, and file confusing bug reports. + +### Decision + +**v1 rejects on-prem URLs at config load time** with an actionable error message. + +The loader validates `organization_url` against a cloud-host allowlist (`dev.azure.com`, `*.visualstudio.com`). URLs containing `/tfs/`, `/DefaultCollection/`, or any other host are rejected with: + +``` +Azure DevOps Server (on-premises) is not supported in v1. +Only Azure DevOps Services (cloud) is supported. Configure +either https://dev.azure.com/{org} or https://{org}.visualstudio.com. +On-premises support is roadmapped for v1.2; see +docs/design/azure-devops-integration-plan.md §8. +``` + +`AzureDevOpsConfig.is_on_premise: bool = False` is **reserved as a forward-compatibility marker**. Setting it to `True` is rejected by the loader with the same message in v1; v1.2 will flip the rejection to a feature-gated implementation. + +### Rationale + +- Full on-prem support adds an estimated 3–5 engineering days to v1 plus a real ADO Server test environment. Out of scope for the v1 timeline. +- Optimistic acceptance ("let it fail at runtime") wastes user troubleshooting hours on cryptic NTLM/SSL/api-version errors that the loader can preempt in milliseconds. +- Reserving `is_on_premise` in the schema now means v1.2 can ship without a config migration. + +### Consequences + +- Enterprise customers running ADO Server are explicitly locked out of v1 with a clear message pointing at the v1.2 roadmap. +- The loader gains a small URL-pattern validator (~10 LOC). 
+- The setup wizard's PAT-validation step in Phase 7 includes a pre-check for on-prem URLs to catch user error before the PAT prompt. + +--- + +## Decision 3 — Default ticket regex: strict `AB#` with per-config override + +### Context + +Azure Boards recognizes two commit-message reference forms: + +- `AB#1234` — the explicit "Azure Boards" prefix Microsoft injects via the official GitHub-↔-Azure-Boards integration. Cross-platform-safe. +- `#1234` — bare numeric reference. Auto-detected by Azure Repos but **collides catastrophically** with GitHub PR/issue numbers in mixed-platform repos. + +The default regex is wired into `Config.get_effective_ticket_platforms()` (`schema.py:813`), which runs even when no PM platform is explicitly configured. A loose default has cross-codebase blast radius — every gitflow-analytics user would see different ticket-extraction behavior on upgrade. + +### Decision + +- **Default regex:** `r"AB#(\d+)"` registered at `TicketDetectionConfig.patterns["azure_devops"]` (`schema.py:386`). +- **Per-config override:** `AzureDevOpsConfig.ticket_regex_override: Optional[str] = None`. When set, replaces the default for that deployment only. +- **Documented opt-in pattern** for Azure-Repos-only shops who want bare `#1234` support: `ticket_regex_override: "(?:AB#|#)(\\d+)"`. Documented with an explicit GitHub-collision warning in `docs/configuration/azure-devops.md`. +- **Case sensitivity:** `AB#` is case-sensitive by Microsoft convention. Phase 6 adds `azure_devops` to the case-sensitive platform list at `extractors/tickets.py:212` (currently `platform != "jira"`). + +### Rationale + +- `AB#` is the only cross-platform-unambiguous form. It is what Microsoft's own tooling emits. +- A loose default would silently reclassify every `#NNN` PR reference in every existing user's repos as an ADO work item — unacceptable side effect of an additive feature. 
+- The override exists for the legitimate Azure-Repos-only use case where users *do* want bare-`#` matching, accepting that it conflicts with GitHub. + +### Consequences + +- Users in Azure-Repos-only shops who type bare `#1234` see zero correlations until they set `ticket_regex_override`. The setup wizard prompts about this explicitly. +- Migration concern: existing JIRA-only users see no behavior change (ADO regex registers but is only consulted when ADO is configured or in the no-config fallback list, where it harmlessly fails to match anything in practice). +- Phase 6 work item: validate that adding `azure_devops` to the case-sensitive list does not regress any existing JIRA/Linear/ClickUp tests. + +--- + +## Decision 4 — `UnifiedIssue.story_points` type: change to `Optional[float]` + +**Status:** **Accepted — implemented and merged.** Bob landed PR #56 (commit `bea075f`, released as v3.15.0 on 2026-05-05) which widened the model to `Optional[float]`, fixed the JIRA adapter's `int(float(value))` coercion, and updated the SQLAlchemy / SQLite schemas. The implementation matches this ADR's design and extends it to additional sites we'd missed in our audit: +- `UnifiedSprint.planned_story_points` and `completed_story_points` also widened. +- `_DayStats.story_points` dataclass field widened. +- `pipeline_report.py` had its own `int()` truncation that was also corrected. +- A v12.0 `schema_version` migration helper was added (SQLite dynamic type affinity means no DDL rebuild needed). +- New test `tests/integrations/test_jira_adapter_story_points.py` covers 5 cases (3.5, 1.5 string, 0.5, integer-as-float, unparseable). + +The ADO adapter (Phase 3) inherits the corrected behavior automatically; no follow-up work is needed in this codebase to consume the change. + +### Context + +`UnifiedIssue.story_points` is currently `Optional[int]` (`pm_framework/models.py:150`). 
The JIRA adapter's converter coerces values via `int(float(value))` at `jira_adapter_converters.py:261`, silently dropping fractional precision. Story-point estimates of `3.5` round to `3`. + +Azure DevOps natively stores story points as `double`: +- `Microsoft.VSTS.Scheduling.StoryPoints` — Agile process +- `Microsoft.VSTS.Scheduling.Effort` — Scrum process +- `Microsoft.VSTS.Scheduling.Size` — CMMI process + +Modified Fibonacci scales (`0.5, 1, 2, 3, 5, 8, 13`) are common in real teams. Shipping ADO with `int` coercion would silently mangle these. + +### Decision + +**Change `UnifiedIssue.story_points` from `Optional[int]` to `Optional[float]` in Phase 1.** + +Same change applies to `BasePlatformAdapter._extract_story_points` return type (`base.py:333`) and any cache schema columns persisting story points (`INTEGER` → `REAL` in SQLite). + +The JIRA adapter's `int(float(value))` coercion is **fixed in the same PR** as part of the model change — `int(...)` becomes `float(...)`. This eliminates the existing precision bug. + +### Rationale + +- Doing the change now, while only one adapter exists, is the cleanest moment. Deferring to Phase 3 means revisiting the JIRA adapter's coercion code after committing to a downstream behavior. +- Python's duck typing means most aggregations (`sum()`, division for ratios) silently accept the wider type without consumer changes. +- A separate `story_points_decimal` field (the rejected option C from the planning conversation) creates two sources of truth and is strongly discouraged. + +### Consequences + +- Phase 1 includes a `grep -rn "story_points" src/gitflow_analytics` audit. Any consumer with explicit `int` typing is updated to `float`. SQLite `INTEGER` columns become `REAL`. +- A new test asserts `UnifiedIssue(story_points=3.5).story_points == 3.5` round-trips through the JIRA adapter's converter without precision loss. This test would fail today. 
+- Reports formatting story points as `f"{n}"` continue to work (Python prints `3.5` correctly). Reports formatting as `f"{n:d}"` (forcing integer) would break — none found in initial grep but the audit must confirm. +- This is a behavior change for existing JIRA users with fractional story points: they will now see `3.5` in reports where they previously saw `3`. Documented in `CHANGELOG.md` under "Changed" for v3.15.0. + +--- + +## Cross-cutting Phase 1 implications + +These decisions together imply the following Phase 1 deliverables (recap from plan §4): + +1. `AzureDevOpsConfig` dataclass with PAT-only auth field, reserved `is_on_premise` flag, and `ticket_regex_override` field. +2. URL validator in the loader rejecting on-prem patterns. +3. `r"AB#(\d+)"` default registered at `schema.py:386`. +4. ~~`UnifiedIssue.story_points` type change~~ — **Landed independently** in Bob's PR #56 / v3.15.0 (see Decision 4 status). Phase 1 inherits it from main; no Phase 1 model edits required. +5. The legacy-stack hardcoded-`"jira"` cleanup (independent of the four decisions but Phase 1's primary safety-net work — see plan §4 Phase 1 file list). +6. Stub `AzureDevOpsAdapter` registered with the orchestrator; every data method raises `NotImplementedError("Azure DevOps adapter: — implemented in Phase X")`. +7. New tests: `tests/config/test_azure_devops_config.py`, JIRA-adapter no-regression sweep. (The story-point precision regression test is not Phase 1 work: it already landed with Decision 4 in PR #56.)
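Deliverables 2 and 3 in the list above are small enough to sketch together. The block below is a minimal illustration, not the loader's actual code: `validate_organization_url` and `extract_work_item_ids` are hypothetical names, the error text is abbreviated from Decision 2, and matching `/tfs/` and `/DefaultCollection/` case-insensitively is an assumption the ADR does not state.

```python
import re
from urllib.parse import urlparse

# Abbreviated from the Decision 2 error message; the full wording lives in the ADR.
ON_PREM_MESSAGE = (
    "Azure DevOps Server (on-premises) is not supported in v1. "
    "Configure either https://dev.azure.com/{org} or https://{org}.visualstudio.com."
)

DEFAULT_ADO_TICKET_REGEX = r"AB#(\d+)"    # Decision 3 default
BARE_HASH_OVERRIDE = r"(?:AB#|#)(\d+)"    # documented opt-in override


def validate_organization_url(url: str) -> str:
    """Hypothetical helper: return the normalized URL or raise ValueError."""
    url = url.strip()
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Append "/" so a trailing "/tfs" segment is caught as well.
    # Lowercasing the path is an assumption, not something the ADR specifies.
    path = parsed.path.lower() + "/"
    is_cloud_host = host == "dev.azure.com" or host.endswith(".visualstudio.com")
    has_on_prem_marker = "/tfs/" in path or "/defaultcollection/" in path
    if not is_cloud_host or has_on_prem_marker:
        raise ValueError(ON_PREM_MESSAGE)
    return url.rstrip("/")


def extract_work_item_ids(
    message: str, pattern: str = DEFAULT_ADO_TICKET_REGEX
) -> list[str]:
    """Hypothetical helper: case-sensitive search, per Decision 3."""
    return re.findall(pattern, message)
```

With the default pattern, a message such as `Merge pull request #42` yields no Azure DevOps IDs, while the opt-in override would claim `42`. That is exactly the GitHub collision Decision 3 warns about.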
+ +## Decisions deferred to later phases + +Recorded here so contributors don't ask twice: + +| Phase | Decision | Recommendation (not yet locked) | +|-------|----------|--------------------------------| +| 2 | HTTP client (requests vs official SDK) | Hand-rolled `requests` + `urllib3.Retry`, mirror JIRA pattern | +| 3 | Custom process mapping coverage | Ship Agile + Scrum + Basic + CMMI tables; inherited custom processes fall through to UNKNOWN with config override | +| 3 | Cache layout (per-platform DB vs unified) | Per-platform `azure_devops_tickets.db` for v1; unified `pm_tickets.db` revisited before adding 3rd platform | +| 4 | Sprint enumeration scope | Project-level classification node tree (`/wit/classificationnodes/Iterations`); team-scoped capacity out of scope for v1 | +| 5 | Native commit-link correlation | Use `POST /_apis/wit/artifactUriQuery` as secondary correlation source on top of `AB#` message scanning | +| 7 | Identity sync depth | Defer full sync to v1.1; v1 ships `ado-identity-doctor` diagnostic only | + +--- + +## Change log + +- 2026-05-05 — Initial draft. All four decisions accepted in planning conversation. +- 2026-05-06 — Decision 4 status flipped from "Deferred" to "Accepted — implemented and merged" after Bob shipped PR #56 / v3.15.0 with the int→float widening. Phase 1 implications updated to reflect the model change is no longer in our scope. 
diff --git a/docs/design/azure-devops-integration-plan.md b/docs/design/azure-devops-integration-plan.md new file mode 100644 index 0000000..68b6f73 --- /dev/null +++ b/docs/design/azure-devops-integration-plan.md @@ -0,0 +1,518 @@ +# Azure DevOps PM Integration — Implementation Plan + +**Status:** Draft (planning hive output, awaiting user decisions on §2 open questions) +**Target version:** `3.16.0` (originally reserved `3.15.0`, but that was consumed by Bob's PR #56 / story-points float widening on 2026-05-05) +**Estimated effort:** 8–11 engineering days (1 engineer); 6–7 days with parallel doc/test work +**Prior art:** `src/gitflow_analytics/pm_framework/adapters/jira_adapter.py` (713 LOC) + converters mixin (714 LOC) + cache (495 LOC) ≈ **1,920 LOC of reference implementation** + +This plan synthesizes outputs from four parallel planning agents (API research, architecture, phasing, risk review). Where they disagreed, the disagreement is called out and a recommendation given. + +--- + +## 1. Executive summary + +Add `AzureDevOpsAdapter` parallel to the existing `JIRAAdapter` under `pm_framework/adapters/`, wire it into the orchestrator, config schema, ticket extractor, setup wizard, identity sync, and reports. Ship behind an opt-in config block; zero behavior change for existing JIRA users. + +**Strong consensus across agents:** +- PAT-only auth for v1 (HTTP Basic, empty username, base64-encoded `:{PAT}`). +- Hand-rolled `requests` + `urllib3.Retry`, mirroring JIRA. Do **not** vendor the official `azure-devops` SDK. +- Three-file split: `azure_devops_adapter.py` + `azure_devops_cache.py` + `azure_devops_converters.py` (mixin). +- Two-stage fetch: WIQL (IDs only) → `POST /_apis/wit/workitemsbatch` (200 IDs/chunk). +- Default ticket regex: `r"AB#(\d+)"`. **Do not** use bare `r"#(\d+)"` — collides with GitHub PR/issue numbers. +- Map work-item status by **State Category** (`Proposed/InProgress/Resolved/Completed/Removed`), not state name.
Custom processes (CMMI, inherited) rename states. +- Defer to v1.1 / v2: OAuth, Azure DevOps Server (on-prem), full identity sync, write operations, ADO test plans. +- Cloud only for v1 (`https://dev.azure.com/{org}` and the legacy `{org}.visualstudio.com`). + +**The single highest-leverage finding** (review agent): adding ADO only to `pm_framework/adapters/` is **not enough**. There are ~40 hardcoded `"jira"` strings in legacy enrichment paths (`core/data_fetcher*`, `pipeline_collect.py`, `cli_fetch.py`, `core/data_fetcher_processing.py:328` literally has `platform="jira" # Assuming JIRA for now`). If we ship ADO without addressing this, ADO data will be silently dropped on the legacy path. **Phase 1 must include a refactor of the platform-tag plumbing** before any ADO-specific code lands. + +--- + +## 2. Open decisions (blocking — please confirm before Phase 1 starts) + +These are the questions where agents diverged or where user input is required. + +| # | Decision | Recommendation | Why it matters | +|---|----------|---------------|----------------| +| 1 | **Auth method** | PAT only for v1 | OAuth doubles complexity (callback server, token refresh). | +| 2 | **On-prem (ADO Server / TFS)** | Out of scope for v1; reject with clear error | NTLM/Kerberos + version skew + SSL self-signed = weeks of work. | +| 3 | **HTTP client** | Hand-rolled `requests` (mirror JIRA) | Official SDK lags releases and pulls ~25 transitive deps. | +| 4 | **Ticket regex default** | `r"AB#(\d+)"` only; document `(?:AB#|#)(\d+)` as opt-in override | Bare `#1234` collides with GitHub PR/issue refs in mixed-platform repos. | +| 5 | **Sprint enumeration** | Project-level **iteration classification node tree** (`/wit/classificationnodes/Iterations?$depth=10`), de-dup by full path | Iterations are team-scoped. Per-team enumeration multiplies API calls and creates duplicates; classification node tree is project-scoped and stable. 
(Architecture agent and reviewer disagreed; reviewer's project-tree approach wins on simplicity and parity.) | +| 6 | **Custom process support** | Ship Agile + Scrum + Basic + CMMI mapping tables out-of-the-box | Mapping tables are small and free at runtime. Inherited custom processes fall through to UNKNOWN with a config override hook. | +| 7 | **`UnifiedIssue.story_points` type** | **Landed in v3.15.0** (Bob's PR #56, commit `bea075f`). Model widened to `Optional[float]`, JIRA coercion fixed, SQL columns flipped to `REAL`. Phase 3 ADO converter inherits the corrected behavior. | ADO `Effort` (Scrum) and `StoryPoints` (Agile) are natively float; the precision bug is gone. | +| 8 | **Cache layout** | Keep per-platform SQLite for v1 (`azure_devops_tickets.db`), schedule unified `pm_tickets.db` for v2 | Reviewer flagged the per-platform DB sprawl as a design smell. Refactoring now would balloon scope; revisit before adding the 3rd platform. | +| 9 | **Native commit ↔ work-item links** | Use `POST /_apis/wit/artifactUriQuery` as a **secondary** correlation source on top of `AB#` message scanning. Both `correlation_method="ticket_reference"` and `correlation_method="native_link"` recorded; native wins on tie | Catches commits made via VS/VSCode "Link work item" UI without `AB#` in the message. | +| 10 | **Identity sync** | Defer full sync to v1.1; v1 ships an `ado-identity-doctor` diagnostic that lists unmatched assignees | ADO `uniqueName` (UPN, `alice@tenant.onmicrosoft.com`) frequently differs from git commit email. Auto-merge is its own project. | +| 11 | **Method naming** | Extend existing `Config.get_effective_ticket_platforms()` (`schema.py:778`); do not invent `get_pm_platforms` | Original brief used the wrong method name. | +| 12 | **Feature flag** | Ship behind `pm.azure_devops.enabled` (already required); do **not** add a second `azure_devops_v1_beta` flag | One opt-in is enough; double-gating creates support confusion. | + +--- + +## 3. 
Architecture (condensed) + +### 3.1 New files + +| Path | Purpose | +|------|---------| +| `src/gitflow_analytics/pm_framework/adapters/azure_devops_adapter.py` | `AzureDevOpsAdapter` class, HTTP session, abstract method implementations (~600–750 LOC). | +| `src/gitflow_analytics/pm_framework/adapters/azure_devops_cache.py` | `AzureDevOpsTicketCache` (SQLite, mirror `JiraTicketCache`). | +| `src/gitflow_analytics/pm_framework/adapters/azure_devops_converters.py` | `AzureDevOpsConvertersMixin` — work-item → `UnifiedIssue`, identity ref normalization, type/state/priority mappers. | +| `src/gitflow_analytics/integrations/azure_devops_identity_sync.py` | **v1.1 stub** — logs "deferred to v1.1" until identity work lands. | +| `tests/pm_framework/adapters/test_azure_devops_adapter.py` | Auth, projects, WIQL, batch fetch, error paths. | +| `tests/pm_framework/adapters/test_azure_devops_converters.py` | Pure-function mapper tests, no network. | +| `tests/pm_framework/adapters/test_azure_devops_cache.py` | Cache TTL, eviction, schema. | +| `tests/pm_framework/adapters/test_azure_devops_get_issues.py` | End-to-end WIQL → batch flow with `responses`-mocked fixtures, including pagination. | +| `tests/pm_framework/adapters/test_azure_devops_links.py` | Native commit-link correlation. | +| `tests/config/test_azure_devops_config.py` | Schema loading, env-var resolution, regex defaults. | +| `tests/extractors/test_tickets_azure_devops.py` | `AB#` extraction, mixed-platform commit messages. | +| `tests/pm_framework/adapters/fixtures/azure_devops/` | JSON fixtures: `connection_data.json`, `projects_list.json`, `wiql_query_response.json`, `workitems_batch.json`, `workitems_batch_capped.json`, `workitems_batch_paginated_2.json`, `iterations_list.json`, `graph_users.json`, `comments.json`, `fields.json`, `relations.json`. | +| `docs/configuration/azure-devops.md` | User-facing config reference. | +| `docs/examples/azure-devops-configuration.md` | Sample YAML block.
| + +### 3.2 Class surface + +``` +class AzureDevOpsAdapter(AzureDevOpsConvertersMixin, BasePlatformAdapter): + def _get_platform_name(self) -> str # "azure_devops" + def _get_capabilities(self) -> PlatformCapabilities # see §3.4 + def authenticate(self) -> bool # GET _apis/connectionData; checks PAT scopes + def test_connection(self) -> dict # diagnostic with org, projects, types, scopes + def get_projects(self) -> list[UnifiedProject] + def get_issues(self, project_id, since=None, issue_types=None) -> list[UnifiedIssue] + def get_sprints(self, project_id) -> list[UnifiedSprint] # via classification node tree + def get_users(self, project_id) -> list[UnifiedUser] # Graph API; degrades to assignee-extract on 403 + def get_issue_comments(self, issue_key) -> list[dict] + def get_custom_fields(self, project_id) -> dict + def get_commit_links(self, commit_shas: list[str]) -> list[tuple[str, str]] # ADO-specific via artifactUriQuery +``` + +### 3.3 Config schema (new) + +`AzureDevOpsConfig` dataclass added to `src/gitflow_analytics/config/schema.py` near `JIRAConfig` (~line 470): + +``` +# Mutable defaults use field(default_factory=...); a bare [] or {} default raises ValueError in a dataclass. +from dataclasses import dataclass, field +from typing import Optional + +@dataclass +class AzureDevOpsConfig: + enabled: bool = True + organization_url: str = "" # https://dev.azure.com/{org} + personal_access_token: str = "" # ${AZURE_DEVOPS_PAT} + project: Optional[str] = None # default project; multi-project via list later + work_item_types: Optional[list[str]] = None # allowlist; None=canonical 6 + area_paths: list[str] = field(default_factory=list) # AreaPath UNDER filters (OR-joined) + story_point_fields: list[str] = field(default_factory=lambda: [ + "Microsoft.VSTS.Scheduling.StoryPoints", # Agile + "Microsoft.VSTS.Scheduling.Effort", # Scrum + "Microsoft.VSTS.Scheduling.Size", # CMMI + ]) + custom_fields: dict[str, str] = field(default_factory=dict) # friendly_name -> reference_name + api_version: str = "7.1" + batch_size: int = 200 # ADO hard cap + rate_limit_delay: float = 0.2 + verify_ssl: bool = True + cache_ttl_hours: int = 168 # 7 days + dns_timeout: int = 10 + connection_timeout: int = 30 + max_retries: int = 3 + backoff_factor: float = 1.0 + state_category_overrides: dict[str, str] = field(default_factory=dict) # "Approved" -> "Proposed" + work_item_type_overrides: dict[str, str] = field(default_factory=dict) # "Custom Type" -> "story" + ticket_regex_override: Optional[str] = None + is_on_premise: bool = False # reserved; v1 rejects with error +``` + +YAML shape (canonical): +```yaml +pm: + azure_devops: + enabled: true + organization_url: "https://dev.azure.com/myorg" + project: "MyProject" + personal_access_token: "${AZURE_DEVOPS_PAT}" + work_item_types: ["User Story","Bug","Task","Feature","Epic"] +``` + +The loader auto-synthesizes the matching `pm_integration.platforms.azure_devops` block (mirror `_process_jira_pm_config` at `loader_sections.py:531`). **No top-level `azure_devops:` block** — that fragments the config space the way `jira:` and `jira_integration:` already do. + +### 3.4 Capability flags + +``` +supports_projects = True +supports_issues = True +supports_sprints = True +supports_time_tracking = True # OriginalEstimate / RemainingWork / CompletedWork +supports_story_points = True +supports_custom_fields = True +supports_issue_linking = True +supports_comments = True +supports_attachments = False # not used in v1 +supports_workflows = True # state categories +supports_bulk_operations = True # workitemsbatch +supports_cursor_pagination = True # x-ms-continuationtoken +rate_limit_requests_per_hour = 1500 # approximation; ADO uses TSTU +rate_limit_burst_size = 200 +max_results_per_page = 200 # workitemsbatch hard limit +``` + +### 3.5 Mapping tables + +**Work-item type → IssueType** (covers Agile, Scrum, Basic, CMMI): + +| ADO type | IssueType | +|----------|-----------| +| Epic | EPIC | +| Feature | FEATURE | +| User Story / Product Backlog Item / Requirement / Issue (Basic) | STORY | +| Task | TASK | +| Bug | BUG | +| Issue (Agile/CMMI) | INCIDENT | +| Change Request | IMPROVEMENT | +| Risk / Review / Test Case / Test Plan | TASK with descriptive label | +| Impediment (Scrum) | TASK with `blocked` label | +| anything else
| UNKNOWN (with `work_item_type_overrides` config hook) | + +**State category → IssueStatus**: + +| Category | IssueStatus | +|----------|-------------| +| Proposed | TODO | +| InProgress | IN_PROGRESS | +| Resolved | IN_REVIEW | +| Completed | DONE | +| Removed | CANCELLED | +| (null) | name-fallback heuristic, then UNKNOWN | + +**Priority** (`Microsoft.VSTS.Common.Priority`, integer 1–4): 1→CRITICAL, 2→HIGH, 3→MEDIUM, 4→LOW, 0/null→UNKNOWN. Reuse `BasePlatformAdapter._map_priority` (already handles `"1"`–`"4"` strings). + +--- + +## 4. Phased plan + +### Phase 0 — Foundation & decisions (0.5 day) + +**Goal:** lock open decisions in §2; prepare dependency/secret plumbing. + +**Deliverables:** +- ADR document resolving §2 open questions, committed to `docs/adr/` if that dir exists, otherwise `docs/design/azure-devops-decisions.md`. +- `.env.example` updated with `AZURE_DEVOPS_ORG_URL`, `AZURE_DEVOPS_PAT`, optional `AZURE_DEVOPS_PROJECT`. +- `pyproject.toml` left unchanged (requests-only path). + +**Acceptance:** ADR merged. No code yet. + +**Dependencies:** none. + +--- + +### Phase 1 — Refactor platform-tag plumbing + config schema + stub adapter (1.5 days) + +**Goal:** make Azure DevOps registrable, parseable, and **silently fixable** in legacy paths before any real ADO code lands. This phase is the project's risk insurance. + +**Files touched (refactor slice — fixes hardcoded `"jira"` hotspots):** +- `src/gitflow_analytics/core/data_fetcher_processing.py:170,183,328,343` — replace literal `platform="jira"` with platform tag from the orchestrator's adapter map. +- `src/gitflow_analytics/core/data_fetcher.py:17,126,287`, `data_fetcher_parallel.py:14,278,514` — generalize `jira_integration` parameter to `pm_orchestrator` (accepting `PMFrameworkOrchestrator`); add a deprecated alias keeping the old kwarg working. 
+- `src/gitflow_analytics/core/analyze_pipeline.py:284,293-296,360`, `pipeline_collect.py:66,73,78,143`, `cli_fetch.py:170,213-214,359` — same generalization. +- `src/gitflow_analytics/core/cache.py:172` — per-platform breakdown for diagnostics. +- `src/gitflow_analytics/core/schema_version.py:94` — schema fingerprint registry now keyed by platform list, accepts `azure_devops`. +- `src/gitflow_analytics/integrations/orchestrator.py:241,340` — remove JIRA special-case; route through registered adapters by name. +- `src/gitflow_analytics/reports/ticketing_activity_report.py:261,357`, `metrics/activity_scoring.py:207` — iterate `cfg.get_effective_ticket_platforms()` instead of hardcoded `"jira"`. + +**Files touched (config + stub slice):** +- `src/gitflow_analytics/config/schema.py:386` — add `"azure_devops": r"AB#(\d+)"` to `TicketDetectionConfig.patterns` defaults. +- `src/gitflow_analytics/config/schema.py:~470` — add `AzureDevOpsConfig` dataclass. +- `src/gitflow_analytics/config/schema.py:~717` — add optional `azure_devops: Optional[AzureDevOpsConfig] = None` to `Config`. +- `src/gitflow_analytics/config/schema.py:778-815` — `get_effective_ticket_platforms()` includes `"azure_devops"` when the config block is present. +- `src/gitflow_analytics/config/loader_sections.py` — add `_process_azure_devops_pm_config` parallel to `_process_jira_pm_config:531`. Auto-synthesize `pm_integration.platforms.azure_devops`. Resolve `${AZURE_DEVOPS_PAT}` env var; raise `EnvironmentVariableError` on missing creds. +- `src/gitflow_analytics/pm_framework/adapters/azure_devops_adapter.py` (NEW) — `AzureDevOpsAdapter` stub; `__init__` reads config; every other method raises `NotImplementedError("Azure DevOps adapter: — implemented in Phase X")`. All capability flags `False` for now. +- `src/gitflow_analytics/pm_framework/adapters/__init__.py` — export `AzureDevOpsAdapter`. 
+- `src/gitflow_analytics/pm_framework/orchestrator.py:115` — uncomment `register_adapter("azure_devops", AzureDevOpsAdapter)`. +- `tests/config/test_azure_devops_config.py` (NEW), regression sweep over JIRA tests. + +**Acceptance:** +- All existing tests still pass after the refactor (zero JIRA regressions). +- Loading a config with `pm.azure_devops` parses without error. +- Orchestrator reports `azure_devops` as a registered adapter. +- `core/data_fetcher_processing.py` no longer has `platform="jira" # Assuming JIRA` literal. +- A new test asserts that `get_effective_ticket_platforms()` includes `"azure_devops"` when configured. + +**Estimated effort:** 1.5 days. (The refactor slice is the bulk; budget aggressively because legacy code paths have hidden test coverage.) + +**Dependencies:** Phase 0. + +--- + +### Phase 2 — Auth + read-only project listing (1 day) + +**Goal:** prove HTTP/auth/session work end-to-end with the smallest possible read surface. + +**Files touched:** `azure_devops_adapter.py` — implement `_create_session`, `_ensure_session`, `_build_url` (cloud-only; reject `tfs/` collection URLs at config load time with explicit "ADO Server unsupported in v1"), `authenticate`, `test_connection`, `get_projects`. Auth header: `Authorization: Basic base64(":" + PAT)`. + +**Tests:** `responses`-mocked unit tests covering happy path + 401/403/404/5xx-with-retry. Integration test gated by `GFA_AZURE_DEVOPS_INTEGRATION_TEST=1`. + +**Acceptance:** `gitflow-analytics test-pm-connection azure_devops` (added in Phase 7) succeeds against a real org. PAT scope diagnostic prints required scopes (`vso.work`, optionally `vso.identity`/`vso.graph`). + +**Estimated effort:** 1 day. **Dependencies:** Phase 1. + +--- + +### Phase 3 — Work items via WIQL + batch fetch (2 days) + +**Goal:** core feature parity with JIRA's `get_issues`.
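The second stage of the two-stage fetch named in §1 (WIQL returns IDs, then `workitemsbatch` is called in chunks of at most 200) can be sketched without any network code. This is an illustrative sketch only: `chunk_ids` and `build_batch_payloads` are hypothetical helper names, and the `ids` / `fields` request-body keys are assumed from the public `workitemsbatch` API rather than taken from this plan.

```python
from typing import Iterator, Sequence

ADO_BATCH_LIMIT = 200  # workitemsbatch hard cap cited in the plan


def chunk_ids(ids: Sequence[int], size: int = ADO_BATCH_LIMIT) -> Iterator[list[int]]:
    """Yield consecutive slices of `ids`, each no longer than `size`."""
    if not 1 <= size <= ADO_BATCH_LIMIT:
        raise ValueError(f"size must be between 1 and {ADO_BATCH_LIMIT}")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def build_batch_payloads(ids: Sequence[int], fields: Sequence[str]) -> list[dict]:
    """Build one request body per chunk (hypothetical helper, no HTTP involved)."""
    # De-duplicate while preserving WIQL order so no work item is fetched twice.
    unique_ids = list(dict.fromkeys(ids))
    return [{"ids": chunk, "fields": list(fields)} for chunk in chunk_ids(unique_ids)]
```

For 450 distinct IDs this produces three payloads of 200, 200, and 50 IDs; the adapter would then POST each body in turn.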
+ +**Files touched:** +- `azure_devops_adapter.py` — `get_issues`, `_build_wiql`, `_fetch_work_items_batch`, `_get_default_fields`, `_get_state_category` (with state metadata cache per work item type). +- `azure_devops_converters.py` (NEW) — `_convert_work_item`, `_convert_identity_ref` (handles both string and dict shapes), `_map_work_item_type`, `_map_state_category` (with name-fallback heuristic), `_map_priority_value`, `_extract_story_points_ado`, `_unified_issue_to_dict` / `_dict_to_unified_issue`. +- `azure_devops_cache.py` (NEW) — SQLite-backed `AzureDevOpsTicketCache`; same interface as `JiraTicketCache`. + +**WIQL pagination:** WIQL has a result cap (research agent reported 20,000; review agent reported 1,000 — both can be true depending on which limit you hit first). **Implement defensive windowing regardless**: sort by `[System.ChangedDate]` ascending; if a window returns ≥ `WIQL_RESULT_LIMIT` (configurable, default 1000), binary-split the time window and recurse. Add an explicit assertion that the final union covers `[since, now]` with no gaps. + +**Tests:** mapper tests for every row in §3.5. Pagination test with synthetic 1500-row fixture. Cache round-trip preserves all `UnifiedIssue` fields including `story_points: float`. + +**Acceptance:** ≥ 85% line coverage on `azure_devops_adapter.py` + `azure_devops_converters.py`. Pulling 4 weeks of work items from a fixture produces well-formed `UnifiedIssue` objects. + +**Estimated effort:** 2 days. **Dependencies:** Phase 2. + +--- + +### Phase 4 — Sprints (Iterations) + users (1 day) + +**Goal:** fill in capability flags so velocity reports populate. + +**Files touched:** +- `azure_devops_adapter.py` — `get_sprints` via `GET /_apis/wit/classificationnodes/Iterations?$depth=10` (project-level; deduplicated by classification node path). 
`get_users` via Graph API on `vssps.dev.azure.com`; on 403 (PAT lacks `vso.graph`) degrades gracefully — extract unique assignees from the cached work-item set, log the degradation, drop `supports_users` flag to `False` for that session. + +**Acceptance:** Sprint timeFrame (`past`/`current`/`future`) → (`is_active`, `is_completed`) round-trips. `get_users` returns `[]` cleanly when scope is missing. + +**Estimated effort:** 1 day. **Dependencies:** Phase 3. + +--- + +### Phase 5 — Comments, custom fields, native commit links (1 day) + +**Goal:** match JIRA optional surfaces and add ADO's unique value-add (native links). + +**Files touched:** +- `azure_devops_adapter.py` — `get_issue_comments` (`/comments?api-version=7.1-preview.4`), `get_custom_fields` (filtered to `Custom.*` and `WEF.*` reference names), `get_commit_links(commit_shas)` via `POST /_apis/wit/artifactUriQuery` batched at 50 SHAs/request. +- `pm_framework/orchestrator.py` — in `correlate_issues_with_commits`, add a `hasattr(adapter, 'get_commit_links')`-gated branch that contributes `correlation_method="native_link"` results with confidence 1.0. + +**Acceptance:** Commits without `AB#` references but with ADO native artifact links produce correlations. JIRA correlation tests still pass (the new branch is duck-typed). + +**Estimated effort:** 1 day. **Dependencies:** Phase 3. + +--- + +### Phase 6 — Ticket extraction in commit messages (0.5 day) + +**Goal:** make `AB#1234` resolve through the existing extractor pipeline. + +**Files touched:** +- `src/gitflow_analytics/extractors/tickets.py:176-192` — register `"azure_devops"` regex (default `r"AB#(\d+)"`). +- `tickets.py:212` — explicit case-sensitivity rule for ADO (`AB#` is case-sensitive convention). +- `tickets.py:533` — extend platform-normalization branch (currently `"jira" or "linear"`) to include `"azure_devops"`. +- `tickets.py:708` — `_format_ticket_id` returns `f"AB#{ticket_id}"` for the `azure_devops` platform. 
+- `src/gitflow_analytics/extractors/tickets_analysis.py:150` — broaden platform inference. + +**Tests:** Mixed-platform commit message test (`"Fix login PROJ-5 + AB#9"` extracts both with correct platform tags). + +**Acceptance:** `_correlate_by_ticket_references` in the orchestrator produces ADO correlations from commits with `AB#` refs. + +**Estimated effort:** 0.5 day. **Dependencies:** Phase 3. + +--- + +### Phase 7 — Setup wizard + reports + diagnostic CLI (1 day) + +**Goal:** UX surface parity. + +**Files touched:** +- `src/gitflow_analytics/cli_wizards/install_wizard_pm.py` — `_setup_azure_devops`, `_validate_azure_devops`, `_store_azure_devops_config`, `_discover_azure_devops_fields` (mirror `_setup_jira:22`). Validate PAT scopes by hitting `connectionData` and probing `_apis/wit/workitems`. +- `src/gitflow_analytics/cli_wizards/install_wizard.py:43-79,184-212,366-380` — generalize the wizard flow to register ADO alongside JIRA. +- `src/gitflow_analytics/cli_wizards/install_wizard_output.py:277` — emit `ticket_platforms.append("azure_devops")` when configured. +- `src/gitflow_analytics/cli_setup.py:212,250,459` — extend `discover-fields` and the example block. +- **New CLI command:** `gitflow-analytics test-pm-connection [platform]` (vendor-neutral; works for JIRA today + ADO new). Wire through `PMFrameworkOrchestrator.test_connection`. +- `src/gitflow_analytics/integrations/azure_devops_identity_sync.py` — **stub only** that logs "deferred to v1.1"; full implementation is a follow-up. Add `gitflow-analytics ado-identity-doctor` diagnostic that lists ADO assignees missing from `developer_identities`. + +**Acceptance:** Interactive `gitflow-analytics install` wizard offers Azure DevOps as a PM option, validates the PAT, and writes a working YAML. + +**Estimated effort:** 1 day. **Dependencies:** Phases 2, 5. 
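As a rough illustration of the Phase 6 mixed-platform test above, the sketch below applies the default `jira` and `azure_devops` patterns from `TicketDetectionConfig` to a commit message. The `PATTERNS` mapping and `extract_tickets` helper are hypothetical and are not the project's extractor API; only the two regex strings come from the plan.

```python
import re
from typing import Dict, List, Pattern, Tuple

# Regex strings taken from the plan's defaults; the mapping itself is illustrative.
PATTERNS: Dict[str, Pattern[str]] = {
    "jira": re.compile(r"([A-Z]{2,10}-\d+)"),
    "azure_devops": re.compile(r"AB#(\d+)"),
}


def extract_tickets(message: str, allowed_platforms: List[str]) -> List[Tuple[str, str]]:
    """Return (platform, ticket_id) pairs for each allowed platform's pattern."""
    found: List[Tuple[str, str]] = []
    for platform in allowed_platforms:
        pattern = PATTERNS.get(platform)
        if pattern is None:
            continue  # unknown platform: nothing to match
        for match in pattern.finditer(message):
            found.append((platform, match.group(1)))
    return found


mixed = extract_tickets("Fix login PROJ-5 + AB#9", ["jira", "azure_devops"])
unconfigured = extract_tickets("Fix login AB#1234", ["jira"])
```

Restricting the loop to `allowed_platforms` is what keeps an `AB#` reference from producing an ADO-tagged ticket when Azure DevOps is not configured.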
+ +--- + +### Phase 8 — Documentation (0.5 day) + +**Files touched:** +- `docs/configuration/azure-devops.md` (NEW) — full config reference, required PAT scopes (`vso.work`, optional `vso.identity`/`vso.graph`), env-var resolution, troubleshooting (401/403 PAT scope errors, on-prem rejection, WIQL windowing behavior, identity matching caveats). +- `docs/examples/azure-devops-configuration.md` (NEW) — paste-ready YAML. +- `docs/configuration/configuration.md` — add ADO to platform reference table. +- `README.md` — list ADO under PM integrations. +- `CHANGELOG.md` — `[Unreleased] ### Added` entry. + +**Acceptance:** `tests/docs/test_azure_devops_doc_example.py` (NEW) parses the doc YAML and validates against the schema. + +**Estimated effort:** 0.5 day. **Dependencies:** Phase 7. + +--- + +### Phase 9 — Test, validate, ship (1 day) + +**Goal:** release-ready. + +**Steps:** +1. Full `pytest -q` green (zero JIRA regressions; verified against `tests/integrations/test_jira_activity_integration.py` + `tests/test_pm_env_resolution.py` baseline). +2. Run `pytest -m integration` against a real ADO org with `GFA_AZURE_DEVOPS_INTEGRATION_TEST=1` env vars set. +3. Manual end-to-end: clone a repo with `AB#` commits → `gitflow-analytics analyze` → confirm correlations populate, no double-counting with co-existing JIRA refs. +4. Performance benchmark: 4-week / 500-issue pull completes in <30s cold, <5s warm. +5. `src/gitflow_analytics/_version.py` → `3.16.0` (3.15.0 was consumed by PR #56). +6. `CHANGELOG.md` — promote `[Unreleased]` to `[3.16.0] - `. +7. Commits: `feat: add Azure DevOps PM platform integration` + `chore: bump version to 3.16.0`. + +**Acceptance:** all gates in §6 below pass. + +**Estimated effort:** 1 day. **Dependencies:** all prior phases. + +--- + +## 5. 
Risk register (top 10) + +| # | Risk | L | I | Mitigation | Phase | +|---|------|---|---|------------|-------| +| 1 | Dual-stack (`pm_framework/` vs `integrations/`) — ~40 hardcoded `"jira"` strings in legacy enrichment paths silently drop ADO data | H | H | Phase 1 refactor slice. **This is the project's biggest hidden risk.** | 1 | +| 2 | WIQL row cap (1k or 20k depending on which you hit first) silently truncates large pulls | H | H | Defensive `[ChangedDate]` windowing with binary-split + gap-detection assertion | 3 | +| 3 | Custom process templates (CMMI, inherited) rename types and states; naive name-mapping misclassifies >50% | H | H | Map by `StateCategory`, not state name. Ship Agile + Scrum + Basic + CMMI tables. Config overrides for inherited custom processes. | 3 | +| 4 | Ticket regex `#(\d+)` collides with GitHub PR/issue numbers | H | H | Default `r"AB#(\d+)"`. Validator warns if both ADO + GitHub enabled with bare `#` | 1, 6 | +| 5 | Rate-limit / TSTU exhaustion under burst loads (ADO TSTU model differs from JIRA hourly cap) | M | H | Custom retry middleware reading `Retry-After` and `X-RateLimit-Delay` headers; configurable max-backoff (default 5 min); log throttle events with `X-RateLimit-Resource` | 2 | +| 6 | On-prem TFS / ADO Server URL parsing silently breaking against Services-only assumptions | M | H | v1 explicitly rejects `tfs/` collection URLs at config-load time | 1 | +| 7 | ADO `uniqueName` (UPN) ≠ git commit email → identity merge failures | H | M | Defer full sync to v1.1. 
v1 ships `ado-identity-doctor` diagnostic + `identity_aliases` config block | 7 | +| 8 | Deleted/recycled work items return 404 mid-run; cache poisoning if soft-deleted item is later restored | M | M | Negative-cache 404s with separate short TTL (1h, configurable) | 3 | +| 9 | Team-scoped iterations either double-count or miss data depending on enumeration strategy | M | M | Use project-level classification node tree; document team-scoped capacity data as out-of-scope for v1 | 4 | +| 10 | Story points stored as float (`3.5`) lose precision via `int(float(value))` | — | — | **Resolved** in v3.15.0 (Bob's PR #56). No remaining mitigation work for ADO. | — (closed) | + +--- + +## 6. Validation / acceptance gates + +Before merging: + +1. **Regression:** All existing tests pass; `tests/integrations/test_jira_activity_integration.py`, `tests/test_pm_env_resolution.py`, ticketing-activity-report tests unchanged. +2. **Unit coverage:** ≥85% for `azure_devops_adapter.py` + `azure_devops_converters.py`; ≥75% combined new-code coverage. +3. **Edge case suite:** All 25 cases in §7 covered by named tests (use `responses` library; fixtures scrubbed from real ADO responses). +4. **Integration:** `pytest -m integration` with `GFA_AZURE_DEVOPS_INTEGRATION_TEST=1` against a real org; `len(get_issues(project, since=...))` matches an independent WIQL count within 0–1 (allowing for in-flight modifications during the run). +5. **Performance:** 4-week / 500-issue pull ≤30s cold, ≤5s warm; cache hit-rate ≥80% on second run. +6. **Cross-platform correlation:** A repo with both `PROJ-123` (JIRA) and `AB#456` (ADO) refs produces two distinct correlations, neither shadowing the other. +7. **Dual-stack consistency:** After Phase 1 refactor, run analyze pipeline twice (legacy enrichment path + `pm_framework` path) and assert ticket counts match per platform. +8. 
**Diagnostic:** `gitflow-analytics test-pm-connection azure_devops` exits 0 on valid config with PAT scopes printed; non-zero with remediation message on failure. +9. **Migration safety:** Running v3.15.0 against an existing v3.14.x cache directory does not corrupt JIRA data; `core/schema_version.py` migration coexists. + +--- + +## 7. Edge cases the test plan must cover + +1. Empty project (0 work items) → `get_issues` returns `[]`. +2. Project with exactly 1000 work items → no truncation; 1001 → still no truncation (windowing kicks in). +3. Project with 50,000 work items + `since` → completes in <5 min, ≤200 batch GETs. +4. Work item with `null System.AssignedTo` → `assignee=None`. +5. `System.AssignedTo` as legacy string `"Alice "` vs modern dict → both normalize to `UnifiedUser`. +6. Custom `System.State="Awaiting Review"` + `StateCategory="InProgress"` → maps to `IN_PROGRESS`. +7. `StateCategory` null (legacy CMMI) → name-fallback, then `UNKNOWN`, never crashes. +8. Story points `3.5` (float) → preserved as `3.5` (requires `Optional[float]` model change). +9. Scrum `Effort` field but no `StoryPoints` → fallback resolves correctly. +10. CMMI `Size` field instead → still resolves. +11. 429 with `Retry-After: 5` → adapter sleeps 5s, retries, total elapsed matches. +12. 429 with `Retry-After: 120` (over default cap) → respects header up to configured max. +13. 401 mid-pagination (PAT expired) → fast-fail with explicit "PAT expired or revoked" message. +14. 203 with HTML body (login redirect) → raises auth error; HTML never passed to JSON parser. +15. Network timeout at page 5/10 → partial results discarded, exception with context. +16. WIQL with apostrophe in project name → properly escaped. +17. Circular parent-child link → `linked_issues` extraction terminates. +18. Iteration with same name across two teams → de-dup by full classification path. +19. Sprint with null start/end (planning-only) → `is_active=False, is_completed=False`. +20. 
Commit with `AB#1234` and `#5678` → ADO match `1234`, GitHub match `5678`, no double-counting. +21. Commit with `AB#1234` only, ADO not configured → no spurious correlation. +22. Mixed-platform commit `PROJ-1 + AB#1` → both correlations recorded with correct `platform`. +23. Cache hit of older schema → `_dict_to_unified_issue` upgrades or invalidates, never crashes. +24. ADO org with 25 projects, only 2 enabled → only those 2 fetched. +25. PAT scoped only to `Work Items (read)` → `test_connection` succeeds; sprint fetch fails with "PAT lacks Project & Team (read) scope" message rather than generic 403. + +--- + +## 8. Out of scope for v1 (explicit non-goals) + +- Write operations (create/update work items, comments, links). +- Azure DevOps Server / TFS on-prem (rejected at config load). +- Active Directory / NTLM / Kerberos auth. +- OAuth 2.0 / Azure AD app authentication. +- TFVC version control (Git only). +- Test plans, test runs, test results, test cases as first-class entities. +- Full ADO identity sync into `IdentityCache` (deferred to v1.1; diagnostic-only in v1). +- Team-scoped iteration capacity data. +- Multi-org configurations (single org per config in v1). +- Boards-Boards integration / cross-org work-item links. + +--- + +## 9. Suggested next action + +1. **User review of §2 open decisions.** Twelve decisions; each is a one-word answer in most cases. Without these, Phase 1 cannot start cleanly. +2. **Spawn a `coder` agent for Phase 1** (refactor + config + stub) as a self-contained PR. This phase is purely additive in user-facing behavior and gives us the safety net before any ADO-specific code. +3. After Phase 1 lands, parallelize Phases 2 + 3 + 4 (data path) on one engineer and Phases 6 + 7 (extractor + wizard) on another. + +--- + +## 10. 
Reference file index + +Files the implementing engineer will edit, grouped by category: + +**PM framework (new):** +- `src/gitflow_analytics/pm_framework/adapters/azure_devops_adapter.py` +- `src/gitflow_analytics/pm_framework/adapters/azure_devops_cache.py` +- `src/gitflow_analytics/pm_framework/adapters/azure_devops_converters.py` +- `src/gitflow_analytics/pm_framework/adapters/__init__.py` +- `src/gitflow_analytics/pm_framework/orchestrator.py:115` + +**Reference (read, do not modify):** +- `src/gitflow_analytics/pm_framework/base.py` +- `src/gitflow_analytics/pm_framework/models.py` (modify only for `Optional[float]` story_points decision) +- `src/gitflow_analytics/pm_framework/registry.py` +- `src/gitflow_analytics/pm_framework/adapters/jira_adapter.py` +- `src/gitflow_analytics/pm_framework/adapters/jira_adapter_converters.py` +- `src/gitflow_analytics/pm_framework/adapters/jira_cache.py` + +**Config:** +- `src/gitflow_analytics/config/schema.py:386,470,717,778` +- `src/gitflow_analytics/config/loader.py:295` +- `src/gitflow_analytics/config/loader_sections.py:531` + +**Extractor / correlation:** +- `src/gitflow_analytics/extractors/tickets.py:176,212,533,708` +- `src/gitflow_analytics/extractors/tickets_analysis.py:150` + +**Legacy stack refactor (Phase 1):** +- `src/gitflow_analytics/core/data_fetcher.py:17,126,287` +- `src/gitflow_analytics/core/data_fetcher_parallel.py:14,278,514` +- `src/gitflow_analytics/core/data_fetcher_processing.py:14,170,183,328,343` +- `src/gitflow_analytics/core/analyze_pipeline.py:284,293-296,360` +- `src/gitflow_analytics/pipeline_collect.py:66,73,78,143` +- `src/gitflow_analytics/cli_fetch.py:170,213-214,359` +- `src/gitflow_analytics/core/cache.py:172` +- `src/gitflow_analytics/core/schema_version.py:94` +- `src/gitflow_analytics/integrations/orchestrator.py:241,340` +- `src/gitflow_analytics/reports/ticketing_activity_report.py:261,357` +- `src/gitflow_analytics/metrics/activity_scoring.py:207` + +**CLI / wizard:** +- 
`src/gitflow_analytics/cli_wizards/install_wizard.py:43-79,184-212,366-380` +- `src/gitflow_analytics/cli_wizards/install_wizard_pm.py:22` +- `src/gitflow_analytics/cli_wizards/install_wizard_output.py:277` +- `src/gitflow_analytics/cli_setup.py:212,250,459` + +**Identity (stub for v1):** +- `src/gitflow_analytics/integrations/azure_devops_identity_sync.py` (NEW, stub) + +**Release plumbing:** +- `pyproject.toml` +- `.env.example` +- `src/gitflow_analytics/_version.py` → `3.16.0` +- `CHANGELOG.md` + +**Docs:** +- `docs/configuration/azure-devops.md` (NEW) +- `docs/examples/azure-devops-configuration.md` (NEW) +- `docs/configuration/configuration.md` +- `README.md` +- `docs/design/azure-devops-decisions.md` (NEW — Phase 0 ADR) diff --git a/src/gitflow_analytics/cli_fetch.py b/src/gitflow_analytics/cli_fetch.py index ba04f5d..e3a135c 100644 --- a/src/gitflow_analytics/cli_fetch.py +++ b/src/gitflow_analytics/cli_fetch.py @@ -209,9 +209,11 @@ def fetch( # Initialize integrations for ticket fetching orchestrator = IntegrationOrchestrator(cfg, cache) - # Narrow the integration union type to JIRAIntegration | None for type safety + # Generic PM integration handle. Today only JIRA is wired into the + # legacy enrichment path; new platforms (Azure DevOps, …) plug in here + # via the same orchestrator.integrations map.
_raw_jira = orchestrator.integrations.get("jira") - jira_integration = _raw_jira if isinstance(_raw_jira, JIRAIntegration) else None + pm_integration = _raw_jira if isinstance(_raw_jira, JIRAIntegration) else None # Discovery organization repositories if needed repositories_to_fetch = cfg.repositories @@ -356,7 +358,7 @@ def progress_callback(message: str): project_key=project_key, weeks_back=weeks, branch_patterns=branch_patterns, - jira_integration=jira_integration, + pm_integration=pm_integration, progress_callback=progress_callback, start_date=start_date, end_date=end_date, diff --git a/src/gitflow_analytics/config/loader.py b/src/gitflow_analytics/config/loader.py index 6376e2f..6a9b73d 100644 --- a/src/gitflow_analytics/config/loader.py +++ b/src/gitflow_analytics/config/loader.py @@ -304,8 +304,41 @@ def load(cls, config_path: Union[Path, str]) -> Config: qualitative_data = data["analysis"].get("qualitative", {}) qualitative_config = cls._process_qualitative_config(qualitative_data) - pm_config = cls._process_pm_config(data.get("pm", {})) - pm_integration_config = cls._process_pm_integration_config(data.get("pm_integration", {})) + pm_config = cls._process_pm_config(data.get("pm", {}), config_path) + + # Per plan §3.3, ADO configuration lives only at ``pm.azure_devops``. + # We do NOT accept a top-level ``azure_devops:`` block — that would + # repeat the dual-stack mistake of ``jira:`` + ``jira_integration:``. + # The auto-synthesis below mirrors ``pm.azure_devops`` onto the + # ``pm_integration.platforms.azure_devops`` block so the orchestrator + # picks it up without users needing to write the synthesis manually. + ado_block = getattr(pm_config, "azure_devops", None) if pm_config is not None else None + + # Auto-synthesize a matching pm_integration.platforms.azure_devops + # entry whenever an Azure DevOps block was provided (mirrors the + # JIRA auto-synthesis pattern). 
Uses a copy to avoid mutating user + # data when pm_integration was supplied explicitly. + pm_integration_data = dict(data.get("pm_integration", {}) or {}) + if ado_block is not None: + platforms_data = dict(pm_integration_data.get("platforms", {}) or {}) + if "azure_devops" not in platforms_data: + platforms_data["azure_devops"] = { + "enabled": ado_block.enabled, + "platform_type": "azure_devops", + "config": { + "organization_url": ado_block.organization_url, + "personal_access_token": ado_block.personal_access_token, + "project": ado_block.project, + "api_version": ado_block.api_version, + "story_point_fields": ado_block.story_point_fields, + }, + } + pm_integration_data["platforms"] = platforms_data + # Activate pm_integration for the run when the user gave us an + # Azure DevOps block but no explicit pm_integration section. + pm_integration_data.setdefault("enabled", True) + + pm_integration_config = cls._process_pm_integration_config(pm_integration_data) # Velocity report config (top-level key "velocity") velocity_data = data.get("velocity", {}) diff --git a/src/gitflow_analytics/config/loader_sections.py b/src/gitflow_analytics/config/loader_sections.py index ca89cdf..60848a3 100644 --- a/src/gitflow_analytics/config/loader_sections.py +++ b/src/gitflow_analytics/config/loader_sections.py @@ -13,10 +13,11 @@ import click -from .errors import EnvironmentVariableError +from .errors import ConfigurationError, EnvironmentVariableError from .schema import ( ActivityScoringConfig, AIDetectionConfig, + AzureDevOpsConfig, BoilerplateFilterConfig, CacheConfig, Config, @@ -36,6 +37,17 @@ ) from .validator import ConfigValidator +# ADR Decision 2: Azure DevOps Server (on-premises) is rejected at config-load +# time in v1. The validator below performs the URL allowlist check; the +# message is reproduced verbatim from docs/design/azure-devops-decisions.md. 
+_AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE = ( + "Azure DevOps Server (on-premises) is not supported in v1.\n" + "Only Azure DevOps Services (cloud) is supported. Configure\n" + "either https://dev.azure.com/{org} or https://{org}.visualstudio.com.\n" + "On-premises support is roadmapped for v1.2; see\n" + "docs/design/azure-devops-integration-plan.md §8." +) + # Regex for embedded env-var references in config strings. # Matches both ``${VAR}`` and ``$VAR`` (latter only for ASCII identifiers). # WHY: Some users write ``api_token: "${CONFLUENCE_API_TOKEN}"`` but others may @@ -513,14 +525,17 @@ def _parse_member(m: Any) -> TeamMemberConfig: return TeamsConfig(teams=teams, enabled=enabled) @classmethod - def _process_pm_config(cls, pm_data: dict[str, Any]) -> Optional[Any]: + def _process_pm_config( + cls, pm_data: dict[str, Any], config_path: Optional[Path] = None + ) -> Optional[Any]: """Process PM configuration section. Args: - pm_data: PM configuration data + pm_data: PM configuration data. + config_path: Path to the config file, used in error messages. Returns: - PM configuration object or None + PM configuration object or None. """ if not pm_data: return None @@ -546,8 +561,158 @@ def _process_pm_config(cls, pm_data: dict[str, Any]) -> Optional[Any]: )() pm_config.jira = jira_sub_config # type: ignore[misc] + # Parse Azure DevOps section within PM (Phase 1 — config + validation only) + if "azure_devops" in pm_data: + ado_sub_config = cls._process_azure_devops_pm_config( + pm_data["azure_devops"], config_path + ) + pm_config.azure_devops = ado_sub_config # type: ignore[misc] + return pm_config + @classmethod + def _process_azure_devops_pm_config( + cls, + ado_data: dict[str, Any], + config_path: Optional[Path] = None, + ) -> AzureDevOpsConfig: + """Process the ``pm.azure_devops`` configuration block. 
+ + Resolves environment variables, validates the organization URL + against the cloud-host allowlist (ADR decision 2), and constructs + the :class:`AzureDevOpsConfig` dataclass. + + Args: + ado_data: Raw mapping from the ``pm.azure_devops`` YAML block. + config_path: Path to the config file, used to enrich loader + errors with a file location. + + Returns: + Populated :class:`AzureDevOpsConfig` instance. + + Raises: + EnvironmentVariableError: If ``personal_access_token`` references + an unset environment variable (typically + ``${AZURE_DEVOPS_PAT}``). + ConfigurationError: If ``organization_url`` matches the + on-premises pattern or ``is_on_premise`` is ``True``. + """ + # Resolve credentials. The PAT must come from an environment variable + # in normal operation; raise a friendly error if it is missing. + raw_pat = ado_data.get("personal_access_token", "") + resolved_pat = cls._resolve_env_var(raw_pat) if raw_pat else "" + if raw_pat and not resolved_pat: + # Identify the env var name from the original string when present. + # NOTE: Do NOT collapse the if/else into a ternary — the previous + # form ``env_match.group(1) or env_match.group(2) if env_match else ...`` + # had an operator-precedence bug (parsed as ``group(1) or (group(2) + # if env_match else default)`` which raises ``AttributeError`` when + # ``env_match is None``). Keep the explicit if/else. + env_match = _ENV_VAR_PATTERN.search(raw_pat) + if env_match: # noqa: SIM108 — see comment above + env_var = env_match.group(1) or env_match.group(2) + else: + env_var = "AZURE_DEVOPS_PAT" + raise EnvironmentVariableError(env_var, "AzureDevOps", config_path) + # Reject empty / whitespace-only PATs (silent-empty trap). An + # ``AZURE_DEVOPS_PAT=""`` env var resolves to an empty string but + # ``raw_pat`` is truthy (the literal ``"${AZURE_DEVOPS_PAT}"``), so + # the env-var error above would not fire. Catch it explicitly here. 
+ if not resolved_pat or not resolved_pat.strip(): + raise EnvironmentVariableError("AZURE_DEVOPS_PAT", "AzureDevOps", config_path) + + organization_url = cls._resolve_env_var(ado_data.get("organization_url", "")) or "" + is_on_premise = bool(ado_data.get("is_on_premise", False)) + + cls._validate_azure_devops_url(organization_url, is_on_premise, config_path) + + return AzureDevOpsConfig( + enabled=bool(ado_data.get("enabled", True)), + organization_url=organization_url, + personal_access_token=resolved_pat or "", + project=cls._resolve_env_var(ado_data.get("project")), + work_item_types=ado_data.get("work_item_types"), + area_paths=list(ado_data.get("area_paths", []) or []), + story_point_fields=list( + ado_data.get( + "story_point_fields", + [ + "Microsoft.VSTS.Scheduling.StoryPoints", + "Microsoft.VSTS.Scheduling.Effort", + "Microsoft.VSTS.Scheduling.Size", + ], + ) + ), + custom_fields=dict(ado_data.get("custom_fields", {}) or {}), + api_version=str(ado_data.get("api_version", "7.1")), + batch_size=int(ado_data.get("batch_size", 200)), + rate_limit_delay=float(ado_data.get("rate_limit_delay", 0.2)), + verify_ssl=bool(ado_data.get("verify_ssl", True)), + cache_ttl_hours=int(ado_data.get("cache_ttl_hours", 168)), + dns_timeout=int(ado_data.get("dns_timeout", 10)), + connection_timeout=int(ado_data.get("connection_timeout", 30)), + max_retries=int(ado_data.get("max_retries", 3)), + backoff_factor=float(ado_data.get("backoff_factor", 1.0)), + state_category_overrides=dict(ado_data.get("state_category_overrides", {}) or {}), + work_item_type_overrides=dict(ado_data.get("work_item_type_overrides", {}) or {}), + is_on_premise=is_on_premise, + ) + + @staticmethod + def _validate_azure_devops_url( + organization_url: str, + is_on_premise: bool, + config_path: Optional[Path] = None, + ) -> None: + """Validate an Azure DevOps organization URL against the cloud allowlist. + + Implements ADR decision 2: v1 supports only Azure DevOps Services + (cloud). 
On-premises (TFS / ADO Server) URLs and the + ``is_on_premise=True`` opt-in marker are both rejected with a + verbatim error message. + + Args: + organization_url: Resolved URL string from the config. + is_on_premise: The ``is_on_premise`` flag from the config. + config_path: Optional config-file path included in the raised + :class:`ConfigurationError`. + + Raises: + ConfigurationError: If the URL is missing, fails the cloud-host + allowlist, or ``is_on_premise`` is ``True``. + """ + if is_on_premise: + raise ConfigurationError(_AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE, config_path) + + if not organization_url: + raise ConfigurationError( + "Azure DevOps configuration requires 'organization_url'.", + config_path, + suggestion=( + "Set organization_url to your cloud Azure DevOps URL,\n" + "for example https://dev.azure.com/myorg." + ), + ) + + # Quick on-prem fingerprints: TFS collection URLs and the explicit + # ``DefaultCollection`` segment are unambiguous markers. + lowered = organization_url.lower() + if "/tfs/" in lowered or "/defaultcollection" in lowered: + raise ConfigurationError(_AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE, config_path) + + # Parse the URL and check the host against the cloud allowlist. 
+ from urllib.parse import urlparse + + parsed = urlparse(organization_url) + host = (parsed.hostname or "").lower() + if not host: + raise ConfigurationError(_AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE, config_path) + + is_dev_azure = host == "dev.azure.com" + is_visualstudio = host.endswith(".visualstudio.com") + if not (is_dev_azure or is_visualstudio): + raise ConfigurationError(_AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE, config_path) + @classmethod def _process_pm_integration_config( cls, pm_integration_data: dict[str, Any] @@ -688,9 +853,7 @@ def _resolve_config_dict(cls, config_dict: dict[str, Any]) -> dict[str, Any]: ( cls._resolve_env_var(item) if isinstance(item, str) - else cls._resolve_config_dict(item) - if isinstance(item, dict) - else item + else cls._resolve_config_dict(item) if isinstance(item, dict) else item ) for item in value ] diff --git a/src/gitflow_analytics/config/schema.py b/src/gitflow_analytics/config/schema.py index edef4de..b0af934 100644 --- a/src/gitflow_analytics/config/schema.py +++ b/src/gitflow_analytics/config/schema.py @@ -387,6 +387,10 @@ class TicketDetectionConfig: default_factory=lambda: { "jira": r"([A-Z]{2,10}-\d+)", "github": r"(?:closes|fixes|resolves)\s+#(\d+)", + # Azure DevOps "AB#" prefix is the only cross-platform-safe form. + # Bare ``#1234`` collides with GitHub PR/issue refs and is opt-in + # per ADO config (``ticket_regex_override``). See ADR decision 3. + "azure_devops": r"AB#(\d+)", } ) @@ -481,6 +485,89 @@ class JIRAConfig: base_url: Optional[str] = None +@dataclass +class AzureDevOpsConfig: + """Azure DevOps PM platform configuration. + + Phase 1 schema for the Azure DevOps integration. Real network behaviour + arrives in Phases 2–5; this dataclass exists so the loader can validate + YAML, the orchestrator can register the adapter, and downstream code + can branch on ``Config.pm.azure_devops`` being non-None. 
Per plan §3.3, + the only canonical home for ADO config is ``pm.azure_devops`` — there + is no top-level ``azure_devops:`` block. + + Attributes: + enabled: Whether the adapter is active for this run. + organization_url: Cloud-only Azure DevOps organization URL. Must + match ``https://dev.azure.com/{org}`` or + ``https://{org}.visualstudio.com``. On-premises (TFS / ADO + Server) URLs are rejected at config-load time per ADR + decision 2. + personal_access_token: PAT resolved from ``${AZURE_DEVOPS_PAT}``. + project: Optional default project name; multi-project support + arrives in a later phase. + work_item_types: Optional allowlist of work-item type names to + collect. ``None`` means the canonical six (User Story, Bug, + Task, Feature, Epic, Issue). + area_paths: Iteration / area-path filters joined with OR. + story_point_fields: Reference names tried in order when extracting + story points (Agile / Scrum / CMMI). + custom_fields: Map of friendly-name to ADO reference-name for + user-defined custom fields. + api_version: ADO REST API version to target. + batch_size: Maximum work-items per batch fetch (ADO hard cap 200). + rate_limit_delay: Seconds to sleep between API calls to stay under + ADO's TSTU rate budget. + verify_ssl: Whether to verify TLS certificates. + cache_ttl_hours: TTL for the ADO ticket cache (Phase 3). + dns_timeout: DNS resolution timeout in seconds. + connection_timeout: HTTP connection timeout in seconds. + max_retries: Retry count for transient HTTP failures. + backoff_factor: ``urllib3.Retry`` backoff factor. + state_category_overrides: ADO state name → ``StateCategory`` map + for inherited custom processes. + work_item_type_overrides: Custom work-item-type → unified + ``IssueType`` map for inherited templates. + is_on_premise: Reserved forward-compatibility marker. v1 rejects + ``True`` at config load time per ADR decision 2; v1.2 will + flip the rejection to a feature-gated implementation. 
+ + Note: + A ``ticket_regex_override`` field intentionally does not exist in + Phase 1. The default ``r"AB#(\\d+)"`` pattern is registered in + :class:`TicketDetectionConfig` defaults. Per-config override will + be added in Phase 6 when the override is actually wired into the + ticket extractor pipeline. + """ + + enabled: bool = True + organization_url: str = "" + personal_access_token: str = "" + project: Optional[str] = None + work_item_types: Optional[list[str]] = None + area_paths: list[str] = field(default_factory=list) + story_point_fields: list[str] = field( + default_factory=lambda: [ + "Microsoft.VSTS.Scheduling.StoryPoints", + "Microsoft.VSTS.Scheduling.Effort", + "Microsoft.VSTS.Scheduling.Size", + ] + ) + custom_fields: dict[str, str] = field(default_factory=dict) + api_version: str = "7.1" + batch_size: int = 200 + rate_limit_delay: float = 0.2 + verify_ssl: bool = True + cache_ttl_hours: int = 168 + dns_timeout: int = 10 + connection_timeout: int = 30 + max_retries: int = 3 + backoff_factor: float = 1.0 + state_category_overrides: dict[str, str] = field(default_factory=dict) + work_item_type_overrides: dict[str, str] = field(default_factory=dict) + is_on_premise: bool = False + + @dataclass class JIRAIntegrationConfig: """JIRA integration specific configuration.""" @@ -704,6 +791,10 @@ class Config: cache: CacheConfig jira: Optional[JIRAConfig] = None jira_integration: Optional[JIRAIntegrationConfig] = None + # Note: there is no top-level ``Config.azure_devops`` field. Per plan + # §3.3, ADO configuration lives only at ``Config.pm.azure_devops``. + # JIRA's dual-stack (``Config.jira`` + ``Config.pm.jira``) is the + # mistake we are not repeating. 
pm: Optional[Any] = None # Modern PM framework config pm_integration: Optional[PMIntegrationConfig] = None qualitative: Optional["QualitativeConfig"] = None @@ -799,6 +890,8 @@ def get_effective_ticket_platforms(self) -> list[str]: platforms.append("linear") if hasattr(self.pm, "clickup") and self.pm.clickup: platforms.append("clickup") + if hasattr(self.pm, "azure_devops") and self.pm.azure_devops: + platforms.append("azure_devops") # Check legacy JIRA config if (self.jira or self.jira_integration) and "jira" not in platforms: @@ -808,7 +901,12 @@ def get_effective_ticket_platforms(self) -> list[str]: if self.github.token: platforms.append("github") - # If nothing configured, fall back to common platforms + # If nothing configured, fall back to common platforms. + # Note: ``azure_devops`` is intentionally NOT in this fallback list. + # Adding it would silently turn ``AB#NNN`` references in commit + # messages into ADO-tagged tickets for users who never configured + # ADO. The fallback only includes platforms whose regexes are + # safe to apply universally. 
if not platforms: platforms = ["jira", "github", "clickup", "linear"] diff --git a/src/gitflow_analytics/core/analyze_pipeline.py b/src/gitflow_analytics/core/analyze_pipeline.py index 182242f..a1f9c87 100644 --- a/src/gitflow_analytics/core/analyze_pipeline.py +++ b/src/gitflow_analytics/core/analyze_pipeline.py @@ -18,7 +18,7 @@ from dataclasses import dataclass from datetime import datetime, timedelta, timezone from pathlib import Path -from typing import Any, Callable +from typing import Any, Callable, Optional logger = logging.getLogger(__name__) @@ -280,20 +280,20 @@ def fetch_repositories_batch( data_fetcher = GitDataFetcher( cache=cache, branch_mapping_rules=getattr(cfg.analysis, "branch_mapping_rules", {}), - allowed_ticket_platforms=getattr( - cfg.analysis, "ticket_platforms", ["jira", "github", "clickup", "linear"] - ), + allowed_ticket_platforms=cfg.get_effective_ticket_platforms(), exclude_paths=getattr(cfg.analysis, "exclude_paths", None), exclude_merge_commits=cfg.analysis.exclude_merge_commits, ticket_detection_config=getattr(cfg.analysis, "ticket_detection", None), ) orchestrator = IntegrationOrchestrator(cfg, cache) - # Narrow union dict value to JIRAIntegration | None for downstream callers. + # Resolve a PM integration object for ticket enrichment. Today the legacy + # path only wires up JIRA; new platforms (Azure DevOps, …) plug in here + # via the same orchestrator.integrations map without further changes. 
from ..integrations.jira_integration import JIRAIntegration _jira_candidate = orchestrator.integrations.get("jira") - jira_integration: JIRAIntegration | None = ( + pm_integration: Optional[Any] = ( _jira_candidate if isinstance(_jira_candidate, JIRAIntegration) else None ) @@ -357,7 +357,7 @@ def fetch_repositories_batch( project_key=project_key, weeks_back=weeks, branch_patterns=branch_patterns, - jira_integration=jira_integration, + pm_integration=pm_integration, progress_callback=progress_callback, start_date=start_date, end_date=end_date, diff --git a/src/gitflow_analytics/core/cache.py b/src/gitflow_analytics/core/cache.py index 7a61f43..9d0b7e8 100644 --- a/src/gitflow_analytics/core/cache.py +++ b/src/gitflow_analytics/core/cache.py @@ -168,11 +168,25 @@ def get_cache_stats(self) -> dict[str, Any]: total_prs = session.query(PullRequestCache).count() total_issues = session.query(IssueCache).count() - # Platform-specific issue counts - jira_issues = session.query(IssueCache).filter(IssueCache.platform == "jira").count() - github_issues = ( - session.query(IssueCache).filter(IssueCache.platform == "github").count() + # Platform-specific issue counts. + # The legacy ``cached_jira_issues`` / ``cached_github_issues`` keys are + # preserved verbatim for downstream consumers; the new + # ``cached_issues_by_platform`` map gives a generic per-platform + # breakdown so additional platforms (e.g. ``azure_devops``) appear + # automatically once they start writing rows to ``IssueCache``. 
+ from sqlalchemy import func as _sql_func + + issues_by_platform_rows = ( + session.query(IssueCache.platform, _sql_func.count(IssueCache.platform)) + .group_by(IssueCache.platform) + .all() ) + issues_by_platform: dict[str, int] = { + (platform_name or "unknown"): int(count) + for platform_name, count in issues_by_platform_rows + } + jira_issues = issues_by_platform.get("jira", 0) + github_issues = issues_by_platform.get("github", 0) # Stale entries # Bug 1 fix: use timezone-aware UTC datetime instead of naive utcnow() @@ -236,6 +250,8 @@ def get_cache_stats(self) -> dict[str, Any]: "cached_issues": total_issues, "cached_jira_issues": jira_issues, "cached_github_issues": github_issues, + # Generic per-platform breakdown (additive). + "cached_issues_by_platform": issues_by_platform, # Freshness analysis "stale_commits": stale_commits, "stale_prs": stale_prs, diff --git a/src/gitflow_analytics/core/data_fetcher.py b/src/gitflow_analytics/core/data_fetcher.py index ae39eee..73e69aa 100644 --- a/src/gitflow_analytics/core/data_fetcher.py +++ b/src/gitflow_analytics/core/data_fetcher.py @@ -7,6 +7,7 @@ import logging import threading +import warnings from datetime import datetime, timedelta, timezone from pathlib import Path from typing import TYPE_CHECKING, Any, Optional @@ -14,7 +15,7 @@ from sqlalchemy import func from ..constants import BatchSizes, Timeouts -from ..integrations.jira_integration import JIRAIntegration +from ..integrations.jira_integration import JIRAIntegration # noqa: F401 (legacy re-export) from ..models.database import ( DailyCommitBatch, ) @@ -123,11 +124,12 @@ def fetch_repository_data( project_key: str, weeks_back: int = 4, branch_patterns: Optional[list[str]] = None, - jira_integration: Optional[JIRAIntegration] = None, + jira_integration: Optional[Any] = None, progress_callback: Optional[callable] = None, start_date: Optional[datetime] = None, end_date: Optional[datetime] = None, force: bool = False, + pm_integration: Optional[Any] = None, 
) -> dict[str, Any]: """Fetch all data for a repository and organize by day. @@ -143,19 +145,42 @@ def fetch_repository_data( - When force=True every week is re-fetched regardless of cache status. Args: - repo_path: Path to the Git repository - project_key: Project identifier - weeks_back: Number of weeks to analyze (used only if start_date/end_date not provided) - branch_patterns: Branch patterns to include - jira_integration: JIRA integration for ticket data - progress_callback: Optional callback for progress updates - start_date: Optional explicit start date (overrides weeks_back calculation) - end_date: Optional explicit end date (overrides weeks_back calculation) + repo_path: Path to the Git repository. + project_key: Project identifier. + weeks_back: Number of weeks to analyze (used only if + ``start_date``/``end_date`` not provided). + branch_patterns: Branch patterns to include. + jira_integration: Deprecated alias for ``pm_integration``. + Accepts any object exposing ``get_issue`` and an optional + ``platform_name`` attribute. Kept for backwards compatibility + with external callers; new callers should pass + ``pm_integration`` instead. + progress_callback: Optional callback for progress updates. + start_date: Optional explicit start date (overrides + ``weeks_back`` calculation). + end_date: Optional explicit end date (overrides ``weeks_back`` + calculation). force: If True, re-fetch all weeks even if already cached. + pm_integration: PM integration object used to enrich tickets. + Generalized replacement for ``jira_integration``: any + integration that exposes ``get_issue`` (and optionally + ``platform_name``) is acceptable. When both are provided + ``pm_integration`` wins. Returns: - Dictionary containing fetch results and statistics + Dictionary containing fetch results and statistics. """ + # Backwards-compat shim: prefer the modern ``pm_integration`` kwarg, + # fall back to the deprecated ``jira_integration`` alias. 
+ if pm_integration is None and jira_integration is not None: + warnings.warn( + "The 'jira_integration' keyword is deprecated; pass " + "'pm_integration' instead. The alias will be removed in a " + "future release.", + DeprecationWarning, + stacklevel=2, + ) + pm_integration = jira_integration logger.debug("🔍 DEBUG: ===== FETCH METHOD CALLED =====") logger.info(f"Starting data fetch for project {project_key} at {repo_path}") logger.debug(f"🔍 DEBUG: weeks_back={weeks_back}, repo_path={repo_path}") @@ -284,17 +309,32 @@ def fetch_repository_data( ticket_ids = self._extract_all_ticket_references(daily_commits) logger.debug(f"🔍 DEBUG: Extracted {len(ticket_ids)} ticket IDs") - if jira_integration and ticket_ids: + # Resolve the platform tag from the integration so non-JIRA + # adapters (e.g. Azure DevOps) record the correct platform on + # detailed-ticket and correlation rows. + # TODO(phase-3): drop the ``"jira"`` fallback once every integration + # exposes ``platform_name``; right now JIRAIntegration may not set + # it on older instances, so we fall back to ``"jira"`` instead of + # leaving the platform unset. + integration_platform = getattr(pm_integration, "platform_name", None) or "jira" + + if pm_integration and ticket_ids: logger.info( - f"Fetching {len(ticket_ids)} unique tickets from JIRA for {project_key}..." + f"Fetching {len(ticket_ids)} unique tickets from " + f"{integration_platform} for {project_key}..." 
) self._fetch_detailed_tickets( - ticket_ids, jira_integration, project_key, progress_callback + ticket_ids, + pm_integration, + project_key, + progress_callback, + platform=integration_platform, ) # Build commit-ticket correlations logger.info(f"Building commit-ticket correlations for {project_key}...") - correlations_created = self._build_commit_ticket_correlations(daily_commits, repo_path) + correlations_created = self._build_commit_ticket_correlations( + daily_commits, repo_path, platform=integration_platform + ) progress.update(repo_progress_ctx) # Step 3: Store daily commit batches diff --git a/src/gitflow_analytics/core/data_fetcher_parallel.py b/src/gitflow_analytics/core/data_fetcher_parallel.py index 2c7d1aa..e96b77e 100644 --- a/src/gitflow_analytics/core/data_fetcher_parallel.py +++ b/src/gitflow_analytics/core/data_fetcher_parallel.py @@ -3,6 +3,7 @@ import logging import threading import time +import warnings from concurrent.futures import ThreadPoolExecutor, as_completed from datetime import datetime from pathlib import Path @@ -11,7 +12,7 @@ import git from ..constants import Timeouts -from ..integrations.jira_integration import JIRAIntegration +from ..integrations.jira_integration import JIRAIntegration # noqa: F401 (legacy re-export) from ..types import CommitStats from ..utils.commit_utils import is_merge_commit from .git_timeout_wrapper import GitOperationTimeout, HeartbeatLogger @@ -275,24 +276,40 @@ def process_repositories_parallel( self, repositories: list[dict], weeks_back: int = 4, - jira_integration: Optional[JIRAIntegration] = None, + jira_integration: Optional[Any] = None, start_date: Optional[datetime] = None, end_date: Optional[datetime] = None, max_workers: int = 3, + pm_integration: Optional[Any] = None, ) -> dict[str, Any]: """Process multiple repositories in parallel with proper timeout protection. 
Args: - repositories: List of repository configurations - weeks_back: Number of weeks to analyze - jira_integration: Optional JIRA integration for ticket data - start_date: Optional explicit start date - end_date: Optional explicit end date - max_workers: Maximum number of parallel workers + repositories: List of repository configurations. + weeks_back: Number of weeks to analyze. + jira_integration: Deprecated alias for ``pm_integration``. Kept + for backwards compatibility with external callers. + start_date: Optional explicit start date. + end_date: Optional explicit end date. + max_workers: Maximum number of parallel workers. + pm_integration: PM integration (any object exposing + ``get_issue`` and an optional ``platform_name`` attribute) + used to enrich tickets across repositories. Generalized + replacement for ``jira_integration``. Returns: - Dictionary containing processing results and statistics + Dictionary containing processing results and statistics. """ + # Backwards-compat shim: accept the legacy ``jira_integration`` kwarg. + if pm_integration is None and jira_integration is not None: + warnings.warn( + "The 'jira_integration' keyword is deprecated; pass " + "'pm_integration' instead. 
The alias will be removed in a " + "future release.", + DeprecationWarning, + stacklevel=2, + ) + pm_integration = jira_integration logger.info( f"🚀 Starting parallel processing of {len(repositories)} repositories with {max_workers} workers" ) @@ -331,7 +348,7 @@ def process_repositories_parallel( project_key, weeks_back, branch_patterns, - jira_integration, + pm_integration, start_date, end_date, ) @@ -511,7 +528,7 @@ def _process_repository_with_timeout( project_key: str, weeks_back: int = 4, branch_patterns: Optional[list[str]] = None, - jira_integration: Optional[JIRAIntegration] = None, + pm_integration: Optional[Any] = None, start_date: Optional[datetime] = None, end_date: Optional[datetime] = None, timeout_per_operation: int = Timeouts.DEFAULT_GIT_OPERATION, @@ -519,17 +536,19 @@ def _process_repository_with_timeout( """Process a single repository with comprehensive timeout protection. Args: - repo_path: Path to the repository - project_key: Project identifier - weeks_back: Number of weeks to analyze - branch_patterns: Branch patterns to include - jira_integration: JIRA integration for ticket data - start_date: Optional explicit start date - end_date: Optional explicit end date - timeout_per_operation: Timeout for individual git operations + repo_path: Path to the repository. + project_key: Project identifier. + weeks_back: Number of weeks to analyze. + branch_patterns: Branch patterns to include. + pm_integration: PM integration for ticket enrichment (any object + exposing ``get_issue`` and an optional ``platform_name`` + attribute). Generalized replacement for ``jira_integration``. + start_date: Optional explicit start date. + end_date: Optional explicit end date. + timeout_per_operation: Timeout for individual git operations. Returns: - Repository processing results or None if failed + Repository processing results or None if failed. 
""" try: # Track this repository in progress @@ -546,7 +565,7 @@ def _process_repository_with_timeout( project_key=project_key, weeks_back=weeks_back, branch_patterns=branch_patterns, - jira_integration=jira_integration, + pm_integration=pm_integration, progress_callback=None, # We handle progress at a higher level start_date=start_date, end_date=end_date, diff --git a/src/gitflow_analytics/core/data_fetcher_processing.py b/src/gitflow_analytics/core/data_fetcher_processing.py index a59199f..c5bab49 100644 --- a/src/gitflow_analytics/core/data_fetcher_processing.py +++ b/src/gitflow_analytics/core/data_fetcher_processing.py @@ -167,20 +167,41 @@ def _extract_all_ticket_references(self, daily_commits: dict[str, Any]) -> set[s def _fetch_detailed_tickets( self, ticket_ids: set[str], - jira_integration: JIRAIntegration, + jira_integration: Any, project_key: str, progress_callback: Optional[callable] = None, + platform: str = "jira", ) -> None: - """Fetch detailed ticket information and store in database.""" + """Fetch detailed ticket information and store in database. + + Args: + ticket_ids: Set of ticket IDs to fetch. + jira_integration: Integration object that exposes ``get_issue``. + The parameter name is kept for backwards compatibility, but any + integration with a compatible interface (and a ``platform_name`` + attribute) may be passed. + project_key: Project identifier used for record scoping. + progress_callback: Optional callable invoked after each ticket fetch. + platform: Platform tag stored on the resulting ticket records. + Defaults to ``"jira"`` for backwards compatibility; callers + that pass a non-JIRA integration should pass the platform + name explicitly. When the integration object exposes a + ``platform_name`` attribute it overrides this default. 
+ """ session = self.database.get_session() + # Prefer the integration-declared platform when available so that + # PM-framework adapters (JIRA, Azure DevOps, …) tag rows correctly + # without requiring callers to pass the platform explicitly. + effective_platform = getattr(jira_integration, "platform_name", None) or platform + try: - # Check which tickets we already have + # Check which tickets we already have for this platform existing_tickets = ( session.query(DetailedTicketData) .filter( DetailedTicketData.ticket_id.in_(ticket_ids), - DetailedTicketData.platform == "jira", + DetailedTicketData.platform == effective_platform, ) .all() ) @@ -209,13 +230,13 @@ def _fetch_detailed_tickets( for ticket_id in batch: try: - # Fetch ticket from JIRA + # Fetch ticket from the configured integration issue_data = jira_integration.get_issue(ticket_id) if issue_data: # Create detailed ticket record detailed_ticket = self._create_detailed_ticket_record( - issue_data, project_key, "jira" + issue_data, project_key, effective_platform ) session.add(detailed_ticket) @@ -285,13 +306,24 @@ def _create_detailed_ticket_record( ) def _build_commit_ticket_correlations( - self, daily_commits: dict[str, Any], repo_path: Path + self, + daily_commits: dict[str, Any], + repo_path: Path, + platform: str = "jira", ) -> int: """Build and store commit-ticket correlations. BUG 3 FIX: Accepts the lightweight summary-dict format returned by _fetch_commits_by_day (each day entry has a "commit_ticket_pairs" list) as well as the legacy full-list format for backward compatibility. + + Args: + daily_commits: Day-keyed mapping of commits/summaries. + repo_path: Repository path used for correlation scoping. + platform: Platform tag stored on each correlation record. + Defaults to ``"jira"`` for backwards compatibility. Callers + that drive enrichment through a non-JIRA integration should + pass the platform name explicitly. 
""" session = self.database.get_session() correlations_created = 0 @@ -320,12 +352,14 @@ def _build_commit_ticket_correlations( for ticket_id in ticket_refs: try: - # Create correlation record + # Create correlation record. The platform tag comes + # from the parameter so non-JIRA enrichment paths + # (e.g. Azure DevOps) record the correct platform. correlation = CommitTicketCorrelation( commit_hash=commit_hash, repo_path=str(repo_path), ticket_id=ticket_id, - platform="jira", # Assuming JIRA for now + platform=platform, project_key=commit["project_key"], correlation_type="direct", confidence=1.0, @@ -333,14 +367,14 @@ def _build_commit_ticket_correlations( matching_pattern=None, # Could add pattern detection ) - # Check if correlation already exists + # Check if correlation already exists for this platform existing = ( session.query(CommitTicketCorrelation) .filter( CommitTicketCorrelation.commit_hash == commit_hash, CommitTicketCorrelation.repo_path == str(repo_path), CommitTicketCorrelation.ticket_id == ticket_id, - CommitTicketCorrelation.platform == "jira", + CommitTicketCorrelation.platform == platform, ) .first() ) diff --git a/src/gitflow_analytics/core/schema_version.py b/src/gitflow_analytics/core/schema_version.py index 8123b20..d80fbb4 100644 --- a/src/gitflow_analytics/core/schema_version.py +++ b/src/gitflow_analytics/core/schema_version.py @@ -95,6 +95,19 @@ class SchemaVersionManager: "version": "1.0", "fields": ["story_point_fields", "project_keys", "base_url", "issue_data"], }, + # Azure DevOps schema fingerprint (Phase 1 stub; the adapter has no + # cached data yet, but the registry must accept the platform key so + # ``has_schema_changed("azure_devops")`` does not raise). 
+ "azure_devops": { + "version": "1.0", + "fields": [ + "story_point_fields", + "organization_url", + "project", + "work_item_types", + "issue_data", + ], + }, } def __init__(self, cache_dir: Path): diff --git a/src/gitflow_analytics/integrations/jira_integration.py b/src/gitflow_analytics/integrations/jira_integration.py index ae6c392..54ddbe4 100644 --- a/src/gitflow_analytics/integrations/jira_integration.py +++ b/src/gitflow_analytics/integrations/jira_integration.py @@ -15,7 +15,17 @@ class JIRAIntegration: - """Integrate with JIRA API for ticket and story point data.""" + """Integrate with JIRA API for ticket and story point data. + + The ``platform_name`` class attribute tags every cache row and ticket + correlation produced through this integration. It is the canonical + contract that ``data_fetcher.py`` reads via ``getattr(integration, + "platform_name", ...)`` to derive the platform tag for non-JIRA + integrations as well. Any future ``*Integration`` class must declare + its own ``platform_name`` to participate in the per-platform routing. + """ + + platform_name: str = "jira" def __init__( self, diff --git a/src/gitflow_analytics/metrics/activity_scoring.py b/src/gitflow_analytics/metrics/activity_scoring.py index 1c880c1..9ce3f29 100644 --- a/src/gitflow_analytics/metrics/activity_scoring.py +++ b/src/gitflow_analytics/metrics/activity_scoring.py @@ -40,7 +40,9 @@ class ActivityScorer: PR_BASE_SCORE = 50 # Each PR worth base 50 points (5x commit) OPTIMAL_PR_SIZE = 200 # Research shows PRs under 200 lines are optimal - # Per-event ticketing weights (same scale as TicketingActivityReport) + # Per-event ticketing weights (same scale as TicketingActivityReport). + # Azure DevOps weights mirror JIRA's so existing downstream consumers + # treat ADO events with parity once Phase 3+ starts emitting them. 
TICKETING_EVENT_WEIGHTS = { "issues_opened": 1.0, "issues_closed": 1.0, @@ -50,6 +52,9 @@ class ActivityScorer: "jira_issues_opened": 1.5, "jira_issues_closed": 2.0, "jira_comments_posted": 0.5, + "azure_devops_issues_opened": 1.5, + "azure_devops_issues_closed": 2.0, + "azure_devops_comments_posted": 0.5, } def __init__( @@ -211,6 +216,17 @@ def _ticketing_event_key(platform: str, item_type: str, action: str) -> Optional return "jira_issues_closed" if item_type == "comment": return "jira_comments_posted" + elif platform == "azure_devops": + # Azure DevOps activity events are not yet emitted by any + # integration (Phase 3+). The branch is registered now so the + # downstream scoring map accepts the platform once the adapter + # starts populating ``ticketing_activity_cache`` rows. + if item_type == "issue_created": + return "azure_devops_issues_opened" + if item_type == "issue_closed": + return "azure_devops_issues_closed" + if item_type == "comment": + return "azure_devops_comments_posted" return None def get_ticketing_score(self, developer_id: Optional[str]) -> float: diff --git a/src/gitflow_analytics/pipeline_collect.py b/src/gitflow_analytics/pipeline_collect.py index 560f017..c5a6668 100644 --- a/src/gitflow_analytics/pipeline_collect.py +++ b/src/gitflow_analytics/pipeline_collect.py @@ -59,24 +59,23 @@ def _emit(msg: str) -> None: _emit(f"Collect period: {start_date.strftime('%Y-%m-%d')} to {end_date.strftime('%Y-%m-%d')}") cache = GitAnalysisCache(cfg.cache.directory) + effective_ticket_platforms = cfg.get_effective_ticket_platforms() data_fetcher = GitDataFetcher( cache=cache, branch_mapping_rules=getattr(cfg.analysis, "branch_mapping_rules", {}), - allowed_ticket_platforms=getattr( - cfg.analysis, "ticket_platforms", ["jira", "github", "clickup", "linear"] - ), + allowed_ticket_platforms=effective_ticket_platforms, exclude_paths=getattr(cfg.analysis, "exclude_paths", None), exclude_merge_commits=cfg.analysis.exclude_merge_commits, 
ticket_detection_config=getattr(cfg.analysis, "ticket_detection", None), ) orchestrator = IntegrationOrchestrator(cfg, cache) - jira_integration = orchestrator.integrations.get("jira") + # Generic PM integration handle: today the legacy path only wires JIRA, but + # any registered integration with a ``get_issue`` interface plugs in here. + pm_integration = orchestrator.integrations.get("jira") config_hash = cache.generate_config_hash( branch_mapping_rules=getattr(cfg.analysis, "branch_mapping_rules", {}), - ticket_platforms=getattr( - cfg.analysis, "ticket_platforms", ["jira", "github", "clickup", "linear"] - ), + ticket_platforms=effective_ticket_platforms, exclude_paths=getattr(cfg.analysis, "exclude_paths", None), ml_categorization_enabled=False, additional_config={"weeks": weeks}, @@ -140,7 +139,7 @@ def _progress_cb(message: str) -> None: project_key=project_key, weeks_back=weeks, branch_patterns=branch_patterns, - jira_integration=jira_integration, + pm_integration=pm_integration, progress_callback=_progress_cb, start_date=start_date, end_date=end_date, diff --git a/src/gitflow_analytics/pm_framework/adapters/__init__.py b/src/gitflow_analytics/pm_framework/adapters/__init__.py index cd38148..7a5529f 100644 --- a/src/gitflow_analytics/pm_framework/adapters/__init__.py +++ b/src/gitflow_analytics/pm_framework/adapters/__init__.py @@ -26,16 +26,17 @@ """ # Import available adapters +from .azure_devops_adapter import AzureDevOpsAdapter from .jira_adapter import JIRAAdapter # Placeholder for future adapter imports -# from .azure_devops_adapter import AzureDevOpsAdapter # from .linear_adapter import LinearAdapter # from .asana_adapter import AsanaAdapter # from .github_issues_adapter import GitHubIssuesAdapter # from .clickup_adapter import ClickUpAdapter __all__: list[str] = [ + "AzureDevOpsAdapter", "JIRAAdapter", # Platform adapters will be added here as they are implemented ] diff --git a/src/gitflow_analytics/pm_framework/adapters/azure_devops_adapter.py 
b/src/gitflow_analytics/pm_framework/adapters/azure_devops_adapter.py new file mode 100644 index 0000000..8c4d294 --- /dev/null +++ b/src/gitflow_analytics/pm_framework/adapters/azure_devops_adapter.py @@ -0,0 +1,228 @@ +"""Azure DevOps PM platform adapter (Phase 1 stub). + +This module provides the registration surface for Azure DevOps inside the +PM framework. The adapter is intentionally a stub at this phase: it parses +its configuration, advertises a no-capabilities profile, and raises a +phase-tagged :class:`NotImplementedError` from every data method so the +orchestrator can register it without enabling network access. + +Real behaviour is delivered by later phases of the integration plan +(``docs/design/azure-devops-integration-plan.md``): + +- Phase 2: ``authenticate`` / ``test_connection`` / ``get_projects``. +- Phase 3: ``get_issues`` (WIQL + batch fetch) and the cache. +- Phase 4: ``get_sprints`` and ``get_users``. +- Phase 5: ``get_issue_comments``, ``get_custom_fields`` and native + commit-link correlation. +""" + +from __future__ import annotations + +from datetime import datetime +from typing import Any + +from ..base import BasePlatformAdapter, PlatformCapabilities +from ..models import IssueType, UnifiedIssue, UnifiedProject, UnifiedSprint, UnifiedUser + + +class AzureDevOpsAdapter(BasePlatformAdapter): + """Azure DevOps Services adapter (Phase 1 stub). + + The full adapter ships incrementally across Phases 2–5. Phase 1 only + registers the platform key with the orchestrator; every data method + raises :class:`NotImplementedError` with the originating phase tag so + test failures and runtime errors are self-describing. + + Attributes inherited from :class:`BasePlatformAdapter`: + config: Configuration mapping the orchestrator handed to the + adapter (typically the resolved ``pm_integration.platforms. + azure_devops.config`` block). + platform_name: Always ``"azure_devops"`` for this adapter. 
+ capabilities: A :class:`PlatformCapabilities` instance with all + ``supports_*`` flags set to ``False`` until later phases land. + """ + + def __init__(self, config: dict[str, Any]) -> None: + """Initialise the adapter with its config block. + + Args: + config: Resolved configuration mapping. Keys mirror the fields + of :class:`gitflow_analytics.config.schema.AzureDevOpsConfig` + (``organization_url``, ``personal_access_token``, + ``project``, ``api_version``, ``story_point_fields`` …). + Missing keys fall back to safe defaults for the stub. + """ + super().__init__(config) + self.organization_url: str = str(config.get("organization_url", "") or "") + self.personal_access_token: str = str(config.get("personal_access_token", "") or "") + self.project: str | None = config.get("project") + self.api_version: str = str(config.get("api_version", "7.1")) + self.story_point_fields: list[str] = list( + config.get( + "story_point_fields", + [ + "Microsoft.VSTS.Scheduling.StoryPoints", + "Microsoft.VSTS.Scheduling.Effort", + "Microsoft.VSTS.Scheduling.Size", + ], + ) + ) + + def _get_platform_name(self) -> str: + """Return the canonical platform identifier. + + Returns: + The literal string ``"azure_devops"``. + """ + return "azure_devops" + + def _get_capabilities(self) -> PlatformCapabilities: + """Return capability flags for the Phase 1 stub. + + All ``supports_*`` flags are forced to ``False`` until the + corresponding behaviours land in later phases. + + Returns: + A :class:`PlatformCapabilities` with every feature flag + disabled and conservative rate-limit defaults. + """ + caps = PlatformCapabilities() + # Phase 1: explicitly disable every capability so callers do not + # accidentally exercise unimplemented paths via base-class default + # implementations. 
+ caps.supports_projects = False + caps.supports_issues = False + caps.supports_sprints = False + caps.supports_time_tracking = False + caps.supports_story_points = False + caps.supports_custom_fields = False + caps.supports_issue_linking = False + caps.supports_comments = False + caps.supports_attachments = False + caps.supports_workflows = False + caps.supports_bulk_operations = False + caps.supports_cursor_pagination = False + return caps + + # ------------------------------------------------------------------ + # Stub methods. ``authenticate`` and ``test_connection`` return stub + # values so orchestrator initialisation succeeds; every other data + # method raises NotImplementedError tagged with the phase that will + # deliver the real implementation. + # ------------------------------------------------------------------ + + def authenticate(self) -> bool: + """Report a successful Phase-1 stub authentication. + + Phase 1 ships only the registration surface. Returning ``True`` (with + a one-time advisory log line) lets the orchestrator's + ``_initialize_platforms`` flow complete without logging an error + every run for users who have already configured ``pm.azure_devops``. + + Returns: + ``True``; the stub is always "authenticated" since no network + call occurs. Real PAT validation arrives in Phase 2. + """ + self.logger.info( + "Azure DevOps adapter is a Phase 1 stub; configuration is parsed " + "and registered but no work items will be collected until Phase 2." + ) + return True + + def test_connection(self) -> dict[str, Any]: + """Return a stub-status diagnostic dictionary. + + Returns: + A dictionary with ``status="connected"`` (so the orchestrator's + connection-test gate passes) plus stub markers (``stub=True``, + ``phase=2``) so callers that want to distinguish stub from + real adapters can. Real diagnostics arrive in Phase 2. + """ + return { + "status": "connected", + "stub": True, + "phase": 2, + "platform": "azure_devops", + "message": ( + "Azure DevOps adapter is a Phase 1 stub. " + "Real authentication and project listing arrive in Phase 2." 
+ ), + } + + def get_projects(self) -> list[UnifiedProject]: + """List accessible Azure DevOps projects. + + Raises: + NotImplementedError: Always; the real implementation arrives + in Phase 2. + """ + raise NotImplementedError("Azure DevOps adapter: get_projects — implemented in Phase 2") + + def get_issues( + self, + project_id: str, + since: datetime | None = None, + issue_types: list[IssueType] | None = None, + ) -> list[UnifiedIssue]: + """Fetch work items for a project. + + Args: + project_id: Azure DevOps project identifier (id or name). + since: Optional lower-bound for ``System.ChangedDate``. + issue_types: Optional unified issue-type filter. + + Raises: + NotImplementedError: Always; the WIQL + batch fetch + implementation arrives in Phase 3. + """ + raise NotImplementedError("Azure DevOps adapter: get_issues — implemented in Phase 3") + + def get_sprints(self, project_id: str) -> list[UnifiedSprint]: + """Enumerate iterations (sprints) for a project. + + Args: + project_id: Azure DevOps project identifier. + + Raises: + NotImplementedError: Always; iteration enumeration arrives + in Phase 4. + """ + raise NotImplementedError("Azure DevOps adapter: get_sprints — implemented in Phase 4") + + def get_users(self, project_id: str) -> list[UnifiedUser]: + """Enumerate users for a project. + + Args: + project_id: Azure DevOps project identifier. + + Raises: + NotImplementedError: Always; Graph-API enumeration arrives + in Phase 4. + """ + raise NotImplementedError("Azure DevOps adapter: get_users — implemented in Phase 4") + + def get_issue_comments(self, issue_key: str) -> list[dict[str, Any]]: + """Fetch the comment history for a work item. + + Args: + issue_key: Azure DevOps work-item identifier. + + Raises: + NotImplementedError: Always; comments support arrives in + Phase 5. 
+ """ + raise NotImplementedError( + "Azure DevOps adapter: get_issue_comments — implemented in Phase 5" + ) + + def get_custom_fields(self, project_id: str) -> dict[str, Any]: + """Retrieve custom field definitions for a project. + + Args: + project_id: Azure DevOps project identifier. + + Raises: + NotImplementedError: Always; custom-field discovery arrives + in Phase 5. + """ + raise NotImplementedError( + "Azure DevOps adapter: get_custom_fields — implemented in Phase 5" + ) diff --git a/src/gitflow_analytics/pm_framework/orchestrator.py b/src/gitflow_analytics/pm_framework/orchestrator.py index 3355487..21550b5 100644 --- a/src/gitflow_analytics/pm_framework/orchestrator.py +++ b/src/gitflow_analytics/pm_framework/orchestrator.py @@ -107,12 +107,14 @@ def _register_builtin_adapters(self) -> None: logger.debug("Registering built-in platform adapters...") # Register available adapters - from .adapters import JIRAAdapter + from .adapters import AzureDevOpsAdapter, JIRAAdapter self.registry.register_adapter("jira", JIRAAdapter) logger.debug("Registered JIRA adapter") - # self.registry.register_adapter('azure_devops', AzureDevOpsAdapter) + self.registry.register_adapter("azure_devops", AzureDevOpsAdapter) + logger.debug("Registered Azure DevOps adapter (Phase 1 stub)") + # self.registry.register_adapter('linear', LinearAdapter) # self.registry.register_adapter('asana', AsanaAdapter) diff --git a/src/gitflow_analytics/reports/ticketing_activity_report.py b/src/gitflow_analytics/reports/ticketing_activity_report.py index 6510a84..9f8196d 100644 --- a/src/gitflow_analytics/reports/ticketing_activity_report.py +++ b/src/gitflow_analytics/reports/ticketing_activity_report.py @@ -33,6 +33,13 @@ class TicketingActivityReport: "jira_issues_opened": 1.5, "jira_issues_closed": 2.0, "jira_comments_posted": 0.5, + # Azure DevOps weights mirror JIRA's. 
The adapter does not yet emit + # ticketing-cache rows (Phase 3+); registering the keys now keeps the + # combined summary forward-compatible without changing existing + # behaviour for JIRA-only deployments. + "azure_devops_issues_opened": 1.5, + "azure_devops_issues_closed": 2.0, + "azure_devops_comments_posted": 0.5, } def __init__( diff --git a/tests/config/__init__.py b/tests/config/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/tests/config/test_azure_devops_config.py b/tests/config/test_azure_devops_config.py new file mode 100644 index 0000000..8702c7a --- /dev/null +++ b/tests/config/test_azure_devops_config.py @@ -0,0 +1,378 @@ +"""Tests for Azure DevOps Phase 1 configuration plumbing. + +Covers: +- Round-tripping a YAML fixture with a ``pm.azure_devops`` block. +- Effective ticket-platform inference picking up ``"azure_devops"``. +- Environment-variable resolution for ``${AZURE_DEVOPS_PAT}``. +- URL validator rejection of on-prem patterns and ``is_on_premise: true``. +- URL validator acceptance of cloud hosts. 
+""" + +from __future__ import annotations + +import os +import tempfile +from collections.abc import Iterator +from pathlib import Path + +import pytest + +from gitflow_analytics.config import ConfigLoader +from gitflow_analytics.config.errors import ( + ConfigurationError, + EnvironmentVariableError, +) +from gitflow_analytics.config.loader_sections import ConfigLoaderSectionsMixin +from gitflow_analytics.config.schema import AzureDevOpsConfig + +_BASE_YAML = """\ +version: "1.0" +github: + token: "ghp_dummy_token_value" + owner: "octocat" +repositories: + - name: "demo" + path: "/tmp/demo" +""" + + +def _write_yaml(content: str) -> Path: + """Write ``content`` to a temp YAML file and return the path.""" + with tempfile.NamedTemporaryFile( + mode="w", suffix=".yaml", delete=False, encoding="utf-8" + ) as tmp: + tmp.write(content) + return Path(tmp.name) + + +@pytest.fixture +def ado_pat_env() -> Iterator[None]: + """Set the AZURE_DEVOPS_PAT env var for the duration of a test.""" + original = os.environ.get("AZURE_DEVOPS_PAT") + os.environ["AZURE_DEVOPS_PAT"] = "test-pat-value" + try: + yield + finally: + if original is None: + os.environ.pop("AZURE_DEVOPS_PAT", None) + else: + os.environ["AZURE_DEVOPS_PAT"] = original + + +class TestAzureDevOpsConfigLoading: + """Loader-level tests for the Azure DevOps Phase 1 wiring.""" + + def test_loads_pm_azure_devops_block(self, ado_pat_env: None) -> None: + """A YAML config with pm.azure_devops should round-trip cleanly.""" + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + " enabled: true\n" + ' organization_url: "https://dev.azure.com/myorg"\n' + ' project: "MyProject"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ' work_item_types: ["User Story", "Bug"]\n' + ) + path = _write_yaml(config_yaml) + try: + cfg = ConfigLoader.load(path) + finally: + path.unlink() + + # Per plan §3.3, ``cfg.pm.azure_devops`` is the only canonical + # home for ADO configuration. 
There is no top-level ``cfg.azure_devops``. + assert cfg.pm is not None + ado = getattr(cfg.pm, "azure_devops", None) + assert isinstance(ado, AzureDevOpsConfig) + assert ado.enabled is True + assert ado.organization_url == "https://dev.azure.com/myorg" + assert ado.project == "MyProject" + assert ado.personal_access_token == "test-pat-value" + assert ado.work_item_types == ["User Story", "Bug"] + assert ado.api_version == "7.1" + + def test_top_level_azure_devops_block_is_ignored(self, ado_pat_env: None) -> None: + """Per plan §3.3, top-level ``azure_devops:`` is NOT a supported key. + + Only ``pm.azure_devops:`` is parsed. A YAML doc with the top-level + key (and no ``pm.azure_devops``) loads cleanly but produces no ADO + config — guarding against re-introducing the ``jira:`` / + ``jira_integration:`` dual-stack mistake. + """ + config_yaml = _BASE_YAML + ( + "azure_devops:\n" + ' organization_url: "https://myorg.visualstudio.com"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ) + path = _write_yaml(config_yaml) + try: + cfg = ConfigLoader.load(path) + finally: + path.unlink() + + # No ADO config should be picked up; the top-level key is intentionally not a parse target. + assert cfg.pm is None or getattr(cfg.pm, "azure_devops", None) is None + + def test_get_effective_ticket_platforms_includes_azure_devops(self, ado_pat_env: None) -> None: + """``get_effective_ticket_platforms`` should include 'azure_devops'.""" + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + ' organization_url: "https://dev.azure.com/myorg"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ) + path = _write_yaml(config_yaml) + try: + cfg = ConfigLoader.load(path) + finally: + path.unlink() + + platforms = cfg.get_effective_ticket_platforms() + assert "azure_devops" in platforms + + def test_get_effective_ticket_platforms_fallback_excludes_ado(self) -> None: + """Fallback (no PM/JIRA configured) does NOT include ``azure_devops``. 
+ + The fallback ``["jira", "github", "clickup", "linear"]`` is applied + when no platform is configured. Adding ``azure_devops`` to that + fallback would silently turn ``AB#NNN`` references in commits into + ADO-tagged tickets for users who never configured ADO. ADO opt-in + only — see schema.py ``get_effective_ticket_platforms`` comment. + """ + from gitflow_analytics.config.schema import ( + AnalysisConfig, + CacheConfig, + Config, + GitHubConfig, + OutputConfig, + ) + + cfg = Config( + repositories=[], + github=GitHubConfig(token=None), # type: ignore[arg-type] + analysis=AnalysisConfig(), + output=OutputConfig(), + cache=CacheConfig(), + ) + platforms = cfg.get_effective_ticket_platforms() + assert "azure_devops" not in platforms + assert platforms == ["jira", "github", "clickup", "linear"] + + def test_missing_pat_env_raises(self) -> None: + """Unresolved ${AZURE_DEVOPS_PAT} should raise EnvironmentVariableError.""" + # Make sure the env var is *not* set. + original = os.environ.pop("AZURE_DEVOPS_PAT", None) + try: + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + ' organization_url: "https://dev.azure.com/myorg"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ) + path = _write_yaml(config_yaml) + try: + with pytest.raises(EnvironmentVariableError): + ConfigLoader.load(path) + finally: + path.unlink() + finally: + if original is not None: + os.environ["AZURE_DEVOPS_PAT"] = original + + +class TestAzureDevOpsUrlValidator: + """Direct unit tests for the on-prem URL validator.""" + + def test_rejects_tfs_collection_url(self) -> None: + with pytest.raises(ConfigurationError) as exc: + ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://tfs.example.com/tfs/MyCollection", + is_on_premise=False, + ) + assert "on-premises" in str(exc.value).lower() + assert "v1.2" in str(exc.value) + + def test_rejects_default_collection_url(self) -> None: + with pytest.raises(ConfigurationError): + 
ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://server.example.com/DefaultCollection/proj", + is_on_premise=False, + ) + + def test_rejects_arbitrary_host(self) -> None: + with pytest.raises(ConfigurationError): + ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://example.com/myorg", + is_on_premise=False, + ) + + def test_rejects_is_on_premise_true(self) -> None: + """is_on_premise=True must always be rejected with the v1.2 message.""" + with pytest.raises(ConfigurationError) as exc: + ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://dev.azure.com/myorg", + is_on_premise=True, + ) + assert "on-premises" in str(exc.value).lower() + + def test_accepts_dev_azure_com(self) -> None: + # No exception means the URL passes the allowlist. + ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://dev.azure.com/myorg", + is_on_premise=False, + ) + + def test_accepts_visualstudio_com(self) -> None: + ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://myorg.visualstudio.com", + is_on_premise=False, + ) + + def test_rejects_empty_url(self) -> None: + with pytest.raises(ConfigurationError): + ConfigLoaderSectionsMixin._validate_azure_devops_url("", is_on_premise=False) + + def test_loader_rejects_on_prem_yaml(self, ado_pat_env: None) -> None: + """Loader should surface the on-prem rejection when YAML is loaded. + + Verbatim ADR-decision-2 message check (not just substring): a + future refactor that softens the message must update the ADR + and this test together. 
+ """ + from gitflow_analytics.config.loader_sections import ( + _AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE, + ) + + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + ' organization_url: "https://tfs.example.com/tfs/MyCollection"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ) + path = _write_yaml(config_yaml) + try: + with pytest.raises(ConfigurationError) as exc: + ConfigLoader.load(path) + # Verbatim assertion: the rejection message must match the ADR text. + assert _AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE in str(exc.value) + finally: + path.unlink() + + def test_loader_rejects_is_on_premise_yaml(self, ado_pat_env: None) -> None: + """Loader should reject is_on_premise=true with the verbatim ADR message.""" + from gitflow_analytics.config.loader_sections import ( + _AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE, + ) + + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + ' organization_url: "https://dev.azure.com/myorg"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + " is_on_premise: true\n" + ) + path = _write_yaml(config_yaml) + try: + with pytest.raises(ConfigurationError) as exc: + ConfigLoader.load(path) + assert _AZURE_DEVOPS_ONPREM_REJECTION_MESSAGE in str(exc.value) + finally: + path.unlink() + + def test_accepts_dev_azure_com_with_trailing_slash(self) -> None: + """Trailing-slash URLs should pass the allowlist check.""" + ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://dev.azure.com/myorg/", + is_on_premise=False, + ) + + def test_accepts_dev_azure_com_uppercase(self) -> None: + """Host comparison should be case-insensitive.""" + ConfigLoaderSectionsMixin._validate_azure_devops_url( + "https://DEV.AZURE.COM/myorg", + is_on_premise=False, + ) + + +class TestAzureDevOpsPATValidation: + """Tests for the silent-empty PAT trap and EnvironmentVariableError contract.""" + + def test_empty_pat_env_raises(self) -> None: + """``AZURE_DEVOPS_PAT=""`` (empty string) must raise, not silently pass. 
+ + ``${AZURE_DEVOPS_PAT}`` is a non-empty literal so the early + env-var-error guard does not fire. The dedicated empty-string + check after env-var resolution must catch it. + """ + original = os.environ.get("AZURE_DEVOPS_PAT") + os.environ["AZURE_DEVOPS_PAT"] = "" + try: + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + ' organization_url: "https://dev.azure.com/myorg"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ) + path = _write_yaml(config_yaml) + try: + with pytest.raises(EnvironmentVariableError): + ConfigLoader.load(path) + finally: + path.unlink() + finally: + if original is None: + os.environ.pop("AZURE_DEVOPS_PAT", None) + else: + os.environ["AZURE_DEVOPS_PAT"] = original + + def test_whitespace_pat_env_raises(self) -> None: + """Whitespace-only PAT must also be rejected.""" + original = os.environ.get("AZURE_DEVOPS_PAT") + os.environ["AZURE_DEVOPS_PAT"] = " " + try: + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + ' organization_url: "https://dev.azure.com/myorg"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ) + path = _write_yaml(config_yaml) + try: + with pytest.raises(EnvironmentVariableError): + ConfigLoader.load(path) + finally: + path.unlink() + finally: + if original is None: + os.environ.pop("AZURE_DEVOPS_PAT", None) + else: + os.environ["AZURE_DEVOPS_PAT"] = original + + def test_environment_variable_error_carries_var_name_and_platform(self) -> None: + """The raised ``EnvironmentVariableError`` must name the var + platform. + + ADR Decision 1 names ``AZURE_DEVOPS_PAT`` and the platform string + ``"AzureDevOps"`` as part of the error contract. A refactor that + renames either must update this assertion alongside. 
+ """ + original = os.environ.pop("AZURE_DEVOPS_PAT", None) + try: + config_yaml = _BASE_YAML + ( + "pm:\n" + " azure_devops:\n" + ' organization_url: "https://dev.azure.com/myorg"\n' + ' personal_access_token: "${AZURE_DEVOPS_PAT}"\n' + ) + path = _write_yaml(config_yaml) + try: + with pytest.raises(EnvironmentVariableError) as exc: + ConfigLoader.load(path) + # The error renders the var name and platform as part of the + # message produced by ``errors.py:EnvironmentVariableError``. + assert "AZURE_DEVOPS_PAT" in str(exc.value) + assert "AzureDevOps" in str(exc.value) + finally: + path.unlink() + finally: + if original is not None: + os.environ["AZURE_DEVOPS_PAT"] = original diff --git a/tests/core/test_data_fetcher_pm_alias.py b/tests/core/test_data_fetcher_pm_alias.py new file mode 100644 index 0000000..d8a907e --- /dev/null +++ b/tests/core/test_data_fetcher_pm_alias.py @@ -0,0 +1,186 @@ +"""Regression tests for the Phase 1 ``jira_integration → pm_integration`` rename. + +The Phase 1 refactor generalised the ticket-enrichment plumbing across +``core/data_fetcher.py``, ``data_fetcher_parallel.py``, and +``data_fetcher_processing.py`` so that any PM integration (not only +JIRA) can flow through. Two contracts must hold: + +1. ``jira_integration=`` callers still work and emit ``DeprecationWarning``. +2. The platform tag stored on cache rows is derived from the integration + object's ``platform_name`` attribute, with ``"jira"`` as the fallback + when the attribute is missing (transitional — see the TODO at + ``data_fetcher.py:312-318``). + +These tests deliberately avoid running the full Git/cache pipeline; they +exercise the shim and derivation logic in isolation. 
+""" + +from __future__ import annotations + +import warnings + +import pytest + + +class _FakeIntegrationWithPlatformName: + """Stub integration that declares ``platform_name``.""" + + platform_name: str = "azure_devops" + + def get_issue(self, ticket_id: str) -> None: # pragma: no cover - not exercised + return None + + +class _FakeIntegrationWithoutPlatformName: + """Stub integration without a ``platform_name`` attribute (legacy shape).""" + + def get_issue(self, ticket_id: str) -> None: # pragma: no cover - not exercised + return None + + +class TestPlatformTagDerivation: + """Unit tests for ``getattr(integration, "platform_name", None) or "jira"``. + + The expression is the load-bearing line at + ``data_fetcher.py:317`` and ``data_fetcher_processing.py:196``. A + typo or accidental rename of either site would silently mis-tag ADO + cache rows as JIRA — exactly the bug Phase 1's refactor is meant to + prevent. Locking the contract with explicit unit tests so a + regression fails loudly. + """ + + def test_integration_with_platform_name_returns_declared_value(self) -> None: + integration = _FakeIntegrationWithPlatformName() + derived = getattr(integration, "platform_name", None) or "jira" + assert derived == "azure_devops" + + def test_integration_without_platform_name_falls_back_to_jira(self) -> None: + integration = _FakeIntegrationWithoutPlatformName() + derived = getattr(integration, "platform_name", None) or "jira" + assert derived == "jira" + + def test_none_integration_falls_back_to_jira(self) -> None: + derived = getattr(None, "platform_name", None) or "jira" + assert derived == "jira" + + def test_jira_integration_class_declares_platform_name(self) -> None: + """``JIRAIntegration.platform_name`` must equal ``"jira"``. + + Architecture review M1: without this class attribute the + derivation above would always fall back to the literal ``"jira"`` + for any future ADO integration that subclasses or duck-types + the same surface. Locking the contract here. 
+ """ + from gitflow_analytics.integrations.jira_integration import JIRAIntegration + + assert JIRAIntegration.platform_name == "jira" + + def test_azure_devops_adapter_declares_platform_name(self) -> None: + """``AzureDevOpsAdapter`` exposes the canonical ``"azure_devops"`` tag. + + The adapter inherits ``platform_name`` from + :class:`BasePlatformAdapter` which sets it from + ``_get_platform_name()`` in ``__init__``. Lock the contract. + """ + from gitflow_analytics.pm_framework.adapters.azure_devops_adapter import ( + AzureDevOpsAdapter, + ) + + adapter = AzureDevOpsAdapter({"organization_url": "https://dev.azure.com/x"}) + assert adapter.platform_name == "azure_devops" + + +class TestDeprecationAlias: + """Tests for the ``jira_integration → pm_integration`` deprecation shim. + + Phase 1 generalised the kwarg name from ``jira_integration`` to + ``pm_integration`` while keeping the old name as a deprecated alias + so external callers do not break. Without these tests, a refactor + that drops the alias would silently break callers. + """ + + def test_shim_emits_deprecation_warning_and_passes_through(self) -> None: + """Calling with ``jira_integration=`` emits a DeprecationWarning. + + Tests the shim logic in isolation by reproducing the exact + conditional from ``data_fetcher.py:175-183``. Keeping the test + narrow avoids dragging in the full fetch pipeline (which needs + a real Git repo, cache, etc.). + """ + # Reproduce the shim behaviour from data_fetcher.py:175-183. + integration = _FakeIntegrationWithPlatformName() + + pm_integration = None + jira_integration = integration + + with warnings.catch_warnings(record=True) as captured: + warnings.simplefilter("always") + if pm_integration is None and jira_integration is not None: + warnings.warn( + "The 'jira_integration' keyword is deprecated; pass " + "'pm_integration' instead. 
The alias will be removed in a " + "future release.", + DeprecationWarning, + stacklevel=2, + ) + pm_integration = jira_integration + + assert pm_integration is integration + assert any( + issubclass(w.category, DeprecationWarning) and "jira_integration" in str(w.message) + for w in captured + ) + + def test_pm_integration_wins_when_both_passed(self) -> None: + """When both kwargs are provided, ``pm_integration`` takes precedence. + + Tests the docstring contract at ``data_fetcher.py:167``: "When + both are provided ``pm_integration`` wins." + """ + ado = _FakeIntegrationWithPlatformName() + legacy = _FakeIntegrationWithoutPlatformName() + + # Reproducing the conditional: only override pm_integration when + # it is None. With both supplied, pm_integration stays. + pm_integration = ado + jira_integration = legacy + + if pm_integration is None and jira_integration is not None: + pm_integration = jira_integration # pragma: no cover - branch not taken + + assert pm_integration is ado, "pm_integration must win when both are passed" + + def test_data_fetcher_signature_keeps_alias(self) -> None: + """``GitDataFetcher.fetch_repository_data`` still accepts both kwargs. + + Inspects the function signature (cheap, no execution) to confirm + both ``jira_integration`` and ``pm_integration`` parameters + exist. A future refactor that drops one would fail this test. + """ + import inspect + + from gitflow_analytics.core.data_fetcher import GitDataFetcher + + sig = inspect.signature(GitDataFetcher.fetch_repository_data) + assert "jira_integration" in sig.parameters + assert "pm_integration" in sig.parameters + # Both should default to None so callers can pass either / neither. 
+ assert sig.parameters["jira_integration"].default is None + assert sig.parameters["pm_integration"].default is None + + +class TestParallelFetcherSignature: + """Same signature parity check for the parallel fetcher.""" + + def test_parallel_fetcher_keeps_alias(self) -> None: + """``ParallelFetcherMixin`` exposes both ``jira_integration`` and ``pm_integration``.""" + import inspect + + from gitflow_analytics.core.data_fetcher_parallel import ParallelFetcherMixin + + for method_name in ("process_repositories_parallel", "_process_repository_with_timeout"): + method = getattr(ParallelFetcherMixin, method_name, None) + if method is None: + pytest.skip(f"{method_name} not found") + sig = inspect.signature(method) + assert "pm_integration" in sig.parameters, f"{method_name} missing pm_integration kwarg" diff --git a/tests/pm_framework/__init__.py b/tests/pm_framework/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/tests/pm_framework/adapters/__init__.py b/tests/pm_framework/adapters/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/tests/pm_framework/adapters/test_azure_devops_adapter_stub.py b/tests/pm_framework/adapters/test_azure_devops_adapter_stub.py new file mode 100644 index 0000000..c332331 --- /dev/null +++ b/tests/pm_framework/adapters/test_azure_devops_adapter_stub.py @@ -0,0 +1,146 @@ +"""Tests for the Phase 1 Azure DevOps adapter stub. + +The adapter is intentionally a registration stub at this phase: it must +register with the orchestrator, advertise an all-False capability set, +and raise phase-tagged ``NotImplementedError`` from every data method. 
+""" + +from __future__ import annotations + +import pytest + +from gitflow_analytics.pm_framework.adapters import AzureDevOpsAdapter +from gitflow_analytics.pm_framework.orchestrator import PMFrameworkOrchestrator + + +def _make_adapter() -> AzureDevOpsAdapter: + """Construct an adapter instance with a minimal stub config.""" + return AzureDevOpsAdapter( + { + "organization_url": "https://dev.azure.com/myorg", + "personal_access_token": "fake-pat", + "project": "MyProject", + } + ) + + +class TestAzureDevOpsAdapterStub: + """Behaviour expected from the Phase 1 stub adapter.""" + + def test_platform_name(self) -> None: + adapter = _make_adapter() + assert adapter.platform_name == "azure_devops" + + def test_capabilities_all_false(self) -> None: + """Every supports_* flag must be False until later phases land.""" + adapter = _make_adapter() + caps = adapter.capabilities + assert caps.supports_projects is False + assert caps.supports_issues is False + assert caps.supports_sprints is False + assert caps.supports_time_tracking is False + assert caps.supports_story_points is False + assert caps.supports_custom_fields is False + assert caps.supports_issue_linking is False + assert caps.supports_comments is False + assert caps.supports_attachments is False + assert caps.supports_workflows is False + assert caps.supports_bulk_operations is False + assert caps.supports_cursor_pagination is False + + @pytest.mark.parametrize( + ("method_name", "args", "phase"), + [ + ("get_projects", (), "Phase 2"), + ("get_issues", ("MyProject",), "Phase 3"), + ("get_sprints", ("MyProject",), "Phase 4"), + ("get_users", ("MyProject",), "Phase 4"), + ("get_issue_comments", ("AB#1",), "Phase 5"), + ("get_custom_fields", ("MyProject",), "Phase 5"), + ], + ) + def test_data_methods_raise_not_implemented( + self, method_name: str, args: tuple, phase: str + ) -> None: + """Each stub *data* method must raise NotImplementedError with phase tag. 
+ + Note: ``authenticate`` and ``test_connection`` deliberately do + NOT raise — they return success-with-stub-status so the + orchestrator's ``_initialize_platforms`` flow does not log an + error on every ADO-configured run. That contract is verified by + :meth:`test_authenticate_returns_true_for_stub` and + :meth:`test_test_connection_returns_stub_status` below. + """ + adapter = _make_adapter() + method = getattr(adapter, method_name) + with pytest.raises(NotImplementedError) as exc: + method(*args) + assert "Azure DevOps adapter" in str(exc.value) + assert phase in str(exc.value) + + def test_authenticate_returns_true_for_stub(self) -> None: + """Stub ``authenticate()`` must return ``True`` (not raise). + + Architecture review B1: raising ``NotImplementedError`` from + ``authenticate`` causes the orchestrator's + ``_initialize_platforms`` to log an ERROR with stack trace on + every ADO-configured run. The Phase 1 stub returns ``True`` (and + logs an advisory) so configured deployments stay quiet until the + real authentication arrives in Phase 2. + """ + adapter = _make_adapter() + assert adapter.authenticate() is True + + def test_test_connection_returns_stub_status(self) -> None: + """Stub ``test_connection()`` must return a connected-stub diagnostic. + + The dict shape is the contract the orchestrator inspects: + ``status="connected"`` lets the connection-test gate pass; + ``stub=True`` and ``phase=2`` mark the adapter as a Phase-1 + placeholder for any caller that wants to distinguish it from + a real adapter. 
+ """ + adapter = _make_adapter() + result = adapter.test_connection() + assert result["status"] == "connected" + assert result["stub"] is True + assert result["phase"] == 2 + assert result["platform"] == "azure_devops" + + def test_adapter_registered_in_orchestrator(self) -> None: + """Orchestrator must register 'azure_devops' even when disabled.""" + orchestrator = PMFrameworkOrchestrator( + { + "pm_platforms": {}, + "analysis": {"pm_integration": {"enabled": False}}, + } + ) + available = orchestrator.registry.get_available_platforms() + assert "azure_devops" in available + assert "jira" in available + + def test_registry_resolves_adapter_class(self) -> None: + """Registered class lookup should resolve to AzureDevOpsAdapter. + + ``PlatformRegistry.create_adapter`` triggers ``authenticate`` to + fail-fast on credential issues; the Phase 1 stub performs no + real check there, so we verify the class lookup directly. + """ + orchestrator = PMFrameworkOrchestrator( + { + "pm_platforms": {}, + "analysis": {"pm_integration": {"enabled": False}}, + } + ) + # The registry stores the class; verify by instantiating directly. + adapter_class = orchestrator.registry._adapters["azure_devops"] + assert adapter_class is AzureDevOpsAdapter + + adapter = adapter_class( + { + "organization_url": "https://dev.azure.com/myorg", + "personal_access_token": "fake-pat", + } + ) + assert isinstance(adapter, AzureDevOpsAdapter) + assert adapter.platform_name == "azure_devops"