feat(correlation): add workflow-aware confidence scoring #2762
cerencamkiran wants to merge 3 commits into
Greptile Summary
This PR introduces workflow-aware confidence scoring to the correlation pipeline by adding a shared
Confidence Score: 5/5
Safe to merge; changes are additive and the findings do not affect current runtime behaviour.
All findings are style and performance observations with no current runtime breakage. The redundant YAML re-reads degrade performance but do not produce wrong results. The frozen-dataclass-with-dict issue is latent: nothing in the current call graph hashes an UpstreamCandidate with a populated evidence_breakdown.
app/agent/correlation/runtime.py (config loaded per metric) and app/agent/correlation/models.py (evidence_breakdown field type) are worth a follow-up, but neither blocks merging.
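The latent frozen-dataclass issue mentioned above can be reproduced in isolation. The sketch below is illustrative only: `CandidateWithDict` and `CandidateWithTuple` are hypothetical stand-ins, not the PR's actual `UpstreamCandidate`, and the tuple-of-pairs field is just one possible follow-up, not necessarily the one the author will choose.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CandidateWithDict:
    """Hypothetical stand-in: frozen dataclass carrying a dict field."""

    name: str
    # The dict is accepted at construction time, but the generated
    # __hash__ hashes every field, and dicts are unhashable.
    evidence_breakdown: dict = field(default_factory=dict)


@dataclass(frozen=True)
class CandidateWithTuple:
    """Hypothetical alternative: evidence stored as (signal, value) pairs."""

    name: str
    evidence_breakdown: tuple = ()


def is_hashable(obj) -> bool:
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


with_dict = CandidateWithDict("payments-db", {"time_window": 0.4})
with_tuple = CandidateWithTuple("payments-db", (("time_window", 0.4),))

print(is_hashable(with_dict))   # hashing fails on the dict field
print(is_hashable(with_tuple))  # tuple of pairs hashes fine
```

The failure only surfaces when something puts the instance in a set or uses it as a dict key, which is consistent with the review calling the issue latent.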
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[build_runtime_correlation] --> B[Extract endpoint hint]
A --> C[For each upstream metric]
C --> D[score_time_window_correlation]
C --> E[score_topology_adjacency]
C --> F[score_periodic_spikes]
C --> G[_runtime_feature_keywords\nreads YAML per iteration]
G --> H{OPENSRE_FEATURE_WORKFLOW_CONFIG set?}
H -- Yes --> I[load_feature_workflow_config]
H -- No --> J[return empty tuple]
I --> K[resolve_feature_keywords]
K --> L[candidate_keywords]
L --> M[score_feature_workflow_hypothesis]
D & E & F & M --> N[score_candidate_correlation]
N --> O[build_shared_confidence]
O --> P[UpstreamCandidate with confidence_label and evidence_breakdown]
P --> Q[rank_upstream_candidates]
Q --> R[correlation_report_to_payload]
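The flowchart shows the YAML config being read once per upstream metric. One way to avoid the repeated reads is to memoize the loader by path. This is a minimal sketch under stated assumptions: `make_cached_loader` and `fake_read` are hypothetical names, and the real `load_feature_workflow_config` may take different arguments.

```python
from functools import lru_cache
from typing import Callable


def make_cached_loader(read_config: Callable[[str], dict]) -> Callable[[str], dict]:
    """Wrap a config reader so each path is parsed at most once.

    Note: the cached dict is shared between callers, so treat it as
    read-only (or copy it) to avoid cross-request mutation.
    """

    @lru_cache(maxsize=8)
    def load(path: str) -> dict:
        return read_config(path)

    return load


# Hypothetical reader that records how often it is invoked.
calls: list = []


def fake_read(path: str) -> dict:
    calls.append(path)
    return {"features": {"checkout": ["checkout", "payment"]}}


load = make_cached_loader(fake_read)

# Simulate the per-metric loop: the reader runs once, not three times.
for _metric in ("cpu_utilization", "db_connections", "read_iops"):
    config = load("features.yaml")

print(len(calls))
```

An even simpler alternative is to load the config once before the loop and pass it into the per-metric scoring call.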
hint for hint in matched_hints if "workflow" in hint.lower() or "scheduled" in hint.lower()
)

score = 1.0 if matched_hints else 0.0
Binary 0/1 scoring inflates confidence for common metric name tokens
score is set to 1.0 whenever any candidate_keyword appears anywhere in any operator hint string, which accounts for the full 15% feature_workflow weight. Short or generic tokens extracted from metric names (e.g., "web", "api", "rds") can easily substring-match loosely-written hint strings, giving a full 1.0 score for what is effectively a loose partial match. Consider a proportional score (e.g., len(matched_hints) / len(operator_hints)) or a minimum keyword length guard beyond the existing len(token) > 2 filter.
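The two mitigations suggested in this comment can be sketched together. This is not the PR's code: `proportional_hint_score` and its `min_len` default are assumptions chosen for illustration, combining a stricter keyword-length guard with a score proportional to the share of hints matched.

```python
from typing import Sequence


def proportional_hint_score(
    candidate_keywords: Sequence[str],
    operator_hints: Sequence[str],
    min_len: int = 4,  # assumed guard; stricter than the existing len(token) > 2
) -> float:
    """Score in [0, 1]: fraction of operator hints matched by a keyword."""
    keywords = [k.lower() for k in candidate_keywords if len(k) >= min_len]
    if not keywords or not operator_hints:
        return 0.0
    matched = [
        hint
        for hint in operator_hints
        if any(keyword in hint.lower() for keyword in keywords)
    ]
    return len(matched) / len(operator_hints)


hints = ["checkout workflow shipped this week", "rapid scaling event on Tuesday"]

# "api" is a substring of "rapid", so a binary rule would score 1.0 here;
# the length guard drops it and the result is 0.0.
print(proportional_hint_score(["api"], hints))

# "checkout" matches one of the two hints.
print(proportional_hint_score(["api", "checkout"], hints))
```

Whether the denominator should be the number of hints or the number of keywords is a design choice; the comment above proposes the former.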
dd4bcd0 to f68ce16
@greptile review
Fixes #1441
Summary
Adds workflow-aware reasoning and shared confidence scoring to the correlation pipeline.
Previously, correlation ranking relied primarily on time-window correlation, topology adjacency, periodicity, and operator hints. This PR introduces a feature/workflow hypothesis layer that allows the system to associate likely initiating features or workflows with correlated candidates and explain why a candidate was ranked highly.
The implementation also adds a lightweight file-based configuration mechanism for endpoint-to-feature mapping, feature-to-service mapping, and optional operator hints such as recently shipped features or scheduled workflows.
What Changed
Shared Confidence Scoring
Added a new shared confidence model that aggregates evidence from:
Each contribution includes:
The final runtime payload now includes:
confidence_label
evidence_breakdown
allowing downstream consumers to understand what drove the ranking decision.
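To make the shape of such a payload concrete, here is a rough sketch of a weighted aggregation that yields a label and a per-signal breakdown. Only the 15% feature_workflow weight comes from the review discussion; the other weights, the label thresholds, and the name `sketch_shared_confidence` are assumptions and may not match the PR's `build_shared_confidence`.

```python
# Assumed weights; only feature_workflow = 0.15 is stated in the review.
WEIGHTS = {
    "time_window": 0.40,
    "topology": 0.25,
    "periodicity": 0.20,
    "feature_workflow": 0.15,
}


def sketch_shared_confidence(scores: dict) -> dict:
    """Combine per-signal scores in [0, 1] into a labelled confidence."""
    # Each signal's contribution is weight * score; missing signals count as 0.
    breakdown = {
        name: round(weight * scores.get(name, 0.0), 4)
        for name, weight in WEIGHTS.items()
    }
    total = round(sum(breakdown.values()), 4)
    # Thresholds are illustrative, not taken from the PR.
    if total >= 0.75:
        label = "high"
    elif total >= 0.40:
        label = "medium"
    else:
        label = "low"
    return {
        "confidence": total,
        "confidence_label": label,
        "evidence_breakdown": breakdown,
    }


result = sketch_shared_confidence(
    {"time_window": 1.0, "topology": 1.0, "periodicity": 0.0, "feature_workflow": 1.0}
)
print(result["confidence_label"], result["evidence_breakdown"])
```

Keeping the breakdown next to the label lets a consumer see, for example, that a binary feature_workflow match alone contributed 0.15 of the total.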
Feature / Workflow Hypothesis Layer
Added feature workflow scoring that:
File-Based Feature Configuration
Added a lightweight YAML configuration layer supporting:
This keeps workflow attribution configurable without requiring code changes.
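As an illustration of how such a mapping might be consumed, the sketch below uses a plain dict standing in for the parsed YAML. The key names (`endpoint_features`, `feature_services`, `operator_hints`), the sample values, and the helper `keywords_for_endpoint` are all hypothetical; the PR's actual schema and its `resolve_feature_keywords` function may differ.

```python
# Hypothetical structure, as it might look after parsing the YAML file.
CONFIG = {
    "endpoint_features": {"/api/checkout": "checkout"},
    "feature_services": {"checkout": ["payments-db", "orders-api"]},
    "operator_hints": ["checkout workflow shipped this week"],
}


def keywords_for_endpoint(config: dict, endpoint: str) -> tuple:
    """Resolve an endpoint to its feature name plus the services behind it."""
    feature = config.get("endpoint_features", {}).get(endpoint)
    if feature is None:
        # Unknown endpoint: no workflow attribution is attempted.
        return ()
    services = config.get("feature_services", {}).get(feature, [])
    return (feature, *services)


print(keywords_for_endpoint(CONFIG, "/api/checkout"))
print(keywords_for_endpoint(CONFIG, "/api/unknown"))
```

Adding a new feature then only requires a new entry in the config file, which matches the stated goal of avoiding code changes.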
Tests
Added coverage for:
Validation
ruff format app/agent/correlation tests/synthetic/rds_postgres/correlation
ruff check app/agent/correlation tests/synthetic/rds_postgres/correlation
pytest tests/synthetic/rds_postgres/correlation \
  tests/synthetic/rds_postgres/test_observation_correlation.py -q
Result: