feat(correlation): add workflow-aware confidence scoring by cerencamkiran · Pull Request #2762 · Tracer-Cloud/opensre

cerencamkiran · 2026-06-05T17:29:28Z

Summary

Adds workflow-aware reasoning and shared confidence scoring to the correlation pipeline.

Previously, correlation ranking relied primarily on time-window correlation, topology adjacency, periodicity, and operator hints. This PR introduces a feature/workflow hypothesis layer that allows the system to associate likely initiating features or workflows with correlated candidates and explain why a candidate was ranked highly.

The implementation also adds a lightweight file-based configuration mechanism for endpoint-to-feature mapping, feature-to-service mapping, and optional operator hints such as recently shipped features or scheduled workflows.

What Changed

Shared Confidence Scoring

Added a new shared confidence model that aggregates evidence from:

Correlation evidence
Topology adjacency evidence
Periodicity evidence
Feature/workflow hypothesis evidence

Each contribution includes:

Source
Score
Weight
Rationale

The final runtime payload now includes:

confidence_label
evidence_breakdown

These fields let downstream consumers see what drove the ranking decision.
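To make the aggregation concrete, here is a minimal sketch of a weighted evidence model with the four fields listed above. It is not the PR's code: the names `EvidenceContribution` and `sketch_shared_confidence`, the label thresholds, and every weight except the 15% feature_workflow weight mentioned in review are assumptions for illustration. The normalisation by total weight follows the review's description of confidence.py.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceContribution:
    source: str      # which evidence layer produced this contribution
    score: float     # evidence strength in [0, 1]
    weight: float    # relative importance of the source
    rationale: str   # human-readable explanation


def sketch_shared_confidence(contributions):
    """Return (score, label, breakdown) for a list of contributions."""
    total_weight = sum(c.weight for c in contributions)
    if total_weight <= 0:
        score = 0.0
    else:
        # Weighted mean, normalised by the total weight.
        score = sum(c.score * c.weight for c in contributions) / total_weight
    # Hypothetical thresholds; the PR's actual cut-offs are not shown in the source.
    label = "high" if score >= 0.75 else "medium" if score >= 0.4 else "low"
    breakdown = [
        {"source": c.source, "score": c.score, "weight": c.weight, "rationale": c.rationale}
        for c in contributions
    ]
    return score, label, breakdown


# Hypothetical weights, except feature_workflow (15% per the review comment).
contributions = [
    EvidenceContribution("correlation", 0.8, 0.45, "spike aligned within the time window"),
    EvidenceContribution("topology", 1.0, 0.25, "direct upstream dependency"),
    EvidenceContribution("periodicity", 0.0, 0.15, "no periodic pattern detected"),
    EvidenceContribution("feature_workflow", 1.0, 0.15, "matched a workflow operator hint"),
]
score, label, breakdown = sketch_shared_confidence(contributions)
```

Because the label and the ranking score come from the same function, the two cannot drift apart, which is the property the review highlights.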

Feature / Workflow Hypothesis Layer

Added feature workflow scoring that:

Matches candidate services against workflow-related operator hints
Produces feature/workflow evidence
Contributes to shared confidence scoring
Generates explainable rationale for ranked candidates

File-Based Feature Configuration

Added a lightweight YAML configuration layer supporting:

Endpoint → feature tags
Feature → service mapping
Optional operator hints

This keeps workflow attribution configurable without requiring code changes.
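As an illustration of the two-step lookup, the sketch below resolves an endpoint to feature tags and then to services. The dictionary stands in for what a parsed YAML file might look like; the key names (`endpoint_features`, `feature_services`, `operator_hints`), the example values, and the helper `resolve_keywords` are all hypothetical and not taken from the PR.

```python
# Hypothetical shape of the parsed configuration; the real schema may differ.
EXAMPLE_CONFIG = {
    "endpoint_features": {
        "/api/checkout": ["checkout"],
        "/api/search": ["search"],
    },
    "feature_services": {
        "checkout": ["payments-service", "orders-db"],
        "search": ["search-index"],
    },
    "operator_hints": ["checkout feature shipped recently"],
}


def resolve_keywords(config, endpoint):
    """Collect the feature tags for an endpoint plus the services mapped to them."""
    features = config.get("endpoint_features", {}).get(endpoint, [])
    keywords = []
    for feature in features:
        keywords.append(feature)
        keywords.extend(config.get("feature_services", {}).get(feature, []))
    # Deduplicate while preserving insertion order.
    return tuple(dict.fromkeys(keywords))


checkout_keywords = resolve_keywords(EXAMPLE_CONFIG, "/api/checkout")
unknown_keywords = resolve_keywords(EXAMPLE_CONFIG, "/api/unknown")
```

An unmapped endpoint yields an empty tuple, so a missing or partial config simply contributes no feature evidence.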

Tests

Added coverage for:

Shared confidence evidence breakdown
Feature/workflow hypothesis scoring
File-based configuration loading
Endpoint-to-feature resolution
Feature-to-service resolution
Operator hint influence on candidate ranking

Validation

ruff format app/agent/correlation tests/synthetic/rds_postgres/correlation
ruff check app/agent/correlation tests/synthetic/rds_postgres/correlation

pytest tests/synthetic/rds_postgres/correlation \
       tests/synthetic/rds_postgres/test_observation_correlation.py -q

Result:

24 passed

github-actions · 2026-06-05T17:29:39Z

Greptile code review

This repo uses Greptile for automated review. Before merge, aim for Confidence Score: 5/5 with zero unresolved review threads — see CONTRIBUTING.md.

Run a review — add a PR comment with:

@greptile review

Give it ~5-10 minutes (sometimes longer) for results, then fix feedback and re-trigger until you reach Confidence Score: 5/5.

Optional: automate with the greploop skill.

greptile-apps · 2026-06-05T17:33:22Z

Greptile Summary

This PR introduces workflow-aware confidence scoring to the correlation pipeline by adding a SharedConfidence model, a feature/workflow hypothesis scorer, and a YAML-based feature config layer. The divergent weight formula issue from the previous review cycle is resolved: final_confidence now derives directly from shared_confidence.score.

Shared confidence model (confidence.py, scoring.py): build_shared_confidence replaces the inline weighted sum, unifying ranking and label computation under a single formula with four named evidence contributions.
Feature/workflow layer (feature_config.py, feature_workflow.py, runtime.py): Endpoint-to-feature and feature-to-service mappings are loaded from a YAML file specified via OPENSRE_FEATURE_WORKFLOW_CONFIG; the resulting keywords are merged with metric-name tokens before scoring.
Runtime wiring (runtime.py): _runtime_feature_keywords is called once per upstream metric inside the loop, re-reading the config file on every iteration; loading should happen once before the loop. The evidence_breakdown field added to UpstreamCandidate uses tuple[dict[str, object], ...] inside a frozen=True dataclass, which silently breaks hashability when the tuple is non-empty.
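The hashability point can be reproduced in isolation. In the sketch below, `Candidate` is a hypothetical stand-in for UpstreamCandidate, not the PR's class. A frozen dataclass gets a generated `__hash__` that hashes its fields, so a tuple field holding dicts only fails once it is populated. Storing each entry as a tuple of key/value pairs is one possible fix, shown here as an assumption rather than a recommendation from the review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    evidence_breakdown: tuple = ()


def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# Empty tuple: hashing works, which is why the problem stays hidden.
empty = Candidate("orders-db")

# Tuple of dicts: hashing fails because dict is unhashable.
populated = Candidate("orders-db", ({"source": "topology", "score": 1.0},))

# Tuple of (key, value) pairs: hashable again, at the cost of dict-style access.
frozen_evidence = Candidate("orders-db", ((("source", "topology"), ("score", 1.0)),))
```

The failure only appears when something places a populated candidate in a set or uses it as a dict key, which matches the review's description of the issue as latent.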

Confidence Score: 5/5

Safe to merge; changes are additive and the findings do not affect current runtime behaviour.

All findings are style and performance observations with no current runtime breakage. The redundant YAML re-reads degrade performance but do not produce wrong results. The frozen-dataclass-with-dict issue is latent — nothing in the current call graph hashes an UpstreamCandidate with a populated evidence_breakdown.

app/agent/correlation/runtime.py (config loaded per metric) and app/agent/correlation/models.py (evidence_breakdown field type) are worth a follow-up, but neither blocks merging.

Important Files Changed

Filename	Overview
app/agent/correlation/confidence.py	New shared confidence model; correctly normalises weighted scores by total_weight and assigns labels — clean implementation.
app/agent/correlation/feature_config.py	New YAML config loader and keyword resolver; logic is sound but FeatureWorkflowConfig is declared frozen=True despite holding mutable dict fields.
app/agent/correlation/feature_workflow.py	New feature/workflow hypothesis scorer; binary 0/1 scoring is a known limitation already tracked.
app/agent/correlation/models.py	Adds confidence_label and evidence_breakdown to UpstreamCandidate; evidence_breakdown uses tuple[dict] which breaks hashability of the frozen dataclass when populated.
app/agent/correlation/runtime.py	Wires feature config and workflow scoring into the correlation pipeline; _runtime_feature_keywords reads and parses YAML once per metric in the inner loop instead of once per build_runtime_correlation call.
app/agent/correlation/scoring.py	Replaces separate final_confidence formula with shared_confidence.score, resolving the previously noted weight divergence; operator_hint_score removed cleanly.
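The runtime.py finding amounts to hoisting a loop-invariant load. The sketch below contrasts the two shapes using a counter in place of real file I/O; `load_config`, `keywords_per_metric`, and `keywords_loaded_once` are hypothetical names, and the returned config is made-up data rather than the PR's schema.

```python
def load_config(path, counter):
    """Stand-in for reading and parsing a YAML file; counts how often it runs."""
    counter["loads"] += 1
    return {"feature_services": {"checkout": ["orders-db"]}}


def keywords_per_metric(metrics, path, counter):
    # Shape flagged in the review: the config is re-read for every metric.
    results = {}
    for metric in metrics:
        config = load_config(path, counter)
        results[metric] = tuple(config["feature_services"].get("checkout", []))
    return results


def keywords_loaded_once(metrics, path, counter):
    # Hoisted variant: read once before the loop and reuse the result.
    config = load_config(path, counter)
    keywords = tuple(config["feature_services"].get("checkout", []))
    return {metric: keywords for metric in metrics}


metrics = ["cpu_utilization", "db_connections", "read_latency"]
per_metric_counter = {"loads": 0}
once_counter = {"loads": 0}
per_metric_result = keywords_per_metric(metrics, "features.yaml", per_metric_counter)
once_result = keywords_loaded_once(metrics, "features.yaml", once_counter)
```

Both variants produce the same keywords; only the number of loads differs, which is why the review treats this as a performance observation rather than a correctness bug.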

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[build_runtime_correlation] --> B[Extract endpoint hint]
    A --> C[For each upstream metric]
    C --> D[score_time_window_correlation]
    C --> E[score_topology_adjacency]
    C --> F[score_periodic_spikes]
    C --> G[_runtime_feature_keywords\nreads YAML per iteration]
    G --> H{OPENSRE_FEATURE_WORKFLOW_CONFIG set?}
    H -- Yes --> I[load_feature_workflow_config]
    H -- No --> J[return empty tuple]
    I --> K[resolve_feature_keywords]
    K --> L[candidate_keywords]
    L --> M[score_feature_workflow_hypothesis]
    D & E & F & M --> N[score_candidate_correlation]
    N --> O[build_shared_confidence]
    O --> P[UpstreamCandidate with confidence_label and evidence_breakdown]
    P --> Q[rank_upstream_candidates]
    Q --> R[correlation_report_to_payload]

Reviews (4): Last reviewed commit: "feat(correlation): add workflow-aware co..."

greptile-apps · 2026-06-05T17:33:29Z

+        hint for hint in matched_hints if "workflow" in hint.lower() or "scheduled" in hint.lower()
+    )
+
+    score = 1.0 if matched_hints else 0.0


Binary 0/1 scoring inflates confidence for common metric name tokens

score is set to 1.0 whenever any candidate_keyword appears anywhere in any operator hint string, so a single match earns the full 15% feature_workflow weight. Short or generic tokens extracted from metric names (e.g., "web", "api", "rds") can easily substring-match loosely written hint strings, yielding a full 1.0 score for what is effectively a partial match. Consider a proportional score (e.g., len(matched_hints) / len(operator_hints)) or a minimum keyword length guard stricter than the existing len(token) > 2 filter.
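One way the suggestion could look, combining both ideas, is sketched below. The function name `proportional_hint_score` and the `MIN_KEYWORD_LENGTH` value are assumptions for illustration; the substring matching mirrors the behaviour described in this comment rather than the PR's exact code.

```python
MIN_KEYWORD_LENGTH = 4  # hypothetical guard, stricter than len(token) > 2


def proportional_hint_score(candidate_keywords, operator_hints):
    """Fraction of operator hints matched by at least one sufficiently long keyword."""
    if not operator_hints:
        return 0.0
    keywords = [k.lower() for k in candidate_keywords if len(k) >= MIN_KEYWORD_LENGTH]
    matched_hints = [
        hint
        for hint in operator_hints
        if any(keyword in hint.lower() for keyword in keywords)
    ]
    return len(matched_hints) / len(operator_hints)


hints = ["checkout workflow shipped", "rapid deploys", "scheduled billing job"]

# "api" would substring-match "rapid", but the length guard filters it out.
guarded = proportional_hint_score(["checkout", "api"], hints)
generic_only = proportional_hint_score(["api"], hints)
```

With this shape, a candidate that matches one hint out of three contributes a third of the feature_workflow weight instead of all of it.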

cerencamkiran · 2026-06-05T18:39:41Z

@greptile review

feat(correlation): add workflow-aware confidence scoring

9285a51

greptile-apps Bot reviewed Jun 5, 2026

cerencamkiran marked this pull request as draft June 5, 2026 17:34

feat(correlation): add workflow-aware confidence scoring

918a729

greptile-apps Bot reviewed Jun 5, 2026

Comment thread app/agent/correlation/runtime.py Outdated

greptile-apps Bot reviewed Jun 5, 2026

Comment thread app/agent/correlation/runtime.py

feat(correlation): add workflow-aware confidence scoring

f68ce16

cerencamkiran force-pushed the feat/workflow-aware-confidence branch from dd4bcd0 to f68ce16 Compare June 5, 2026 18:39

cerencamkiran wants to merge 3 commits into Tracer-Cloud:main from cerencamkiran:feat/workflow-aware-confidence
