Add configurability to set tags and metadata on langfuse traces#53

Open
danchild wants to merge 1 commit into
forge-sdlc:main from
danchild:feature-metadata-tags-config

Contributor

@danchild danchild commented May 21, 2026

Change Summary

  • Remove inline "system_prompt_length"
  • Create the mechanism for passing node state into the agent context so that configured tags and metadata can be written to Langfuse traces

Spec

Langfuse Trace Tags & Metadata Configurability

Date: 2026-05-20
Status: Draft

Problem

Langfuse traces created by Forge carry almost no tags or metadata today. The only metadata set is system_prompt_length, and tags are never populated. This makes it difficult to filter, group, and analyze traces in the Langfuse dashboard by dimensions like project, ticket type, or workflow step.

Solution

A configuration-driven system where admins specify which data fields to include as Langfuse trace tags and metadata via environment variables. Forge automatically resolves the configured fields from workflow state that already flows through the system. There will be no backfilling mechanism — configuration changes only affect traces created after the change.

Architecture

TracingField Enum & Resolver Registry

A TracingField StrEnum in src/forge/integrations/langfuse/fields.py defines every field an admin can configure. Each enum member has:

  • A resolver function that extracts its value from workflow state/context
  • A tag_eligible flag indicating whether the field can be used as a tag

Tag eligibility rule: Categorical data (including numbers used as identifiers like pr_number) can be used as both tags and metadata. Quantitative values meant for math (like retry_count) are metadata-only.

Available Fields

| Field Name | Tag Eligible | Source | Example Value |
|---|---|---|---|
| ticket_key | Yes | state["ticket_key"] | AISOS-104 |
| ticket_type | Yes | state["ticket_type"] | bug |
| project_id | Yes | Extracted from ticket_key (prefix before -) | AISOS |
| workflow_step | Yes | state["current_node"] | human_review_gate |
| repo | Yes | state["current_repo"] | org/repo |
| pr_number | Yes | state["current_pr_number"] | 42 |
| ci_status | Yes | state["ci_status"] | passed |
| event_source | Yes | state["context"]["source"] | jira |
| event_type | Yes | state["event_type"] | issue_updated |
| retry_count | No | state["retry_count"] | 2 |
| system_prompt_length | No | len(system_prompt) at agent invocation time | 4523 |
| llm_model | Yes | settings.claude_model | claude-sonnet-4-6-20250514 |

Each resolver returns str | None. If the data isn't present in the state, the resolver returns None and the field is silently skipped for that trace.
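
As a rough illustration of the enum-plus-registry idea described above, a minimal sketch might look like the following. It covers only four of the fields in the table. The names FieldSpec, FIELD_SPECS, _from_state, _project_id, and tag_eligible_fields are hypothetical and not taken from the PR, and the sketch uses a str/Enum mix-in instead of StrEnum so that it also runs on interpreters older than Python 3.11.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Optional

State = dict[str, Any]
Resolver = Callable[[State], Optional[str]]


class TracingField(str, Enum):
    # The spec calls for StrEnum (Python 3.11+); the mix-in behaves the
    # same for the purposes of this sketch.
    TICKET_KEY = "ticket_key"
    PROJECT_ID = "project_id"
    WORKFLOW_STEP = "workflow_step"
    RETRY_COUNT = "retry_count"


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical container pairing a resolver with its tag eligibility."""
    resolver: Resolver
    tag_eligible: bool


def _from_state(key: str) -> Resolver:
    """Build a resolver that reads one top-level state key as a string."""
    def resolve(state: State) -> Optional[str]:
        value = state.get(key)
        return None if value is None else str(value)
    return resolve


def _project_id(state: State) -> Optional[str]:
    """Prefix of ticket_key before the first '-', e.g. AISOS-104 -> AISOS."""
    ticket_key = state.get("ticket_key")
    if not ticket_key or "-" not in ticket_key:
        return None
    return ticket_key.split("-", 1)[0]


FIELD_SPECS: dict[TracingField, FieldSpec] = {
    TracingField.TICKET_KEY: FieldSpec(_from_state("ticket_key"), tag_eligible=True),
    TracingField.PROJECT_ID: FieldSpec(_project_id, tag_eligible=True),
    TracingField.WORKFLOW_STEP: FieldSpec(_from_state("current_node"), tag_eligible=True),
    # Quantitative value: metadata-only per the tag eligibility rule.
    TracingField.RETRY_COUNT: FieldSpec(_from_state("retry_count"), tag_eligible=False),
}


def tag_eligible_fields() -> list[str]:
    """Names of the fields that may appear in LANGFUSE_TRACE_TAGS."""
    return [field.value for field, spec in FIELD_SPECS.items() if spec.tag_eligible]
```

Keeping eligibility next to the resolver means the config parser can reject a tag-ineligible field without knowing anything about how the value is computed.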

Configuration

Two environment variables following the existing GITHUB_KNOWN_REPOS comma-separated pattern:

LANGFUSE_TRACE_TAGS=ticket_type,project_id,workflow_step
LANGFUSE_TRACE_METADATA=workflow_step,ticket_type,project_id,ticket_key,retry_count

Config Fields in config.py

langfuse_trace_tags: str = Field(
    default="",
    description="Comma-separated list of TracingField names to include as Langfuse trace tags",
)
langfuse_trace_metadata: str = Field(
    default="",
    description="Comma-separated list of TracingField names to include as Langfuse trace metadata",
)

Parsed Properties

@property
def trace_tag_fields(self) -> list[TracingField]:
    ...

@property
def trace_metadata_fields(self) -> list[TracingField]:
    ...

Validation Behavior

Validation is per-field, not all-or-nothing. The app does not crash on invalid configuration.

  1. Parse each comma-separated value from the env var
  2. If a value isn't a valid TracingField name: log a WARNING naming the invalid value and listing available field names, skip it, continue parsing
  3. If a tag-ineligible field (e.g., retry_count) appears in LANGFUSE_TRACE_TAGS: log a WARNING explaining why it was skipped, continue
  4. After all values are parsed, if at least one field was successfully configured: log one INFO line listing the configured tag fields, and one INFO line listing the configured metadata fields
  5. If no fields were successfully configured (empty env var or all values invalid): no INFO line is logged

Example log output:

WARNING  Invalid Langfuse trace tag field 'foobar' - not a recognized field name. Available: ticket_key, ticket_type, project_id, ...
WARNING  Field 'retry_count' is not eligible for tags - skipping
INFO     Langfuse trace tags configured: ticket_type, project_id, workflow_step
INFO     Langfuse trace metadata configured: workflow_step, ticket_type, project_id, ticket_key
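
The per-field validation rules above could be sketched roughly as follows. This is not the PR's implementation: parse_trace_fields, KNOWN_FIELDS, TAG_INELIGIBLE, and the logger name are hypothetical, the field list is a subset of the table, and the sketch assumes the INFO summary is emitted once per environment variable.

```python
import logging

logger = logging.getLogger("forge.langfuse.fields")  # hypothetical logger name

# Subset of the field table, for illustration only.
KNOWN_FIELDS = ("ticket_key", "ticket_type", "project_id", "workflow_step", "retry_count")
TAG_INELIGIBLE = frozenset({"retry_count"})


def parse_trace_fields(raw: str, *, for_tags: bool) -> list[str]:
    """Parse one comma-separated env var value, skipping invalid entries."""
    singular = "tag" if for_tags else "metadata"
    plural = "tags" if for_tags else "metadata"
    accepted: list[str] = []
    for item in raw.split(","):
        name = item.strip()
        if not name:
            continue  # empty env var or stray comma: nothing to validate
        if name not in KNOWN_FIELDS:
            logger.warning(
                "Invalid Langfuse trace %s field %r - not a recognized field name. "
                "Available: %s",
                singular, name, ", ".join(KNOWN_FIELDS),
            )
            continue
        if for_tags and name in TAG_INELIGIBLE:
            logger.warning("Field %r is not eligible for tags - skipping", name)
            continue
        accepted.append(name)
    if accepted:
        # No INFO line when nothing was configured successfully.
        logger.info("Langfuse trace %s configured: %s", plural, ", ".join(accepted))
    return accepted
```

A bad entry never raises here, so a typo in one field name costs that field only and the rest of the configuration still applies.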

Resolver Integration

New Function: resolve_trace_fields()

Located in src/forge/integrations/langfuse/fields.py:

def resolve_trace_fields(state: dict[str, Any]) -> tuple[list[str], dict[str, Any]]:
    """Resolve configured tracing fields from workflow state.

    Returns:
        (tags, metadata) tuple with resolved values. Fields that resolve
        to None are silently omitted.
    """
  1. Reads the configured tag/metadata fields from settings
  2. Calls each field's resolver against the state dict
  3. Returns (tags: list[str], metadata: dict[str, Any])
  4. Tags are raw values (e.g., "bug", "OSASINFRA") — no field-name prefix
  5. Metadata is key-value (e.g., {"ticket_type": "bug", "project_id": "OSASINFRA"})
  6. If a resolver returns None, the field is silently skipped
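
The steps above can be sketched as follows. To stay self-contained, this version takes the configured field names as arguments instead of reading them from settings as the spec describes. The RESOLVERS table is a hypothetical stand-in covering only a few fields.

```python
from typing import Any, Callable, Optional

State = dict[str, Any]

# Hypothetical stand-in for the resolver registry (a few fields only).
RESOLVERS: dict[str, Callable[[State], Optional[str]]] = {
    "ticket_type": lambda s: s.get("ticket_type"),
    "project_id": lambda s: s["ticket_key"].split("-", 1)[0] if s.get("ticket_key") else None,
    "workflow_step": lambda s: s.get("current_node"),
    "ci_status": lambda s: s.get("ci_status"),
}


def resolve_trace_fields(
    state: State,
    tag_fields: list[str],
    metadata_fields: list[str],
) -> tuple[list[str], dict[str, Any]]:
    """Resolve the given fields from state, omitting any that resolve to None."""
    tags: list[str] = []
    for name in tag_fields:
        value = RESOLVERS[name](state)
        if value is not None:
            tags.append(value)  # raw value, no field-name prefix

    metadata: dict[str, Any] = {}
    for name in metadata_fields:
        value = RESOLVERS[name](state)
        if value is not None:
            metadata[name] = value  # keyed by field name
    return tags, metadata
```

For example, with a state that has no ci_status, that field simply drops out of both results:

```python
state = {"ticket_key": "OSASINFRA-7", "ticket_type": "bug", "current_node": "human_review_gate"}
tags, metadata = resolve_trace_fields(
    state,
    tag_fields=["ticket_type", "project_id", "ci_status"],
    metadata_fields=["workflow_step", "ci_status"],
)
# tags == ["bug", "OSASINFRA"]
# metadata == {"workflow_step": "human_review_gate"}
```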

Call Site

The orchestrator worker or workflow nodes call resolve_trace_fields(state) before invoking the agent, passing the results down:

Orchestrator/Node (has full workflow state)
  -> resolve_trace_fields(state) -> (tags, metadata)
  -> agent._run_agent(..., tags=tags, metadata=metadata)
    -> get_langfuse_config(tags=tags, metadata=metadata)
    -> get_langfuse_context(tags=tags)

The existing hardcoded metadata={"system_prompt_length": ...} in _run_agent() is removed. system_prompt_length is now a configurable TracingField like any other — admins who want it add it to LANGFUSE_TRACE_METADATA. Its resolver receives the system prompt length at invocation time rather than reading from workflow state.

Tag Format

Tags are raw values only — no field-name prefix. Examples: "bug", "OSASINFRA", "human_review_gate". This matches the format shown in the Langfuse dashboard.

Default Behavior

When LANGFUSE_TRACE_TAGS and LANGFUSE_TRACE_METADATA are empty (the default), no tags or metadata are attached to traces.

Files Changed

| File | Change |
|---|---|
| src/forge/integrations/langfuse/fields.py | New file: TracingField enum, resolver functions, resolve_trace_fields() |
| src/forge/integrations/langfuse/__init__.py | Export new public symbols |
| src/forge/config.py | Add langfuse_trace_tags, langfuse_trace_metadata fields and parsed properties |
| src/forge/integrations/langfuse/tracing.py | Update get_langfuse_config() and get_langfuse_context() signatures to accept resolved tags/metadata |
| src/forge/integrations/agents/agent.py | Accept and pass through resolved tags/metadata in _run_agent() |
| src/forge/orchestrator/worker.py | Call resolve_trace_fields(state) before agent invocation |
| tests/unit/test_langfuse_fields.py | New file: tests for field resolvers, config validation, and resolve_trace_fields() |

Testing

Unit Tests: Field Resolvers

Each TracingField resolver tested with:

  • State dict containing the relevant data -> returns expected value
  • State dict without the data -> returns None

Unit Tests: Config Validation

  • Invalid field names produce warnings and get skipped
  • Tag-ineligible fields in LANGFUSE_TRACE_TAGS produce warnings and get skipped
  • Valid configs parse to correct TracingField lists
  • Empty configs produce no INFO logs
  • Successful configs produce the correct INFO summary lines

Integration Test: resolve_trace_fields()

Given a realistic workflow state dict and configured fields, verify the correct (tags, metadata) tuple is returned with missing fields silently omitted.
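
One possible shape for the config validation tests is sketched below, using the standard library's unittest and assertLogs to check that an invalid name produces exactly one warning and is skipped. The parse helper is a deliberately tiny stand-in for the real parser, and the logger name and VALID set are assumptions. The project's actual tests may well use pytest instead.

```python
import logging
import unittest

logger = logging.getLogger("forge.langfuse.fields")  # hypothetical logger name
VALID = {"ticket_type", "project_id"}  # stand-in for the TracingField names


def parse(raw: str) -> list[str]:
    """Minimal stand-in parser: keep valid names, warn on the rest."""
    fields: list[str] = []
    for name in (part.strip() for part in raw.split(",")):
        if not name:
            continue
        if name in VALID:
            fields.append(name)
        else:
            logger.warning("Invalid Langfuse trace tag field %r", name)
    return fields


class ConfigValidationTest(unittest.TestCase):
    def test_invalid_name_warns_and_is_skipped(self) -> None:
        with self.assertLogs(logger, level="WARNING") as captured:
            result = parse("ticket_type,foobar")
        self.assertEqual(result, ["ticket_type"])
        self.assertEqual(len(captured.records), 1)
        self.assertIn("foobar", captured.output[0])

    def test_valid_config_parses_in_order(self) -> None:
        self.assertEqual(parse("ticket_type, project_id"), ["ticket_type", "project_id"])
```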

Non-Goals

  • Backfilling existing traces with new tags/metadata
  • Custom/arbitrary key-value pairs beyond the predefined field set
  • Per-workflow or per-node field configuration (all traces use the same config)
  • UI for managing the configuration (env vars only)

@danchild danchild force-pushed the feature-metadata-tags-config branch 2 times, most recently from eedb5ca to 34b8135 Compare May 26, 2026 19:38
- Remove inline "system_prompt_length"
- Create the mechanism to pass node state to context to allow writing
  configured metadata and traces to langfuse traces

Signed-off-by: Dan Childers <dchilder@redhat.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@danchild danchild force-pushed the feature-metadata-tags-config branch from 34b8135 to bc14b93 Compare May 26, 2026 19:58
@eshulman2
Collaborator

Review Notes

Overall

Good architecture — the TracingField enum + resolver pattern is clean, the config parsing with validation is solid, and test coverage is comprehensive. A few things to address before merge:

1. Bug workflow nodes are not covered

The PR only enriches context in feature workflow nodes (prd_generation, spec_generation, epic_decomposition, task_generation, pr_creation, qa_handler, code_review).

The bug workflow nodes that invoke the agent were not updated:

  • triage.py — only passes context={"ticket_key": ticket_key}, missing all other trace fields
  • rca_analysis.py — invokes the agent via ContainerRunner, different code path entirely
  • plan_bug_fix.py — same, uses ContainerRunner

If someone configures LANGFUSE_TRACE_TAGS=ticket_type,workflow_step, bug workflow traces will be missing those tags.

2. The per-node enrichment approach is fragile

The current design requires every node that calls the agent to manually build a ~6-line context dict:

context = {
    "ticket_key": ticket_key,
    "ticket_type": state.get("ticket_type", ""),
    "current_node": state.get("current_node", ""),
    "event_type": state.get("event_type", ""),
    "event_source": state.get("context", {}).get("source", ""),
    "retry_count": state.get("retry_count", 0),
}

This is copy-pasted into 7 files and will need to be added to every future node that calls the agent. If someone forgets (as happened with the bug workflow nodes), traces from that node get no tags/metadata.

Suggested alternative: Resolve the trace fields once in the orchestrator worker — it already has the full workflow state at invocation time — and store the resolved (tags, metadata) in the state dict or pass them via the LangGraph config. The agent's run_task() would then pick them up automatically without any node needing to know about tracing. This would:

  • Eliminate the per-node boilerplate
  • Cover all nodes automatically (including future ones)
  • Remove the risk of forgetting to add trace context to new nodes
  • Work with ContainerRunner-based nodes without changes
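
To illustrate the shape of this suggestion (not a drop-in patch), the sketch below resolves trace fields once in a worker-level hook and merges them into a config dict that every node receives. All names here (with_trace_fields, run_node, the stand-in resolve_trace_fields) are hypothetical. The "tags" and "metadata" keys are chosen to mirror the keys of the same name in LangChain's RunnableConfig, but the config is a plain dict here and how the worker actually invokes nodes is an assumption.

```python
from typing import Any, Callable

State = dict[str, Any]
Config = dict[str, Any]


def resolve_trace_fields(state: State) -> tuple[list[str], dict[str, Any]]:
    """Stand-in for the PR's resolver; resolves two fields for illustration."""
    metadata = {
        key: str(state[source])
        for key, source in (("ticket_type", "ticket_type"), ("workflow_step", "current_node"))
        if state.get(source) is not None
    }
    return list(metadata.values()), metadata


def with_trace_fields(config: Config, state: State) -> Config:
    """Return a copy of config with the resolved tags/metadata merged in."""
    tags, metadata = resolve_trace_fields(state)
    merged = dict(config)
    merged["tags"] = [*config.get("tags", []), *tags]
    merged["metadata"] = {**config.get("metadata", {}), **metadata}
    return merged


def run_node(node: Callable[[State, Config], Any], state: State, config: Config) -> Any:
    """Hypothetical worker hook: nodes receive enriched config automatically."""
    return node(state, with_trace_fields(config, state))
```

With something along these lines, a node such as triage would not need to build a tracing context dict at all; it would only need to forward the config it was given.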

3. Field naming inconsistency

TracingField.WORKFLOW_STEP resolves by reading state["current_node"] but produces a metadata key called "workflow_step". The same applies to current_repo -> repo and current_pr_number -> pr_number. The resolvers work correctly, but the mismatch between config names, Langfuse keys, and actual state keys is confusing. Consider naming TracingField members to match their state key names (e.g., CURRENT_NODE instead of WORKFLOW_STEP).
