Skip to content

Telemetry: Child group spans aren't properly nested #172

@rubberduck203

Description

@rubberduck203

Expected behavior

When one ScopeDoctorGroup needs another, I would expect the outermost group span to be a parent of the group(s) it imports.
For example:

scope doctor run    ==========================================
    doctor run             ==================================
        group (outer)           ====================
        group (inner)              =====
        group (sibling)                  =====

This would let us properly time each group and quickly see which groups are taking longest to run.

Actual Behavior

The parent group comes after its children at a sibling level.

Screenshot 2024-12-09 at 09 21 16

MVCE (Minimal Verifiable Complete Example)

zip archive for convenience

 unzip scope-telemetry-mvce.zip
 cd scope-telemetry-mvce
 docker compose up -d
 open http://localhost:16686
 SCOPE_OTEL_ENDPOINT=http://localhost:4318 scope doctor run

Directory structure

tree -a
.
├── .scope
│   ├── child-group.yaml
│   └── mvce.yaml
└── docker-compose.yml

mvce.yaml

apiVersion: scope.github.com/v1alpha
kind: ScopeDoctorGroup
metadata:
  name: mvce
spec:
  needs:
    - child-group
  actions: []

child-group.yaml

apiVersion: scope.github.com/v1alpha
kind: ScopeDoctorGroup
metadata:
  name: child-group
spec:
  include: when-required
  needs: []
  actions: []

docker-compose.yaml

name: jaeger
services:
    all-in-one:
        container_name: jaeger
        environment:
            - COLLECTOR_ZIPKIN_HOST_PORT=:9411
            - COLLECTOR_OTLP_ENABLED=true
        ports:
            - 6831:6831/udp
            - 6832:6832/udp
            - 5778:5778
            - 16686:16686
            - 4317:4317
            - 4318:4318
            - 14250:14250
            - 14268:14268
            - 14269:14269
            - 9411:9411
        image: jaegertracing/all-in-one:latest

Proposed Fix

The runner always sets the parent span for all groups to doctor run.

let group_span = info_span!(parent: &header_span, "group", "indicatif.pb_show" = true, "group.name" = group_name);

And the groups are passed as a Vec

async fn run_path(&self, groups: Vec<&GroupActionContainer<T>>) -> Result<PathRunResult> {

In order to fix this, we would need to preserve the tree of groups and ensure that only root nodes have their parent set to doctor run while child groups properly point to their parent group.

It's worth noting that it's actually a forest, as multiple root nodes are currently possible.

I'm sure there are other benefits to preserving this structure in the code that I haven't thought of.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions