Skip to content

Conversation

@xiaods
Copy link

@xiaods xiaods commented Oct 18, 2025

This fixes issue #202 where metrics would disappear from the repo's aggregated /metrics endpoint after updating metric definitions to add or remove labels.

Root Cause:
In MetricDataSum::sum(), the schema validation (type, shape, dimensions) was only performed when initial=true. This meant that when a metric's label structure changed at runtime (e.g., from ['label_one'] to ['label_one', 'label_two']), the shape mismatch was not detected during regular aggregation cycles.

The Bug (lines 832-836):

if (!ent || (initial && (
  ent->type != type ||
  ent->shape != shape ||
  ent->dimensions != e->dimensions
))) {

When initial=false, schema changes were silently ignored, causing:

  • Metrics with new schemas to be aggregated into entries with old schemas
  • Mismatched dimensions leading to incorrect aggregation
  • Metrics appearing in worker /metrics but missing from repo /metrics

The Fix:
Always validate schema consistency, not just during initial aggregation:

if (!ent ||
  ent->type != type ||
  ent->shape != shape ||
  ent->dimensions != e->dimensions
) {

Now when a metric's schema changes, the entry is properly recreated with the new schema, ensuring metrics are correctly aggregated and visible in the repo's /metrics endpoint.

Testing:
Reproduced with the test case in issue description:

  1. Create metric with one label
  2. Update to two labels
  3. Verify metric now appears in repo /metrics (previously disappeared)

Note: MetricHistory already had the correct implementation (lines 1117-1120, 1171-1174) and did not require changes.

🤖 Generated with Claude Code

We thank you for helping improve Pipy. In order to ease the reviewing process, we invite you to read the guidelines and ask you to consider the following points before submitting a PR:

  1. We prefer to discuss the underlying issue prior to discussing the code. Therefore, we kindly ask you to refer to an existing issue, or an existing discussion in a public space with members of the Core Team. In few cases, we acknowledge that this might not be necessary, for instance when refactoring code or small bug fixes. In this case, the PR must include the same information an issue would have: a clear explanation of the issue, reproducible code, etc.

  2. Focus the PR to the referred issue, and restraint from adding unrelated changes/additions. We do welcome another PR if you fixed another issue.

  3. If your change is big, consider breaking it into several smaller PRs. In general, the smaller the change, the quicker we can review it.

This fixes issue flomesh-io#224 where metrics would disappear from the repo's
aggregated /metrics endpoint after updating metric definitions to add
or remove labels.

**Root Cause:**
In MetricDataSum::sum(), the schema validation (type, shape, dimensions)
was only performed when initial=true. This meant that when a metric's
label structure changed at runtime (e.g., from ['label_one'] to
['label_one', 'label_two']), the shape mismatch was not detected during
regular aggregation cycles.

**The Bug (lines 832-836):**
```cpp
if (!ent || (initial && (
  ent->type != type ||
  ent->shape != shape ||
  ent->dimensions != e->dimensions
))) {
```

When initial=false, schema changes were silently ignored, causing:
- Metrics with new schemas to be aggregated into entries with old schemas
- Mismatched dimensions leading to incorrect aggregation
- Metrics appearing in worker /metrics but missing from repo /metrics

**The Fix:**
Always validate schema consistency, not just during initial aggregation:
```cpp
if (!ent ||
  ent->type != type ||
  ent->shape != shape ||
  ent->dimensions != e->dimensions
) {
```

Now when a metric's schema changes, the entry is properly recreated with
the new schema, ensuring metrics are correctly aggregated and visible in
the repo's /metrics endpoint.

**Testing:**
Reproduced with the test case in issue description:
1. Create metric with one label
2. Update to two labels
3. Verify metric now appears in repo /metrics (previously disappeared)

Note: MetricHistory already had the correct implementation (lines 1117-1120,
1171-1174) and did not require changes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant