Skip to content

Commit

Permalink
storage controller: handle legacy TenantConf in consistency_check (#1…
Browse files Browse the repository at this point in the history
…0422)

## Problem

We were comparing serialized configs from the database with serialized
configs from memory. If fields have been added/removed to TenantConfig,
this generates spurious consistency errors. This is fine in test
environments, but limits the usefulness of this debug API in the field.

Closes: #10369

## Summary of changes

- Do a decode/encode cycle on the config before comparing it, so that it
will have exactly the expected fields.
  • Loading branch information
jcsp authored Jan 16, 2025
1 parent cccc196 commit 2e13a3a
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions storage_controller/src/service.rs
Original file line number Diff line number Diff line change
Expand Up @@ -5411,6 +5411,15 @@ impl Service {

expect_shards.sort_by_key(|tsp| (tsp.tenant_id.clone(), tsp.shard_number, tsp.shard_count));

// Because JSON contents of persistent tenants might disagree with the fields in current `TenantConfig`
// definition, we will do an encode/decode cycle to ensure any legacy fields are dropped and any new
// fields are added, before doing a comparison.
for tsp in &mut persistent_shards {
let config: TenantConfig = serde_json::from_str(&tsp.config)
.map_err(|e| ApiError::InternalServerError(e.into()))?;
tsp.config = serde_json::to_string(&config).expect("Encoding config is infallible");
}

if persistent_shards != expect_shards {
tracing::error!("Consistency check failed on shards.");

Expand Down

1 comment on commit 2e13a3a

@github-actions
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7477 tests run: 7088 passed, 0 failed, 389 skipped (full report)


Flaky tests (4)

Postgres 17

  • test_storage_controller_node_deletion[True]: debug-x86-64
  • test_storage_controller_node_deletion[False]: debug-x86-64
  • test_create_churn_during_restart: release-arm64
  • test_delete_timeline_exercise_crash_safety_failpoints[Check.RETRY_WITHOUT_RESTART-timeline-delete-before-rm]: release-arm64

Code coverage* (full report)

  • functions: 33.7% (8427 of 25030 functions)
  • lines: 49.2% (70467 of 143341 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
2e13a3a at 2025-01-16T19:23:15.522Z :recycle:

Please sign in to comment.