Skip to content

feat(sys): add operational observability tools (leases, metrics, replication, cluster health, host info)#86

Open
czembower wants to merge 3 commits into
hashicorp:mainfrom
czembower:pr/sys-ops-observability
Open

feat(sys): add operational observability tools (leases, metrics, replication, cluster health, host info)#86
czembower wants to merge 3 commits into
hashicorp:mainfrom
czembower:pr/sys-ops-observability

Conversation

@czembower

Copy link
Copy Markdown

Adds new tools:

  • list_leases
  • read_lease
  • read_metrics
  • read_replication_status
  • read_cluster_health
  • read_host_info

Registers all tools.

@czembower czembower requested a review from a team as a code owner March 5, 2026 21:37
@hashneo hashneo requested a review from Copilot March 9, 2026 22:44

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a set of new “sys” operational observability MCP tools to the Vault MCP server, and registers them in the tool initializer to support diagnosing cluster/replication health and inspecting leases/host/metrics.

Changes:

  • Register new sys observability tools in InitTools.
  • Add tools to read replication status, cluster health, telemetry metrics, and host info.
  • Add tools to list leases and look up an individual lease.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
pkg/tools/tools.go Registers the newly added sys observability tools with the MCP server.
pkg/tools/sys/read_replication_status.go Adds a tool to fetch and format replication status for performance/DR replication.
pkg/tools/sys/read_cluster_health.go Adds a tool aggregating HA status, Raft autopilot state/config, and seal backend status.
pkg/tools/sys/read_metrics.go Adds a tool intended to read Vault telemetry metrics.
pkg/tools/sys/read_host_info.go Adds a tool to read Vault host diagnostics info.
pkg/tools/sys/list_leases.go Adds a tool to list leases under an optional prefix.
pkg/tools/sys/read_lease.go Adds a tool to look up an individual lease’s details by lease ID.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +48 to +52
// Query the sys/metrics endpoint (returns JSON by default)
secret, err := vault.Logical().Read("sys/metrics")
if err != nil {
logger.WithError(err).Error("Failed to read metrics")
return mcp.NewToolResultError(fmt.Sprintf("Failed to read metrics: %v", err)), nil

Copilot AI Mar 9, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sys/metrics does not reliably return a JSON body; it commonly returns Prometheus/plain-text unless a format=json query parameter is provided. Using vault.Logical().Read("sys/metrics") will fail JSON decoding in those cases, making this tool unusable on many Vault configurations. Consider switching to Logical().ReadWithData (or a raw request) and explicitly requesting JSON (and/or allow a format argument so callers can choose JSON vs Prometheus).

Copilot uses AI. Check for mistakes.
Comment on lines +79 to +80
// No leases at this path
return mcp.NewToolResultText("No leases found at this path"), nil

Copilot AI Mar 9, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "no leases" branch returns a human string, while other list tools in this repo return JSON (e.g., list_secrets returns []). Returning non-JSON here makes the output harder to consume programmatically and inconsistent with existing tool behavior. Consider returning a JSON empty list (or { "keys": [] }) instead.

Suggested change
// No leases at this path
return mcp.NewToolResultText("No leases found at this path"), nil
// No leases at this path; return an empty JSON array for consistency with other list tools
return mcp.NewToolResultText("[]"), nil

Copilot uses AI. Check for mistakes.
Comment on lines +83 to +84
// Format the response
jsonData, err := json.MarshalIndent(secret.Data, "", " ")

Copilot AI Mar 9, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list response is marshaled as the full secret.Data map. Other list tools typically extract and return just the key list (e.g., secret.Data["keys"]) as JSON. Consider extracting the keys field and returning a JSON array for consistency with list_secrets/list_pki_roles patterns.

Suggested change
// Format the response
jsonData, err := json.MarshalIndent(secret.Data, "", " ")
// Format the response: extract and return only the "keys" field for consistency
keysVal, ok := secret.Data["keys"]
if !ok {
// No "keys" field; return an empty list to keep the response shape consistent
keysVal = []string{}
}
// Normalize keysVal into something JSON-marshalable, preferring []string
var (
marshalTarget interface{}
keysSlice []string
)
switch v := keysVal.(type) {
case []string:
marshalTarget = v
case []interface{}:
keysSlice = make([]string, 0, len(v))
for _, item := range v {
keysSlice = append(keysSlice, fmt.Sprint(item))
}
marshalTarget = keysSlice
default:
// Fallback: wrap single value or unexpected type into a one-element string slice
if v != nil {
marshalTarget = []string{fmt.Sprint(v)}
} else {
marshalTarget = []string{}
}
}
jsonData, err := json.MarshalIndent(marshalTarget, "", " ")

Copilot uses AI. Check for mistakes.
Comment on lines +139 to +149
if lastWAL, ok := data["last_wal"].(float64); ok {
status.LastWAL = uint64(lastWAL)
}

if merkleRoot, ok := data["merkle_root"].(string); ok {
status.MerkleRoot = merkleRoot
}

if lastRemoteWAL, ok := data["last_remote_wal"].(float64); ok {
status.LastRemoteWAL = uint64(lastRemoteWAL)
}

Copilot AI Mar 9, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Casting JSON numbers from float64 to uint64 can silently lose precision once values exceed 2^53. WAL indexes can grow large enough for this to matter, producing incorrect output. Consider preserving these fields as strings/json.Number, or decoding the API response with UseNumber and converting safely.

Copilot uses AI. Check for mistakes.
Comment on lines +323 to +328
if lastTerm, ok := data["last_term"].(float64); ok {
status.LastTerm = uint64(lastTerm)
}
if lastIndex, ok := data["last_index"].(float64); ok {
status.LastIndex = uint64(lastIndex)
}

Copilot AI Mar 9, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These fields are parsed as float64 and then cast to uint64, which can lose precision for large Raft indexes/terms. Consider decoding using json.Number (or leaving them as strings) to avoid inaccurate health output on long-running clusters.

Copilot uses AI. Check for mistakes.
Comment on lines +473 to +478
// Capture any additional details
info.Details = make(map[string]interface{})
for key, value := range data {
if key != "type" && key != "healthy" {
info.Details[key] = value
}

Copilot AI Mar 9, 2026

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SealBackendStatusInfo.Details is tagged omitempty, but this code always initializes it to an empty map, so the JSON output will include "details": {} even when there are no extra fields. Consider only allocating/setting Details when at least one non-standard key is present (or leave it nil when empty).

Suggested change
// Capture any additional details
info.Details = make(map[string]interface{})
for key, value := range data {
if key != "type" && key != "healthy" {
info.Details[key] = value
}
// Capture any additional details, but only allocate the map if needed
for key, value := range data {
if key == "type" || key == "healthy" {
continue
}
if info.Details == nil {
info.Details = make(map[string]interface{})
}
info.Details[key] = value

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants