feat(metrics): Add validator metrics for cpu model and system memory (#4666) #5574

Open: wants to merge 48 commits into base develop

Conversation

Contributor

@nmrshll nmrshll commented Feb 21, 2025

Description of change

Extends the validator metrics with hardware specs: CPU model and core count, system memory, and disk space. This data is sent to iota-proxy so that we can monitor the state of the network through telemetry.
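As an illustration of what gathering the system memory spec can look like on Linux, here is a minimal std-only sketch that parses the `MemTotal` field from `/proc/meminfo`-style text. It is not the code in this PR; the function name `parse_mem_total_bytes` is hypothetical, and the sketch assumes the usual `MemTotal: <n> kB` layout.

```rust
/// Hypothetical helper (not from this PR): extract total system memory in
/// bytes from the contents of /proc/meminfo. Returns `None` when the
/// `MemTotal` line is missing or does not have the expected `<n> kB` shape.
pub fn parse_mem_total_bytes(meminfo: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        let rest = line.strip_prefix("MemTotal:")?;
        let mut parts = rest.split_whitespace();
        let value: u64 = parts.next()?.parse().ok()?;
        // The kernel reports this field in "kB", which means 1024 bytes here.
        match parts.next() {
            Some("kB") => value.checked_mul(1024),
            _ => None,
        }
    })
}
```

On a Linux host the input would typically come from `std::fs::read_to_string("/proc/meminfo")`. Keeping the parser separate from the file read makes it testable on platforms that have no `/proc`.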

Also fixes the private network setup scripts and configs, which did not work with the exact commands given in the various READMEs. I ran into the same problems described in #5204 and had to fix them to test this PR with Docker.

Added example Grafana dashboard items:

[image: example Grafana dashboard items]

Links to any relevant issues

Fixes #4666.

Platform support:

  • Linux / macOS, arm64 / x86
  • ⚠️ With Docker on macOS (and likewise OrbStack and podman-machine), the gathered stats are unreliable, at least without complicating this PR and the Docker setup: the CPU model is usually missing, and disk space reports whatever value the Docker virtual machine provides.
  • With Docker on Linux (which is what most people running our default node setup will likely use), the stats are gathered correctly.
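The missing CPU model mentioned above suggests that the collector should degrade gracefully rather than fail. The following is a minimal std-only sketch of that idea, not the code in this PR: it reads the `model name` field from `/proc/cpuinfo`-style text and falls back to a placeholder. The names `parse_cpu_model` and `UNKNOWN_CPU_MODEL` are hypothetical.

```rust
/// Placeholder reported when no CPU model can be determined (assumed value).
pub const UNKNOWN_CPU_MODEL: &str = "unknown";

/// Hypothetical helper (not from this PR): return the first non-empty
/// `model name` value found in /proc/cpuinfo-style text, or the placeholder
/// when the field is absent, as can happen inside a virtual machine.
pub fn parse_cpu_model(cpuinfo: &str) -> String {
    cpuinfo
        .lines()
        // Each line looks like "key<tabs>: value"; skip lines without ':'.
        .filter_map(|line| line.split_once(':'))
        .find(|(key, _)| key.trim() == "model name")
        .map(|(_, value)| value.trim())
        .filter(|value| !value.is_empty())
        .unwrap_or(UNKNOWN_CPU_MODEL)
        .to_string()
}
```

Reporting an explicit placeholder keeps the metric present on every platform, so a dashboard can tell "no data" apart from "model not detectable".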

Type of change

  • Enhancement (a non-breaking change which adds functionality)

How the change has been tested

The changes have been tested in a local environment to ensure that the new metrics are being collected correctly and that the private network setup works seamlessly with Grafana. Further testing will be conducted based on feedback received.

Change checklist

  • I have followed the contribution guidelines for this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked that new and existing unit tests pass locally with my changes

@nmrshll nmrshll linked an issue Feb 21, 2025 that may be closed by this pull request

vercel bot commented Feb 21, 2025

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
apps-backend ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 18, 2025 7:08pm
apps-ui-kit ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 18, 2025 7:08pm
rebased-explorer ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 18, 2025 7:08pm
wallet-dashboard ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 18, 2025 7:08pm

@iota-ci iota-ci added core-protocol node Issues related to the Core Node team labels Feb 21, 2025
@nmrshll nmrshll force-pushed the 4666-add-validator-metrics-for-cpu-model-and-system-memory branch from ac86a7e to 2b64553 Compare February 26, 2025 18:14
@nmrshll nmrshll force-pushed the 4666-add-validator-metrics-for-cpu-model-and-system-memory branch from 28aa366 to 98362c0 Compare March 6, 2025 10:04
@alexsporn alexsporn added this to the v0.11.x milestone Mar 6, 2025
@muXxer
Contributor

muXxer commented Mar 18, 2025

I pushed some fixes. The metrics now also work when run via iota-swarm.

Please also implement the following changes:

  • every disk other than "db_disk" should also get a disk spec that includes its total_bytes in the description.
  • the value of each "disk spec" can be set to 0; same for hw_memory_specs
  • maybe rename db_disk_name and db_disk_total_bytes by removing the db_disk prefix (this way you can reuse the same schema for all disk specs)
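One possible reading of the requested changes, sketched in std-only Rust: a single `DiskSpec` schema with unprefixed `name` and `total_bytes` fields, one sample per disk, and a constant sample value of 0 because the information lives in the labels. This is an assumption-laden illustration, not the PR's implementation: the metric name `hw_disk_specs`, the type and function names, and the interpretation of "description" as Prometheus labels are all hypothetical.

```rust
/// Hypothetical unified schema: the same fields for the DB disk and all others.
#[derive(Debug, Clone, PartialEq)]
pub struct DiskSpec {
    pub name: String,
    pub total_bytes: u64,
}

/// Escape a label value for the Prometheus text exposition format
/// (backslash, double quote, and newline must be escaped).
fn escape_label(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for c in value.chars() {
        match c {
            '\\' => out.push_str("\\\\"),
            '"' => out.push_str("\\\""),
            '\n' => out.push_str("\\n"),
            other => out.push(other),
        }
    }
    out
}

/// Render one sample line per disk. The sample value is always 0; the specs
/// are carried in the labels. `hw_disk_specs` is an assumed metric name.
pub fn render_disk_specs(disks: &[DiskSpec]) -> String {
    let mut out = String::new();
    for disk in disks {
        out.push_str(&format!(
            "hw_disk_specs{{name=\"{}\",total_bytes=\"{}\"}} 0\n",
            escape_label(&disk.name),
            disk.total_bytes
        ));
    }
    out
}
```

With a shared schema, the DB disk is just another entry in the slice, and a dashboard query can filter on the `name` label instead of relying on a dedicated `db_disk_*` field.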

Development

Successfully merging this pull request may close these issues.

Add validator metrics for CPU model and system memory
7 participants