Feat/511 Implement Data Collection and Visualization for Web3.Storage Measurement Batch #560
Conversation
Hi @pyropy, can you review this draft and let me know if I'm heading in the right direction?
@pyropy, I am also facing this issue in parallel: the server runs fine but suddenly outputs an error stating that the InfluxDB write failed. I have attached a screenshot below. I created the InfluxDB keys from this platform: https://eu-central-1-1.aws.cloud2.influxdata.com/orgs/e0c31560a2dfff87/load-data/tokens. I tried deleting the keys and checking the RPC nodes, but the issue persists.
@Winter-Soren could you please link to the issue this addresses in both the PR title and description?
bajtos left a comment
I am confused.
What problems is this pull request trying to solve?
If the goal is to start collecting telemetry about batch sizes, I would expect the pull request to simply add more fields to the existing publish point.
```js
point.tag('cid', cid.toString())
point.tag('round_index', roundIndex.toString())
```
Storing CIDs and round indexes in tags will cause high cardinality that will degrade performance and/or increase our bill.
Quoting from https://docs.influxdata.com/influxdb/v2/write-data/best-practices/resolve-high-cardinality/
Tags containing highly variable information like unique IDs, hashes, and random strings lead to a large number of series, also known as high series cardinality. High series cardinality is a primary driver of high memory usage for many database workloads.
(...)
Review your tags to ensure each tag does not contain unique values for most entries.
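To make the cardinality concern concrete, here is a small self-contained sketch (plain JavaScript, no InfluxDB required) of why tagging each point with a unique CID explodes the series count: a series is keyed by the measurement name plus the unique tag set, so one unique tag value per point means one series per point. The data shapes here are illustrative, not the real telemetry code.

```javascript
// Simulate 1000 points tagged with a unique CID and round index each.
const points = []
for (let i = 0; i < 1000; i++) {
  points.push({
    measurement: 'publish',
    tags: { cid: `bafy-unique-${i}`, round_index: String(i) }, // unique per point
    fields: { measurement_count: 10 }
  })
}

// Distinct series = distinct (measurement, tag-set) combinations.
const series = new Set(
  points.map(p => `${p.measurement},cid=${p.tags.cid},round_index=${p.tags.round_index}`)
)
console.log(series.size) // one series per point: 1000

// Storing the same values as fields keeps everything in a single series.
const asFields = points.map(p => ({
  measurement: p.measurement,
  tags: {},
  fields: { ...p.fields, cid: p.tags.cid, round_index: Number(p.tags.round_index) }
}))
const fieldSeries = new Set(asFields.map(p => p.measurement))
console.log(fieldSeries.size) // 1
```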
Thanks for the suggestion, I'll rectify the commit!
@juliangruber, I have edited the name of the PR and added a link to the issue it resolves.
pyropy left a comment
Good job and thank you for your contribution @Winter-Soren! 🎉
I think the PR is going in the right direction, but we need to fix a few things before continuing. Namely, let's stick to the established pattern in the common/telemetry.js file (which I've noted in the comments). We also need to improve the naming a bit to make it more obvious what metrics we're storing.
Keep up the good work! 🚀
```js
logger.log(`Publishing ${measurements.length} measurements. Total unpublished: ${totalCount}. Batch size: ${maxMeasurements}.`)

// Calculate batch size in bytes
const batchSizeBytes = Buffer.byteLength(
```
Have you found any other way to calculate the batch size without serialising the objects to JSON? Depending on the batch size, that might consume a lot of memory.
+1
We have the following code a few lines below:

```js
const file = new File(
  [measurements.map(m => JSON.stringify(m)).join('\n')],
  'measurements.ndjson',
  { type: 'application/json' }
)
```

Please refactor it so that we create only one copy of `measurements.map(m => JSON.stringify(m)).join('\n')`.
```js
point.tag('cid', cid.toString())
point.tag('round_index', roundIndex.toString())
```
Thank you for adding a link to #511, @Winter-Soren. This pull request makes sense to me now! Thank you for the contribution ❤️ Before we move forward, I think it's best for me and @pyropy to agree on the higher-level design. @pyropy Why is it necessary to create a new bucket and a new telemetry writer? As I wrote before:
If we add more fields to the existing publish point.
Sorry, I had just quickly glanced over your comment. Looking at the code, I have to say you're right: we could simply extend the existing publish point.
@Winter-Soren I published a bad task description without looking deeper into the existing codebase. I have updated the task description according to the original comment published by @bajtos.
```js
  's' // precision
)

// Add new write client for batch metrics
```
Let's not add a new bucket and write client; rather, let's just extend the existing publish metric.
```js
setInterval(() => {
  publishWriteClient.flush().catch(console.error)
  networkInfoWriteClient.flush().catch(console.error)
  batchMetricsWriteClient.flush().catch(console.error)
```
Suggested change: remove the line `batchMetricsWriteClient.flush().catch(console.error)`. We won't need it if we extend the existing publish metric.
```js
recordNetworkInfoTelemetry,
batchMetricsWriteClient
```
Suggested change: export only `recordNetworkInfoTelemetry` and drop `batchMetricsWriteClient`. We won't need it if we extend the existing publish metric.
```js
// Enhanced telemetry recording with separate batch metrics
recordTelemetry('publish', point => {
  // Existing metrics
```
Let's extend this metric with a new point that collects the batch size.
Let me correct that: we should add new fields to the existing point.
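A hedged sketch of the design the reviewers converge on: attach the batch statistics as extra fields on the existing `publish` point instead of recording a separate `batch_metrics` point. Here `recordTelemetry` and `point` are minimal stand-ins for the real helpers in common/telemetry.js, and the sample measurements are made up; only the shape of the call is the point.

```javascript
// Minimal stand-in for the telemetry helper: builds a point, runs the
// callback against it, and returns it (the real helper would enqueue a
// write to InfluxDB instead).
const recordTelemetry = (name, fn) => {
  const point = {
    fields: {},
    intField (key, value) { this.fields[key] = Math.round(value) }
  }
  fn(point)
  return { name, ...point }
}

// Hypothetical batch of measurements.
const measurements = [{ value: 1 }, { value: 2 }, { value: 3 }]
const batchSizeBytes = Buffer.byteLength(
  measurements.map(m => JSON.stringify(m)).join('\n')
)

// New fields ride along on the existing 'publish' point; no new bucket,
// write client, or measurement name is needed.
const result = recordTelemetry('publish', point => {
  point.intField('measurement_count', measurements.length)
  point.intField('batch_size_bytes', batchSizeBytes)
  point.intField('avg_measurement_size_bytes', batchSizeBytes / measurements.length)
})

console.log(result.fields)
```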
```js
})

// Separate batch metrics recording for better organization
recordTelemetry('batch_metrics', point => {
```
Force-pushed from 9bc327d to 3a83c09.

This PR implements collection and visualization of Web3.Storage measurement batch metrics.
Closes #511
Changes

Code Changes
- `common/telemetry.js`: add dedicated batch metrics write client
- `publish/index.js`: separate batch metrics recording

Metrics Collected
- `batch_size_bytes`: Total size of the measurement batch
- `avg_measurement_size_bytes`: Average size per measurement
- `measurement_count`: Number of measurements in batch
- `cid`: Content ID of the batch
- `round_index`: Associated round number

Testing
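For reference, a rough sketch of how the numeric metrics listed above would render in InfluxDB line protocol once attached as integer fields on the existing `publish` measurement (per the review feedback). The field values are made-up examples, and this hand-rolls the line format only for illustration; the real code would go through the client library's Point API.

```javascript
// Hypothetical field values for one publish batch.
const fields = {
  batch_size_bytes: 35,
  avg_measurement_size_bytes: 12,
  measurement_count: 3
}

// Line protocol: <measurement> <field>=<int>i,... ('i' marks integers).
const line = 'publish ' +
  Object.entries(fields).map(([k, v]) => `${k}=${v}i`).join(',')
console.log(line)
// → publish batch_size_bytes=35i,avg_measurement_size_bytes=12i,measurement_count=3i
```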