Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Prometheus metrics for AltDA failover in Batcher #361

Open
wants to merge 5 commits into
base: celo-rebase-12
Choose a base branch
from

Conversation

Kourin1996
Copy link

Closes #321

Add the following types of metrics for AltDA failover feature

  • Total count of failover occurrences
  • Total number of batches posted for each DA type

@Kourin1996 Kourin1996 requested a review from gastonponti March 14, 2025 06:36
@Kourin1996 Kourin1996 self-assigned this Mar 14, 2025
@Kourin1996 Kourin1996 marked this pull request as ready for review March 14, 2025 06:41
@palango palango requested a review from ezdac March 17, 2025 13:59
@palango
Copy link
Collaborator

palango commented Mar 18, 2025

@jcortejoso @alvarof2 @gastonponti Any metrics besides those implemented that would be helpful for you?

Copy link

@ezdac ezdac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!
I would record the batch-send success at a different location (see comment).

In terms of additional metrics,
maybe the actual number of bytes sent to the DA-proxy would be interesting,
so the cumulative sum of the http-body size in the setInput and setInputWithCommit, but only if the proxy returns a 200.

@@ -863,6 +863,7 @@ func (l *BatchSubmitter) sendTransaction(txdata txData, queue *txmgr.Queue[txRef
}
// if Alt DA is enabled we post the txdata to the DA Provider and replace it with the commitment.
l.publishToAltDAAndL1(txdata, queue, receiptsCh, daGroup)
l.Metr.RecordBatchDaType(txdata.daType.Name())
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the publishToAltDAAndL1() spawns a goroutine that in turn calls the sendTx, but only if the AltDA input can be set correctly.
If you call this here, you get a "confirmation" for sending the tx, even though you could have aborted before the send and recorded a "failedDARequest".

I think I would only add one Metr.RecordBatchDaType call in general and rather do that at the end of the sendTx(), but only if the passed in isCancel==false

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezdac You're right. For Alt DA, it recorded regardless of success or failure. Then I updated codes and now it records when handleReceipt is called.

@Kourin1996 Kourin1996 requested a review from ezdac March 20, 2025 07:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

AltDA fallback: add metrics
3 participants