feat: throttle polling for status #3221
Draft
Description
Introduces adaptive throttling in the anchor polling loop. With the current polling behavior, we observed upwards of 500 tps of getStatus requests to CAS for a load of only 6 tps of create requests on js-ceramic. This stresses CAS and makes it unavailable even at low creation rates.
Intuition:
js-ceramic should neither fall behind on anchor updates nor send more load to CAS than required while polling for them. An adaptive throttling approach addresses both concerns: the throttle rate is configured dynamically based on the rate of create-stream requests the node receives.
Design Goals:
Nodes need a mechanism that dynamically adjusts polling based on real-time analysis of local conditions (i.e. the create request rate).
The change primarily affects how often the node interacts with a shared resource (CAS). The aim is to match this interaction to the node's activity without adding unnecessary load to CAS.
Solution:
Each node adjusts its own polling frequency based on its local conditions (the rate of request generation).
Each node decides at what rate it should poll the Ceramic Anchor Service (CAS). The decision is based on the node's internal rate of request creation, so that the polling frequency is neither too high nor too low.
Implementation:
Throttling Mechanism: The main feature is a throttling mechanism that regulates how often the status of an anchor request is checked against CAS. The status check function is wrapped in the throttle function provided by lodash, which limits its execution to at most once per interval (initially set to approximately 200 ms, i.e. 5 times per second).
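To make the leading-edge behavior concrete, here is a minimal, dependency-free sketch. It is a stand-in written for illustration, not lodash's implementation and not the code in this PR. The names `throttleLeading`, `getStatus`, and `throttledGetStatus`, as well as the injected fake clock, are assumptions made so the example is deterministic.

```typescript
// Illustrative stand-in for a leading-edge throttle; NOT the lodash implementation.
type Clock = () => number

function throttleLeading<A extends unknown[], R>(
  fn: (...args: A) => R,
  waitMs: number,
  now: Clock = Date.now
): (...args: A) => R {
  let lastInvokedAt: number | undefined
  let lastResult: R | undefined

  return (...args: A): R => {
    const t = now()
    // Leading edge: invoke immediately if no call happened within the last waitMs.
    if (lastInvokedAt === undefined || t - lastInvokedAt >= waitMs) {
      lastInvokedAt = t
      lastResult = fn(...args)
    }
    // Suppressed calls get the most recent result instead of triggering a new call.
    return lastResult as R
  }
}

// Hypothetical stand-in for the CAS status call; counts real invocations.
let casCalls = 0
let fakeTime = 0
const getStatus = (streamId: string): string => {
  casCalls += 1
  return `status-of-${streamId}`
}

const throttledGetStatus = throttleLeading(getStatus, 200, () => fakeTime)

const first = throttledGetStatus('a') // invoked: leading edge
fakeTime = 100
const second = throttledGetStatus('b') // suppressed: only 100 ms elapsed
fakeTime = 200
const third = throttledGetStatus('c') // invoked: 200 ms elapsed
```

Note that a suppressed call returns the result of the previous invocation, which here belongs to a different argument. lodash's throttle documents the same behavior, so callers of a throttled status check need to tolerate receiving a result that was computed for an earlier call.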
AnchorProcessingLoop Modifications:
Throttled Function: A new throttled function, throttledGetStatusForRequest, has been introduced. This function wraps the CAS client’s method to get the status of an anchor request, ensuring that it does not execute more than once per defined interval. It operates on the leading edge of the interval: the first call executes immediately, and further calls are suppressed until the interval elapses.
Dynamic Polling Interval Adjustment: Added a method, adjustPollingInterval, that dynamically adjusts the polling interval based on the rate of create requests calculated by the CAS client. The adjustment uses a square root function to moderate changes and keeps the polling frequency between a defined minimum and maximum threshold.
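The description does not spell out the exact formula, so the following is only a sketch of one way a square-root adjustment with clamping could look. The function name `computePollingIntervalMs`, the three constants, and the choice to divide a base interval by the square root of the create rate are all assumptions for illustration, not values or logic taken from the PR.

```typescript
// All constants are illustrative assumptions, not values from the PR.
const MIN_INTERVAL_MS = 100 // hypothetical lower bound (fastest polling)
const MAX_INTERVAL_MS = 1000 // hypothetical upper bound (slowest polling)
const BASE_INTERVAL_MS = 200 // mirrors the ~200 ms initial interval

// Assumed relationship: the interval shrinks with the square root of the create
// rate, so a 4x increase in load only doubles the polling frequency.
function computePollingIntervalMs(createRatePerSecond: number): number {
  // No measurable load (or an invalid rate): poll as slowly as allowed.
  if (!(createRatePerSecond > 0)) return MAX_INTERVAL_MS
  const raw = BASE_INTERVAL_MS / Math.sqrt(createRatePerSecond)
  // Clamp so the interval always stays within the configured thresholds.
  return Math.min(MAX_INTERVAL_MS, Math.max(MIN_INTERVAL_MS, raw))
}

const atOnePerSecond = computePollingIntervalMs(1)
const atFourPerSecond = computePollingIntervalMs(4)
const atSixteenPerSecond = computePollingIntervalMs(16) // hits the lower bound
const atIdle = computePollingIntervalMs(0) // hits the upper bound
```

The square root dampens the response to load spikes, while the clamp is what actually enforces the minimum and maximum thresholds.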
CAS Client Enhancements:
Create Request Rate Calculation: The CAS client now includes a method to calculate the rate of anchor request creations. This rate is determined over a rolling window of the last 15 minutes, providing a metric that can be used to adjust the polling interval dynamically.
Request Timestamps: Tracking of timestamps for each create request has been introduced to support the rate calculation.
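The rolling-window rate calculation can be sketched roughly as follows. The class name `CreateRequestRateTracker`, its method names, and the explicit `nowMs` parameter (used here to keep the example deterministic) are hypothetical and may differ from how the CAS client in this PR tracks timestamps.

```typescript
const WINDOW_MS = 15 * 60 * 1000 // 15-minute rolling window, as described above

// Hypothetical helper; the actual CAS client may structure this differently.
class CreateRequestRateTracker {
  private timestamps: number[] = []
  private readonly windowMs: number

  constructor(windowMs: number = WINDOW_MS) {
    this.windowMs = windowMs
  }

  // Record the time of one create request.
  record(nowMs: number): void {
    this.timestamps.push(nowMs)
  }

  // Average create requests per second over the window ending at nowMs.
  ratePerSecond(nowMs: number): number {
    const cutoff = nowMs - this.windowMs
    // Timestamps are appended in order, so expired entries sit at the front.
    const firstLive = this.timestamps.findIndex((t) => t > cutoff)
    this.timestamps = firstLive === -1 ? [] : this.timestamps.slice(firstLive)
    return this.timestamps.length / (this.windowMs / 1000)
  }
}

// One create request per second for 15 minutes (t = 0 s .. 899 s).
const tracker = new CreateRequestRateTracker()
for (let i = 0; i < 900; i++) tracker.record(i * 1000)

const rateAtEnd = tracker.ratePerSecond(899_000) // every request is in the window
const rateHalfExpired = tracker.ratePerSecond(1_349_000) // half have aged out
const rateAllExpired = tracker.ratePerSecond(1_799_000) // nothing left
```

Pruning on read keeps memory bounded by the number of requests in one window, at the cost of an array copy per rate query.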
Lifecycle Management:
Interval Subscription for Adjusting Polling: An interval subscription adjusts the polling interval every 10 minutes, so that the throttling behavior regularly tracks the latest request creation rate.
Start and Stop Methods: The start and stop methods of the AnchorProcessingLoop now also start and stop the interval-based adjustments, alongside the regular polling process.
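The lifecycle described above can be approximated with a plain timer. This sketch uses `setInterval` as a stand-in for the interval subscription mentioned in the description; the class `PollingAdjuster` and its members are hypothetical names, not part of AnchorProcessingLoop.

```typescript
const ADJUST_EVERY_MS = 10 * 60 * 1000 // 10 minutes, as described above

// Hypothetical helper showing start/stop ownership of a periodic adjustment.
class PollingAdjuster {
  private timer: ReturnType<typeof setInterval> | undefined
  private readonly adjust: () => void
  private readonly periodMs: number

  constructor(adjust: () => void, periodMs: number = ADJUST_EVERY_MS) {
    this.adjust = adjust
    this.periodMs = periodMs
  }

  get isRunning(): boolean {
    return this.timer !== undefined
  }

  start(): void {
    // Idempotent: a second start() must not create a second timer.
    if (this.timer !== undefined) return
    this.timer = setInterval(this.adjust, this.periodMs)
  }

  stop(): void {
    if (this.timer === undefined) return
    clearInterval(this.timer)
    this.timer = undefined
  }
}

let adjustments = 0
const adjuster = new PollingAdjuster(() => {
  adjustments += 1 // in the real loop this would recompute the polling interval
})

adjuster.start()
adjuster.start() // no-op, already running
const runningAfterStart = adjuster.isRunning
adjuster.stop()
const runningAfterStop = adjuster.isRunning
```

Clearing the timer in stop() matters for shutdown: a timer left running would keep firing adjustments after the loop has stopped and can keep the process alive.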
How Has This Been Tested?
Describe the tests that you ran to verify your changes. Provide instructions for reproduction.
References:
https://lodash.com/