@apenzk
Contributor

@apenzk apenzk commented Dec 17, 2024

Summary

MD-74
MIP-74

@apenzk apenzk changed the title from "[draft] MIP-n: Rate-Limiter for the Lock/Mint-type Native Bridge" to "[draft] MIP-74: Rate-Limiter for the Lock/Mint-type Native Bridge" Dec 17, 2024
@0xmovses
Contributor

Is the idea to have this completed before https://github.com/movementlabsxyz/MIP/pull/73/files is ready for review? Unless I have misunderstood, it seems to me that we will also want rate limiting to be a manual operation. We could automate it and have it react to the Informer, and I don't see any risks in increasing the Rate Limit based on the Informer. Even if we were to get some mad monitoring event from the Informer, in the worst case the Rate Limit would only be increased. A cautionary, automated reaction is probably good.

@franck44 franck44 added Draft MD/MIP A new/draft MD/MIP bridge labels Dec 17, 2024
@apenzk
Contributor Author

apenzk commented Dec 18, 2024

@0xmovses

Is the idea to have this completed before https://github.com/movementlabsxyz/MIP/pull/73/files is ready for review?

That PR can be independent of this MIP. MIP-73 provides an entry point and shows how everything comes together, so I am not sure there is a clear order.

Unless I have misunderstood, it seems to me that we will also want rate limiting to be a manual operation. We could automate it and have it react to the Informer, and I don't see any risks in increasing the Rate Limit based on the Informer. Even if we were to get some mad monitoring event from the Informer, in the worst case the Rate Limit would only be increased. A cautionary, automated reaction is probably good.

We have two different trust assumptions

  1. the operator (e.g. human multisig)
  2. a trusted automated process

We have to treat these differently. The target chain rate limit also protects against compromise or errors of 2. It also protects against wrong information from the Informer. The maximum rate limit MUST be set by func(insurance fund, human reaction time) (e.g. absolute max_rate = insurance fund / human reaction time).

I think there could be several thresholds, some of which are dynamic.

  1. the maximum rate limit on target chain and source chain is set by governance. This requires trust level 1 (above)
  2. the informer MAY reduce the rate limit even further, based on whatever metric but it MUST NOT set it higher than 1.
  3. the relayer MAY reduce the rate limit on the source chain. This allows the rate limiter to catch up in case it lags behind with processing.
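
The layering of these thresholds can be sketched as follows. This is only an illustration under stated assumptions, not the MIP's implementation: the function names, the integer token units, and the per-second time base are all hypothetical.

```python
from typing import Optional


def max_rate_limit(insurance_fund: int, reaction_time_s: int) -> int:
    """Governance-level ceiling: insurance fund divided by human reaction time.

    Hypothetical units: tokens and seconds, giving tokens per second.
    """
    if reaction_time_s <= 0:
        raise ValueError("reaction time must be positive")
    return insurance_fund // reaction_time_s


def effective_rate_limit(governance_max: int,
                         informer_limit: Optional[int] = None) -> int:
    """Apply an optional Informer reduction.

    The Informer MAY lower the limit but MUST NOT raise it above the
    value set by governance, so its input is clamped to [0, governance_max].
    """
    if informer_limit is None:
        return governance_max
    return max(0, min(governance_max, informer_limit))
```

With an insurance fund of 8,640,000 units and a reaction time of 86,400 seconds, the ceiling is 100 units per second. An Informer request for 250 is clamped back to 100, while a request for 40 is accepted.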

@apenzk apenzk changed the title from "[draft] MIP-74: Rate-Limiter for the Lock/Mint-type Native Bridge" to "[review] MIP-74: Rate-Limiter for the Lock/Mint-type Native Bridge" Dec 18, 2024
1. **Rate Limiter**: The Rate Limiter is a set of contracts (one on the L1 and one on the L2) that is used to limit the volume of transferred value per time window.

![alt text](overview.png)
_Figure 1: Architecture of the Rate Limitation_
Contributor

@apenzk
In Figure 1, it is not clear to me what "sets parameters" means vs "sets limit".
I would simplify the figure as follows:

  • user "sends" (rather than "attempts")
  • Is the governance a contract on L2? If not, it may be located outside of the yellow box.
  • What are the parameters, and is limit a parameter? Are there any constraints between the different "limits"?
  • Why is the operator instructing the governance and not the opposite?

Contributor Author

  1. Replaced by "requests transfer".
  2. It's on L2.
  3. limit = f(insurance fund, reaction time, reduction factor). Governance can set the latter two. The constraints between the limits are described in the text of the document. Does it need improving?
  4. It's the governance operator. I will change operator -> governance operator.

Contributor

@apenzk The figure has a few glitches:

  1. L2 Bridge Contract used in the picture but L2 Native Bridge Contract in the text
  2. L2 bridge contract on L1, should be L1 Bridge contract.

I have added () around the Native attribute in the text, so that may fix 1., but I have not modified the figure.

Contributor Author

updated figure and text.


The following risks are associated with the Native Bridge:

1. The trusted relayer is compromised or faulty. We thus want to ensure that the relayer does not have unlimited power to release or mint assets. For this we MUST implement a rate limiter on the target chain.
Contributor

That seems to contradict that the relayer is trusted.
If it can be faulty, it is not trusted.

Contributor Author

@apenzk apenzk Dec 19, 2024

"partially trusted" ?
"trusted (to a large extent)" ?

1. In order to rate limit the bridge (e.g. stop the bridge transfers entirely) there should be a higher instance than the relayer in setting rate limits. Thus the rate limit on the target chain SHOULD be set by the Operator.
1. The Relayer may go down, while the number of transactions and the requested transfer value across the bridge still increase on the source chain. Due to the rate limit on the target chain the Relayer may struggle to process all initiated transfers. Thus the Relayer or the Operator MUST rate limit the source chain as well.

### Rate-Limiter
Contributor

@apenzk
Would it make sense to list the expected properties (as per above) and what they guarantee to the operator and the user?

Contributor Author

There is the "Risks and mitigation strategy" section. Do you think it is too convoluted?

- property 2
- property 3 ...

### Rate-Limiter
Contributor

@apenzk
Can you confirm this is correct?
And add anything you think is relevant to this section?

Contributor Author

i have added several points to the "Objectives" section


`max_rate_limit_target = insurance_fund_target / reaction_time`,

where the `reaction_time` is the time it takes for the Operator to react to a faulty or compromised component. The `reaction_time` is a parameter that is set by the Operator. The Operator MAY set the actual rate limit lower than the `max_rate_limit_target`. However the Rate Limiter MUST NOT set the rate limit higher than the `max_rate_limit_target`.
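
As a worked example of the formula above, with purely hypothetical numbers (neither value comes from the MIP), an insurance fund of 2,400,000 tokens and a reaction time of 24 hours give:

$$
\texttt{max\_rate\_limit\_target}
  = \frac{\texttt{insurance\_fund\_target}}{\texttt{reaction\_time}}
  = \frac{2\,400\,000\ \text{tokens}}{24\ \text{h}}
  = 100\,000\ \text{tokens/h}
$$

Under this bound, even if the Operator needs the full reaction time to halt the bridge, the value transferred in the meantime does not exceed the insurance fund.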
Contributor

@apenzk
I suggest breaking this into two independent parts:

  • properties of the rate limit, the max rate limit, etc.
  • permissions: who can set what.

At the moment I find this section confusing with the Rate Limiter, the Operator, etc., so it may need to be clarified.

Contributor Author

i tried to improve, wdyt?


#### Rate limit on the source chain

On the source chain the rate limit MAY be lowered by the Relayer. This is to ensure that the rate limit on the target chain is not exceeded. It also permits the Relayer to catch up in case the Relayer has been down for some time.
Contributor

@apenzk
We may need to justify why there is a need for a rate limit on the source chain.
Could we have an example of what can happen?

My understanding was that it is needed because of delays between transfers from L2 -> L1 (or vice versa) so we need to know what is requested to bridge and what is completed (the sum of the two should be lower than the insurance funds).

Contributor Author

@apenzk apenzk Dec 19, 2024

i have added a para

On the source chain the rate limit MUST be limited by the Governance Operator to match the rate limit on the target chain. If only the target chain were rate limited, users could continue to successfully request transfers on the source chain while the budget on the target chain is already consumed. Consequently the Relayer would not be able to complete the transfers.
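
The effect described in this paragraph can be illustrated with a small simulation. It is a sketch under simplifying assumptions (fixed epochs, a Relayer that completes as much as the target budget allows in each epoch); the function name and the numbers are hypothetical.

```python
from typing import Optional, Sequence


def backlog_after(requests: Sequence[int], target_limit: int,
                  source_limit: Optional[int] = None) -> int:
    """Return the value still awaiting completion after the last epoch.

    requests[i] is the total value users try to bridge in epoch i.
    source_limit=None models a source chain without a rate limit.
    """
    backlog = 0
    for requested in requests:
        # Value accepted (initiated) on the source chain in this epoch.
        accepted = requested if source_limit is None else min(requested, source_limit)
        backlog += accepted
        # Value the Relayer can complete on the target chain in this epoch.
        backlog -= min(backlog, target_limit)
    return backlog
```

With three epochs of 150 requested units each and a target budget of 100 per epoch, an unlimited source chain leaves 150 units pending, and the backlog keeps growing as long as demand exceeds the target budget. With a matching source limit of 100, nothing is left pending because the excess requests are rejected at initiation.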

Contributor Author

The rate limiter must react to chain-based data. It is not possible for the source chain contract to know about the completed transactions or the insurance fund on the target chain. Thus we assume that clocks are correct and requested transfers < completed transfers, AND trust that the operator updates the rate limit on the source chain.

These values are considered constant in the sequel. They may be updated if needed or if new funds are added to the pools.

The Insurance Fund rate-limits the outbound transfers, i.e. for a given transfer from source chain to target chain the Insurance Fund on the target chain is responsible for the rate limit, and thus we will refer to the `insurance_fund_target`. I.e. for a transfer from L1 to L2 the `insurance_fund_target = insurance_fund_L2` is responsible for the rate limit. While for a transfer from L2 to L1 the `insurance_fund_target = insurance_fund_L1` is responsible for the rate limit.

Contributor

@apenzk
The important part seems to be that the insurance fund (in total) limits the rate.
Is it necessary to consider two sides, the L1 fund and the L2 fund, or could we collapse them into one (at the design level)?
I understand that if we need to compensate users we may need two pools, one on L1 and one on L2, but this may be separate from the actual logic.

Contributor Author

@apenzk apenzk Dec 19, 2024

The reason there are two pools is that each direction needs to be covered by its own value. The total rate in both directions is limited by the total amount in the insurance fund. We cannot use the insurance fund to secure twice, and thus if we want to secure X in each direction the insurance fund must be 2X (if it were on one level only).

We initially had it on one side, and I think this was advocated for by @Primata and @l-monninger.

The governance operator decides the amount in the insurance fund as well as the reaction time. Thus in principle we could simply trust it and allow it to set the value directly. However, that adds the assumption that the governance operator does not make errors.

Contributor

@apenzk So in principle, the values for pools on L1 and L2 should be the same? We split the total insurance fund into the two pools?

Contributor Author

@apenzk apenzk Jan 6, 2025

If we wanted to have the same max rate in either direction, yes. But I don't think this is a requirement.

`rate_limit_target = rate_reduction_target * max_rate_limit_target`,

where `rate_reduction_target \in {0,1}` is a parameter that is set by the Operator. Note the `rate_limit_target` MUST not be larger than `max_rate_limit_target`.

Contributor

Is it the interval [0,1], or 0 or 1?
In smart contracts it is usually not easy to use real numbers (Solidity/EVM does not support them). MoveVM does, but computation is a bit more expensive than with integers, if I am correct.

Contributor Author

  1. You are correct. It is [0,1]. I will change it.
  2. I am not sure what would numerically be the most efficient way. There is a maximum rate limit and the Operator could choose a smaller part of that (less than 100%).
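
One common way to express a factor in [0,1] with integers only is basis points. The sketch below assumes this representation; neither the MIP nor this thread prescribes it, and the names `BPS_DENOMINATOR` and `rate_reduction_bps` are hypothetical.

```python
BPS_DENOMINATOR = 10_000  # 10_000 basis points correspond to a factor of 1.0


def rate_limit_target(max_rate_limit_target: int, rate_reduction_bps: int) -> int:
    """Integer version of rate_limit_target = rate_reduction_target * max_rate_limit_target.

    rate_reduction_bps encodes the factor in [0, 1] as an integer in [0, 10_000].
    """
    if not 0 <= rate_reduction_bps <= BPS_DENOMINATOR:
        raise ValueError("rate reduction must be between 0 and 10_000 basis points")
    # Multiply before dividing so that no precision is lost to truncation.
    return max_rate_limit_target * rate_reduction_bps // BPS_DENOMINATOR
```

For example, 2,500 basis points applied to a maximum of 100,000 gives 25,000. Because the factor cannot exceed 10,000 basis points, the result can never be larger than `max_rate_limit_target`. In a VM with fixed-width integers the multiplication would additionally need an overflow check, which Python's unbounded integers hide.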


> [!NOTE]
> I can convert the following into pseudo code, after we have discussed the algorithm and it makes sense.
Contributor

What is not clear to me is how the time window is updated. And is it a sliding window? E.g. the last 40 txs? Or the last 10 minutes? Or the last ten blocks?
I would think that a per-Tx window makes sense, does it? E.g., every 1000 txs we go to a new time window/interval.

Contributor Author

@apenzk apenzk Dec 19, 2024

Let's assume the simplest non-optimized case first: fixed time windows, let's call them epochs. Then (if not too expensive) every initiate transaction checks and resets the budget IF it is in a new epoch. Would this work?

A sliding window seems more difficult. The easiest and most gas-efficient approach may be to approximate a sliding window, e.g. split the epoch into smaller parts, but that is maybe not the most elegant approach. We are restricted by the user having to update the bridge contract state.

Contributor

The easiest and cheapest implementation is to take the block timestamp and divide it by 24hrs. The resulting number identifies your time allocation to be checked against. IF we change the fixed 24hrs value, it leads to an issue with time allocation. That's why I'd advise against it: to fix it we would need some extra operations and logic that might lead to unexpected behaviors.
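
The fixed-window approach discussed here can be sketched as follows. It is a minimal model, not the bridge contract: the class and method names are hypothetical, and the block timestamp is assumed to be in seconds.

```python
EPOCH_SECONDS = 24 * 60 * 60  # fixed 24h window, as suggested in this thread


class EpochRateLimiter:
    """Fixed-window budget: the epoch index is block_timestamp // 24h."""

    def __init__(self, budget_per_epoch: int) -> None:
        self.budget_per_epoch = budget_per_epoch
        self.current_epoch = -1  # no epoch seen yet
        self.used = 0            # value consumed in the current epoch

    def try_transfer(self, amount: int, block_timestamp: int) -> bool:
        """Accept the transfer if it fits into the budget of its epoch."""
        epoch = block_timestamp // EPOCH_SECONDS
        if epoch != self.current_epoch:
            # The first transfer of a new epoch resets the budget.
            self.current_epoch = epoch
            self.used = 0
        if self.used + amount > self.budget_per_epoch:
            return False
        self.used += amount
        return True
```

No separate reset transaction is needed, because the epoch index is derived from the timestamp of whichever transfer arrives first. A known property of fixed windows is that the full budget can be spent at the end of one epoch and again at the start of the next, so up to twice the budget can pass within a short interval.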

Contributor Author

i agree with @Primata and suggest we leave sliding windows and such for future MIPs

The following are possible ways to adjust the rate limit:

1. The Governance Operator can adjust the rate limit by adding or removing funds from the Insurance Fund.
1. The Governance Operator may adjust the rate limit by changing the `reaction_time`.
Contributor

Current implementation is 24hrs and I think that a fixed, 24hrs reaction_time is fine. There are gas cost and logic issues with allowing the change of reaction_time that we should consider so that we don't bloat the contract too much.

Contributor Author

I have added this default parameter setting

Implementation recommendation #1: The default value is 24h. In the initial implementation this value is fixed to avoid complications in gas.

> [!NOTE]
> I can convert the following into pseudo code, after we have discussed the algorithm and it makes sense.
**Algorithm for the Native Bridge contract on the source chain**
Contributor

This assumes to use MIP-58's design C, correct?

Contributor Author

@apenzk apenzk Jan 7, 2025

Yes, I have moved the alternative designs described in MIP-58 to the appendix here and consequently added you as an author.


@franck44 franck44 added Ready to Review Needs reviewing and removed Draft MD/MIP A new/draft MD/MIP labels Jan 5, 2025
The following algorithm is a recommendation for the operation of the Relayer:

**(Optional) Algorithm for the Relayer**

Contributor

@apenzk I am not sure we need to do anything with the Relayer. If a transaction is not processed within a given amount of time (for any reason, network down etc), the Relayer should re-send it to ensure it eventually succeeds.

Contributor Author

This was to avoid excessive costs to the relayer in case the budget is consumed on the target chain but is not consumed on the source chain. I have removed it as it is an optimisation that we can handle once we discover it actually is an issue.

`rate_limit_source = min{rate_reduction_source * rate_limit_target, rate_limit_operator_source}`,

where `rate_reduction_source` $\in$ `[0,1]` is a parameter that is set by the Relayer. `rate_limit_operator_source` is a parameter that is set by the Governance Operator. Note the `rate_limit_source` SHOULD not be larger than `rate_limit_operator_source`.
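
The formula for `rate_limit_source` can be written with integer arithmetic as in the sketch below. Expressing `rate_reduction_source` in basis points (an integer in [0, 10_000] standing for a factor in [0,1]) is an assumption made for this example, not something the MIP specifies.

```python
BPS_DENOMINATOR = 10_000  # 10_000 basis points correspond to a factor of 1.0


def rate_limit_source(rate_limit_target: int,
                      rate_reduction_source_bps: int,
                      rate_limit_operator_source: int) -> int:
    """min{rate_reduction_source * rate_limit_target, rate_limit_operator_source}."""
    if not 0 <= rate_reduction_source_bps <= BPS_DENOMINATOR:
        raise ValueError("rate reduction must be between 0 and 10_000 basis points")
    # Limit requested by the Relayer, derived from the target chain limit.
    relayer_limit = rate_limit_target * rate_reduction_source_bps // BPS_DENOMINATOR
    # The value set by the Governance Operator acts as a ceiling.
    return min(relayer_limit, rate_limit_operator_source)
```

With a target limit of 100,000 and an operator ceiling of 80,000, a Relayer factor of 50% yields 50,000, while a factor of 100% is capped at 80,000. Taking the minimum is what keeps the result at or below `rate_limit_operator_source` regardless of the Relayer's input.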

Contributor

@franck44 franck44 Jan 5, 2025

@apenzk So it looks like there is a need for synchronising the rate limits on both the source and target, and the relayer can do that. We could use this sync mechanism to simplify the rate limiter, I think.

I would assume that as long as less than insurance_fund1 is currently "in-flight" we are OK, so we may try to measure that.
Assume we are bridging from L2 to L1.
If we know that an amount $S$ is currently bridged over (pending to be completed) to L1 and $S \leq$ insurance_fund1, then we can cover the loss (there is no need for a time window).
To know that this is the case, we need to keep track of the completed Txs, and another component (or the Relayer) can do that.
When a bridge Tx from L2 to L1 with amount $k$ is successfully completed, we can decrease $S$ and update it to $S := S - k$.

On another note, I am a bit confused about the wording/naming here. Why do we need a rate?
If we have a unique Tx with the maximum amount insurance_fund1 we should first clear this Tx before we continue to process new ones.

Would it be the role of the Informer to do this kind of thing (syncing L2 <-> L1)?
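
The in-flight accounting proposed in this comment can be sketched as follows. It illustrates the reviewer's suggestion only and is not the design adopted in the MIP; the class name and the way completions are reported are hypothetical, and some off-chain component would have to relay completions, since the source chain contract cannot observe them itself.

```python
class InFlightTracker:
    """Track the value S that is initiated but not yet completed."""

    def __init__(self, insurance_fund: int) -> None:
        self.insurance_fund = insurance_fund
        self.in_flight = 0  # S in the comment above

    def initiate(self, amount: int) -> bool:
        """Accept a transfer only if S stays within the insurance fund."""
        if amount <= 0:
            raise ValueError("amount must be positive")
        if self.in_flight + amount > self.insurance_fund:
            return False
        self.in_flight += amount
        return True

    def complete(self, amount: int) -> None:
        """Record a completed transfer of amount k: S := S - k."""
        if amount <= 0 or amount > self.in_flight:
            raise ValueError("completed amount must be in (0, in_flight]")
        self.in_flight -= amount
```

With an insurance fund of 100, a transfer of 70 is accepted and a following transfer of 40 is rejected until the first one is reported as completed. No time window is involved: capacity is released by completions rather than by the passing of an epoch.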

Contributor Author

@apenzk apenzk Jan 7, 2025

I would assume that as long as less than insurance_fund1 is currently "in-flight" we are OK, so we may try to measure that.
...
To know that this is the case, we need to keep track of the completed Txs, and another component (or the Relayer) can do that.

As discussed above with @Primata, sliding windows or resetting the budget based on completed transactions seem out of scope for this version. I suggest we move more complicated versions over to a new MIP.

Contributor Author

worth noting:

There is a valid case where the relayer should set the rate limit on the source chain: if the operator fails to adjust the rate limit on the source chain when adjusting the rate limit on the target chain (by changing the insurance fund or reaction time).

Contributor

I think the main thing for the relayer to implement is holding a transaction until it is able to complete on the target chain.

  1. Worst case scenario users need to wait for 24hrs.
  2. This would allow us to also inform the user on the source chain that there is a difference between source chain and target chain rate limit budget. That would be the in-flight value.

Contributor Author

If the rate limit on the source chain is much larger than on the target chain, it could be much more than 24h.

But yes, the relayer would need to be able to hold. Or, in a simple implementation, it would just retry continuously, which is handled by MIP-61.
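
To make the waiting time concrete, the sketch below estimates how many 24h epochs the Relayer needs to work off a backlog. It assumes fixed epochs, a constant target budget, and no new requests arriving in the meantime; the function name and numbers are hypothetical.

```python
def epochs_to_drain(backlog: int, target_budget_per_epoch: int) -> int:
    """Number of target-chain epochs needed to complete a pending backlog."""
    if target_budget_per_epoch <= 0:
        raise ValueError("target budget must be positive")
    if backlog < 0:
        raise ValueError("backlog must not be negative")
    # Ceiling division using integers only.
    return -(-backlog // target_budget_per_epoch)
```

If the source chain accepted 250 units while the target chain only allows 100 per epoch, the last transfer completes in the third epoch, so the wait can span several days rather than a single 24h window.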

@apenzk apenzk requested review from Primata and franck44 January 7, 2025 12:06
@apenzk apenzk force-pushed the mip/rate-limiter-lock-mint-bridge branch from cee8351 to 2beeae8 Compare March 26, 2025 17:14
@apenzk apenzk requested a review from franck44 March 26, 2025 17:14
@apenzk apenzk dismissed Primata’s stale review March 26, 2025 17:15

merge will go ahead, as the MIP is stagnant.

@movementlabsxyz movementlabsxyz deleted a comment from Primata Mar 26, 2025