@toger5 toger5 commented May 7, 2024

toger5 added 2 commits May 7, 2024 18:52
@toger5 toger5 force-pushed the toger5/expiring-events-keep-alive branch from 2bc07c4 to 0eb1abc Compare May 7, 2024 17:03
@toger5 toger5 force-pushed the toger5/expiring-events-keep-alive branch from 0eb1abc to 8bf6db7 Compare May 8, 2024 15:49
Signed-off-by: Timo K <[email protected]>
@turt2live turt2live changed the title Draft for expiring event PR MSC4140: Expiring events with keep alive endpoint May 9, 2024
@turt2live turt2live added proposal A matrix spec change proposal client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. labels May 9, 2024
@toger5 toger5 force-pushed the toger5/expiring-events-keep-alive branch from 3e54c2a to c82adf7 Compare May 10, 2024 17:54
@toger5 toger5 force-pushed the toger5/expiring-events-keep-alive branch from c82adf7 to 54fff99 Compare May 10, 2024 18:08
toger5 added 3 commits May 13, 2024 16:56
…is used to trigger one of the actions

Signed-off-by: Timo K <[email protected]>
Add event type to the body
Add event id template variable
Comment on lines 208 to 209
New authenticated client-server API endpoints `GET /_matrix/client/v1/delayed_events?status=scheduled` and
`GET /_matrix/client/v1/delayed_events?status=finalised` allow clients to get a list of
Member

This wording suggests that the status query-parameter is mandatory. Could you clarify whether it is or is not?

(I don't see any reason to make it mandatory, but YMMV)

Member

Good point. The default can be to return both scheduled & finalised events, and setting status can filter on one or the other. This is also a good spot to filter on the delay_id to query for just a single event.

Member

> The default can be to return both scheduled & finalised events

On second thought, this complicates pagination, as it makes the response have to paginate two different item categories.

A possible solution is to take advantage of the fact that all scheduled delayed events have send times in the future, while finalised ones have send times in the past. The first "page" of the response would include both scheduled & finalised delayed events whose send times are within an interval around when the lookup request was made, and would also return both a next_batch and prev_batch token for retrieving further scheduled or finalised delayed events, respectively. Any "next" page would include only scheduled delayed events (since they are "next" in time) while any "previous" page would include only finalised ones (since they are "previous" in time).
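
A minimal sketch of that time-window pagination, assuming each delayed event carries a single send timestamp. The `DelayedEvent` type, the `before_<ts>`/`after_<ts>` token format, and both function names are hypothetical and not part of the proposal; events sharing a timestamp at a page boundary are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayedEvent:
    delay_id: str
    send_ts: int  # scheduled (future) or actual (past) send time, in ms


def first_page(events, now_ts, window_ms):
    """Return events whose send time lies within window_ms of now_ts."""
    lo, hi = now_ts - window_ms, now_ts + window_ms
    chunk = sorted(
        (e for e in events if lo <= e.send_ts <= hi), key=lambda e: e.send_ts
    )
    # Hypothetical token format: "<direction>_<timestamp>".
    return {"chunk": chunk, "prev_batch": f"before_{lo}", "next_batch": f"after_{hi}"}


def further_page(events, token, limit):
    """Follow a token: "after" yields only later events, "before" only earlier ones."""
    direction, raw_ts = token.split("_")
    ts = int(raw_ts)
    if direction == "after":
        matches = sorted((e for e in events if e.send_ts > ts), key=lambda e: e.send_ts)
    else:
        matches = sorted(
            (e for e in events if e.send_ts < ts), key=lambda e: e.send_ts, reverse=True
        )
    chunk = matches[:limit]
    more = len(matches) > limit
    return {"chunk": chunk, "token": f"{direction}_{chunk[-1].send_ts}" if more else None}
```

Because every "after" page only contains send times later than the original lookup, it can only hold scheduled delayed events, and every "before" page can only hold finalised ones.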

If that is too much complexity, then there could simply be dedicated endpoints for looking up either scheduled delayed events or finalised ones. This has the added benefit of each endpoint being able to support unique query parameters that are relevant only for their respective delayed event status (like filtering finalised delayed events by the value of their outcome property).

> [The combined endpoint] is also a good spot to filter on the delay_id to query for just a single event.

I'll walk back on this too, and will opt for having the delay_id as a path parameter as was originally proposed instead of a query parameter, because:

  1. There is precedent in the CS-API for using a path parameter for the ID of a singular resource to look up.
  2. Having a delay_id query parameter would suggest being able to specify it multiple times to look up multiple items, but this should have safeguards on how many items to allow looking up, to avoid absurdly long URLs or the need to paginate. Getting that right is not really worth the effort, especially since a viable alternative to a multi-item lookup is to just do multiple single-item lookups.
  3. Giving the delay_id-based lookup its own endpoint leaves room for it to be unauthenticated, just like the management endpoints now are, for the purpose of delegating control over specific delayed events to an external service. If it shared the same endpoint as the status-based lookups, this would not be possible.

The only caveat is that if the status-based lookups were to be split into one endpoint per status, we'd be back in the situation of needing to avoid a namespace clash with a delay_id-based lookup endpoint at GET /delayed_events/{delayId}. So, their paths could be something like GET /delayed_events/by-status/[scheduled|finalised] to satisfy that.
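
To illustrate why the `by-status` prefix avoids the clash, here is a rough routing sketch. The `route` function and its handler names are hypothetical; only the two path shapes come from the comment above.

```python
BASE = "/_matrix/client/v1/delayed_events"
STATUSES = {"scheduled", "finalised"}


def route(path):
    """Map a GET path to (handler_name, argument)."""
    if not path.startswith(BASE + "/"):
        return ("not_found", None)
    parts = path[len(BASE) + 1:].split("/")
    # Two segments starting with "by-status": a status-based listing.
    if len(parts) == 2 and parts[0] == "by-status" and parts[1] in STATUSES:
        return ("list_by_status", parts[1])
    # Exactly one non-empty segment: a lookup of a single delayed event.
    if len(parts) == 1 and parts[0]:
        return ("get_by_delay_id", parts[0])
    return ("not_found", None)
```

With this layout, even a `delay_id` that happens to be the string `scheduled` is routed to the single-item lookup, since status listings always have two path segments after the base.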

itsoyou pushed a commit to famedly/synapse that referenced this pull request Oct 13, 2025
… v11 using the /send endpoint (#18898)

Implement
[MSC4169](matrix-org/matrix-spec-proposals#4169)

While there is a dedicated API endpoint for redactions, being able to
send redactions using the normal send endpoint is useful when using
[MSC4140](matrix-org/matrix-spec-proposals#4140)
for sending delayed redactions to replicate expiring messages. Currently
this would only work on rooms >= v11 but fail with an internal server
error on older room versions when setting the `redacts` field in the
content, since older rooms would require that field to be outside of
`content`. We can address this by copying it over if necessary.

Relevant spec at
https://spec.matrix.org/v1.8/rooms/v11/#moving-the-redacts-property-of-mroomredaction-events-to-a-content-property
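
The copy-over described above could look roughly like the following. This is an illustrative sketch, not the Synapse code: `place_redacts` is a hypothetical helper, and it assumes that any room version outside the numeric range 1 to 10 follows the v11 layout.

```python
# Room versions 1-10 predate the v11 change that moved `redacts` into `content`.
LEGACY_ROOM_VERSIONS = {str(v) for v in range(1, 11)}


def place_redacts(event, room_version):
    """Return a copy of a redaction event with `redacts` where the room expects it."""
    if event.get("type") != "m.room.redaction":
        return event
    fixed = dict(event)
    content = dict(event.get("content", {}))
    fixed["content"] = content
    if room_version in LEGACY_ROOM_VERSIONS:
        # Older rooms read the top-level field; copy it up if only `content` has it.
        if "redacts" not in fixed and "redacts" in content:
            fixed["redacts"] = content["redacts"]
    elif "redacts" not in content and "redacts" in fixed:
        # v11+ reads `content.redacts`; copy it down if only the top level has it.
        content["redacts"] = fixed["redacts"]
    return fixed
```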

---------

Co-authored-by: Tulir Asokan <[email protected]>
data. Since the additional capability to use a template `event_id` parameter is also needed,
this probably is not a good fit.

### Not reusing the `send`/`state` endpoint
Contributor

I would highly prefer that.

  1. An endpoint should not return two completely different response types depending on the query parameter.
  2. Sending a delayed event is a different action than sending an event and should be more explicit.

Author

It's about the send body being equivalent in both requests.
Sticky events also take the exact same approach.

I think the opinion on this very much depends on your mental model of delayed events:

  • They are normal event-sending actions but with a configurable increase in latency (it will take the homeserver a little while to send the event anyway; you can just delay this further manually).
  • This endpoint schedules something that is not a Matrix event yet. It's a new entity that eventually becomes a Matrix event.

I like the first view since it makes it easier to justify how this is compatible with Matrix and why it probably won't change anything fundamental (Matrix already needs to be capable of dealing with different network delays).

Contributor

As an SDK developer, I prefer a type-safe API. It would be helpful to have this natively in Matrix instead of making a workaround at the SDK level to get type safety. In the end, I will provide a type-safe API to the user anyway.

Member

Another point against reusing the /send & /state endpoints is that having the delay as a query parameter conflicts with the convention of query parameters being filters, not action options.

It also interferes with the transaction identifier, for which the spec says:

> The homeserver should identify a request as a retransmission if the transaction ID is the same as a previous request, and the path of the HTTP request is the same.

If I'm reading that rule correctly, it means that a request with the same transaction ID as an earlier request, but with a different query string, is still treated as a retransmission. That would make a /send or /state request with a ?delay= potentially be treated as a retransmission of an earlier such request without a ?delay=.
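
A small sketch of that reading of the rule, assuming the server keys its retransmission cache on the access token plus the request path only. `TxnDeduplicator` is a hypothetical class used purely for illustration; real homeservers may scope and expire these entries differently.

```python
from urllib.parse import urlsplit


class TxnDeduplicator:
    """Replays the stored response when the same path is seen again."""

    def __init__(self):
        self._responses = {}

    def handle(self, access_token, url, process):
        # The query string is deliberately ignored: only the path (which
        # contains the transaction ID) identifies a retransmission.
        key = (access_token, urlsplit(url).path)
        if key not in self._responses:
            self._responses[key] = process()
        return self._responses[key]
```

Under this model, a second request that reuses the transaction ID but adds `?delay=5000` never reaches `process()`; it receives the response of the earlier, undelayed send.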

Using a dedicated endpoint could avoid this by specifying the delay as a top-level field of the request body, with the content of the delayed event moved down one level into an object:

PUT /_matrix/client/v1/rooms/{roomId}/send_delayed_event/{eventType}/{txnId}
{
  "delay": <delay_ms>,
  "content": {
    "body": "hello",
    "msgtype": "m.text"
  }
}
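
For illustration, a client-side helper that assembles such a request might look like this. `build_delayed_send` is a hypothetical name, and the `send_delayed_event` path is only the shape suggested in this comment, not an endpoint the MSC defines.

```python
import json
from urllib.parse import quote


def build_delayed_send(room_id, event_type, txn_id, delay_ms, content):
    """Return (method, path, body) for the dedicated-endpoint variant."""
    if delay_ms < 0:
        raise ValueError("delay_ms must be non-negative")
    path = "/_matrix/client/v1/rooms/{}/send_delayed_event/{}/{}".format(
        quote(room_id, safe=""), quote(event_type, safe=""), quote(txn_id, safe="")
    )
    # The delay sits next to the event content instead of in the query string.
    body = json.dumps({"delay": delay_ms, "content": content})
    return "PUT", path, body
```

Since the delay lives in the body, the path (and therefore the transaction scope) is distinct from that of a plain `/send` request.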

Since the delayed event is sent first, a client can guarantee (at the time they are sending
the join event) that it will eventually leave.

### Self-destructing messages
Contributor

This is a really good use case, but I see a fundamental problem in this MSC: delayed events are not passed down the sync. So even if one device sends a self-destructing message, other devices of the same account would not be notified about it in the sync. Other devices would therefore always need to poll delayed_events, which breaks the concept of the Matrix sync and would unnecessarily flood the homeserver with requests.

Member

@tulir tulir Oct 21, 2025

I don't think there's any need to know about delayed events in that context. The sender creates the delay, other devices don't care. Beeper already implements disappearing messages like that.

Author

If this is just about the user experience and communicating your intent to only let this message live for X minutes, that seems like an easy addition in a new MSC. A content field, "scheduled_redaction_ms", could be used to share when you plan to let this message disappear, and clients can render some UI around that.

This is off topic, but if burn-on-read semantics (in a DM) are desired, one could even go as far as sharing a link with a scoped token that sends the delayed redaction. Once the message is received, the receiver can then delete it by sending the redaction scheduled by the sender of the message.

What I am trying to say is that this MSC supports all the fundamentals for a really good self-destruction implementation. I am not sure it needs fundamental changes, but maybe some metadata on top is required to check off UX features.
But this MSC proposes the general delayed event logic and is not specific to self-destructing messages.

Contributor

> other devices don't care

If that is the case, do we need the polling api GET delayed_events? (we could close this thread, because the other is regarding the same topic)

Member

@tulir tulir Oct 21, 2025

Querying delayed events is necessary for other things like scheduled messages even though it's not needed for disappearing messages. Scheduled messages don't need a push mechanism, it's enough to be able to view and manage scheduled events when the user navigates to a specific view inside a room.

> If this is just about the user experience and communicating your intent to only let this message live for X minutes, that seems like an easy addition in a new MSC. A content field, "scheduled_redaction_ms", could be used to share when you plan to let this message disappear, and clients can render some UI around that.

This is indeed how Beeper works (using a com.beeper.disappearing_timer object) and that'll probably be MSC'd at some point

The primary point of rate limiting is event sending when the delay times out or the event is sent using the `send`
action. However, servers can choose to rate limit the management endpoints themselves as well if necessary.

### Getting delayed events
Contributor

@benkuly benkuly Oct 21, 2025

This is a poll mechanism. In my opinion, we would need a push (sync) mechanism to really take advantage of this MSC and be able to introduce a bunch of new Matrix features based on it.

Author

Similar to the above.
There is even a follow-up MSC moving this into the sync block.
But your comments implicitly ask who should be able to access the list of scheduled delayed events: "the sender" vs. "all room members". Right now the idea is that only the sender knows about their scheduled delayed events. (A DAG-like federated data exchange is required to sync information to all room members on other homeservers, so sharing a scheduled delayed event is not trivial; there is a reason Matrix writes things into a room DAG.)

The current MSC only ever exposes the scheduled delayed events to the sender.
This also has privacy/security advantages.
Whenever it is desired to share scheduled data with the room, metadata in the content of another room event should be used.

This might be worth explicitly mentioning in the MSC.

I hope this approach makes sense and covers all of the use cases you have in mind?

Contributor

I don't expect delayed_events to be shared between users, but between devices of the same user (like account data). If we don't want that, which is totally okay, do we want an endpoint to get all delayed_events on a per-account basis?

The other problem I see is that the MSC suggests a new, poll-based endpoint for getting delayed_events. As far as I know, there is no other data structure in Matrix that needs polling. What is the use case for polling delayed_events instead of syncing them? Why this "downgrade" from the Matrix push-first principle?

Author

There is a follow-up MSC to include it in sync (sorry that I confused per-user and per-room sharing of delayed events).
The "all devices for one user" push/sync-based system has been moved to that MSC:
#4309

The reason was to make this msc simpler.

It only includes finalised delayed events, so your point about syncing schedules to all devices via push semantics is very valid.

What do you think about moving/discussing this functionality to the other MSC?

(It sadly is still in draft)

Also:
- split them into one endpoint per management action
- propose alternative of OAuth 2.0 scoped access for the endpoints
and list it as what the Synapse implementation currently uses
last restarted.
- `content` - Required. The content of the delayed event. This is the body of the original `PUT` request, not a preview
of the full event after sending.
- `finalised` - Required if the request has no `status` parameter, or sets it to `"finalised"`.
Contributor

I think we agreed oob to drop finalised from this endpoint, so it only returns events due to be sent now. Perhaps for future proofing we can have GET /_matrix/client/v1/delayed_events/scheduled.

Member

See #4140 (comment) for some brainstorming on this.

Without this parameter, delayed events of all status types are included in the response.
Requests that set this parameter to an unsupported value will respond with HTTP 400 and `M_UNKNOWN`.
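
As a rough sketch of the behaviour in this draft text, a server-side check of the `status` parameter might look like the following. `resolve_status_filter` is a hypothetical helper, the error message wording is made up, and an empty `status=` value is treated here as if the parameter were absent.

```python
from urllib.parse import parse_qs

ALL_STATUSES = ("scheduled", "finalised")


def resolve_status_filter(query_string):
    """Return (http_status, payload) for the `status` query parameter."""
    values = parse_qs(query_string).get("status")
    if values is None:
        # No filter given: include delayed events of every status.
        return 200, list(ALL_STATUSES)
    status = values[-1]
    if status not in ALL_STATUSES:
        return 400, {"errcode": "M_UNKNOWN", "error": f"Unsupported status: {status}"}
    return 200, [status]
```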

The endpoint accepts a query parameter of `delay_id=<delay_id>` to filter the response on delayed events with a matching ID.
Contributor

I think we also said this should just be a GET /_matrix/client/v1/delayed_events/{delay_id}.

This could return a finalised event, although I feel like homeservers should be able to clear up old delay_ids from the database so I wouldn't make this a hard requirement.

Member

Yes, this endpoint is mentioned at the halfway mark of #4140 (comment)

The batch size and the number of terminated events that stay on the homeserver can be chosen by the homeserver.
The recommended values are:

- batch size: 10
Contributor

I don't really understand the rationale for retaining finalised events to be honest, what's the use case?

Member

It's less about retaining them & more about purging them, i.e. about setting a limit on how many may be retained. Server implementations will inevitably need some purging scheme lest they end up storing an unbounded number of finalised events, but the benefit of standardising purge rules (and configs for them) is to give server implementers a starting point / baseline of how it ought to be done, as opposed to each having to define its own scheme (which may very well converge to the rules proposed here).
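
One simple purging scheme of this kind is a fixed cap that evicts the oldest finalised entries first. The sketch below assumes that policy; `FinalisedStore` and its `max_retained` parameter are hypothetical, and the cap value is left to the caller rather than taken from the MSC.

```python
from collections import OrderedDict


class FinalisedStore:
    """Keeps at most `max_retained` finalised delayed events, oldest first out."""

    def __init__(self, max_retained):
        self._max = max_retained
        self._events = OrderedDict()

    def add(self, delay_id, record):
        self._events[delay_id] = record
        self._events.move_to_end(delay_id)
        while len(self._events) > self._max:
            self._events.popitem(last=False)  # drop the oldest entry

    def get(self, delay_id):
        return self._events.get(delay_id)
```

A time-based limit (purging anything finalised more than some duration ago) would be an equally plausible rule; the count-based cap is shown only because it is the shortest to express.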

}
```

The `delay_id` is an [opaque identifier](https://spec.matrix.org/v1.16/appendices/#opaque-identifiers) generated by the server.
Contributor

It would also be nice if the delay_id ended up in the unsigned content of the event down sync, so your client could identify which event mapped to which delay. I don't know how important this is, but surely it would be the nice thing to do.

Member

That's a good idea, and would help with at least one edge case in Synapse that I can think of 🙂 What makes this possible is that the need to keep a delay_id secret (as it currently doubles as a token for having control over it) goes away once a delayed event is sent.

The only "information leak" is that others will be able to identify when an event was sent with a delay. IMO that is not too bad (and maybe even useful) but perhaps some users would prefer not to expose that.
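
If the suggestion were adopted, a client could build the mapping as sketched below. The `delay_id` key inside `unsigned` is an assumption made for this example (the thread does not fix a field name), and `map_delays_to_events` is a hypothetical helper.

```python
def map_delays_to_events(timeline_events):
    """Map each delay_id found in `unsigned` to the event ID it became."""
    mapping = {}
    for event in timeline_events:
        # Assumed field name; events sent without a delay simply lack it.
        delay_id = event.get("unsigned", {}).get("delay_id")
        if delay_id is not None:
            mapping[delay_id] = event["event_id"]
    return mapping
```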

Labels

client-server Client-Server API implementation-needs-checking The MSC has an implementation, but the SCT has not yet checked it. kind:feature MSC for not-core and not-maintenance stuff matrix-2.0 Required for Matrix 2.0 proposal A matrix spec change proposal voip

Projects

Status: Tracking for review
