MSC4258: Federated User Directory #4258

Open: wants to merge 4 commits into `main`

@MatMaul MatMaul commented Jan 29, 2025

Rendered

This proposal was conceived and written by me and the people listed below, all employed by the French state for the Tchap project.

@mcalinghee @odelcroi @yostyle @NicolasBuquet

@MatMaul MatMaul changed the title MSC5000: Federated User Directory MSC4258: Federated User Directory Jan 29, 2025
@MatMaul MatMaul force-pushed the fed-user-dir branch 3 times, most recently from 224fe5c to f4ee6bc Compare January 29, 2025 15:22
Co-authored-by: Maghen Calinghee <[email protected]>
Co-authored-by: Olivier Delcroix <[email protected]>
Co-authored-by: Yoan Pintas <[email protected]>
Co-authored-by:	Nicolas Buquet <[email protected]>
@MatMaul MatMaul marked this pull request as ready for review January 29, 2025 15:23
Author @MatMaul commented Jan 29, 2025

We plan to implement the MSC in the coming months; in the meantime, we would happily take early feedback.

@turt2live turt2live added the proposal, s2s, kind:feature, and needs-implementation labels Jan 29, 2025
Member commented:

Implementation requirements:

  • Client
  • Server

@turt2live turt2live added the client-server label Jan 29, 2025
```json
{
  "limit": 10,
  "search_term": "foo"
```
Member commented:

Is there guidance on how the search term is used? Is it the same as the current API?

Author commented:

The current spec says to search the Matrix ID and the display name, and for now this proposal doesn't change that.

Should we change that to search all profile fields as well? I am not sure; opinions are welcome here.

Participant commented:

Once the user ID search term contains (part of) a domain name, we can probably stop asking random servers? For example, the search term "@Steve:matri" probably does not need to query the etke.cc server for users. I am just a little worried about the performance impact of these searches (although most searches will probably NOT contain domain parts; not sure, really).

Author commented:

I think this should be left as an implementation detail: we added `or a subset of it` at L47, so the homeserver can be quite liberal in its caching and requesting behavior.
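
For illustration only (the proposal leaves this to implementations): one way a homeserver might narrow the set of queried servers when the search term already contains part of a server name. The helper name `select_target_servers` and the prefix-matching rule are assumptions made for this sketch, not something the MSC defines.

```python
from typing import Iterable, List


def select_target_servers(search_term: str, known_servers: Iterable[str]) -> List[str]:
    """Return the servers worth querying for this search term (hypothetical helper)."""
    servers = list(known_servers)
    # Only a term shaped like a user ID ("@localpart:domain...") carries a domain hint.
    if not search_term.startswith("@") or ":" not in search_term:
        return servers
    domain_part = search_term.split(":", 1)[1].lower()
    if not domain_part:
        return servers
    # Assumption: keep only servers whose name starts with the typed domain fragment.
    return [s for s in servers if s.lower().startswith(domain_part)]
```

With this rule, `select_target_servers("@Steve:matri", ["matrix.org", "etke.cc"])` would query only `matrix.org`, while a term without a domain part still goes to every known server.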

We first propose a new federation endpoint similar to the [current client API](https://spec.matrix.org/v1.12/client-server-api#post_matrixclientv3user_directorysearch).
It would be authenticated and rate limited.

#### `POST /_matrix/federation/v3/user_directory/search`
Member commented:

What are the valid error conditions?

Contributor commented:

Tagging on to this, the profile federation API has a 403 to let server admins deny profile look-up. This might be good to have on the user directory API as well.

Author commented:

I can see several cases:

  • an empty list should be returned if all matching users have visibility settings that would hide them.
  • federated user directory is fully disabled on the server. 404 could be used here.
  • federated user directory is restricted to a set of allowed servers for example. We should probably use 403 then.
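
The three cases above can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the function name, its parameters, and the use of the standard Matrix error codes `M_NOT_FOUND` and `M_FORBIDDEN` for these cases are illustrative choices, and the `limited`/`results` body shape is borrowed from the existing client search API rather than confirmed by the MSC.

```python
from typing import Any, Dict, List, Optional, Set, Tuple


def handle_federated_search(
    origin: str,
    directory_enabled: bool,
    allowed_servers: Optional[Set[str]],
    visible_results: List[Dict[str, Any]],
) -> Tuple[int, Dict[str, Any]]:
    """Return (HTTP status, JSON body) for a federated search request (sketch)."""
    if not directory_enabled:
        # Federated user directory is fully disabled on this server.
        return 404, {"errcode": "M_NOT_FOUND", "error": "Federated user directory is disabled"}
    if allowed_servers is not None and origin not in allowed_servers:
        # Directory is restricted to an allow-list that does not include the requester.
        return 403, {"errcode": "M_FORBIDDEN", "error": "Server not allowed to search the directory"}
    # The list may be empty when every match is hidden by its visibility setting.
    return 200, {"limited": False, "results": visible_results}
```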


All profile fields (cf [MSC4133](https://github.com/matrix-org/matrix-spec-proposals/pull/4133)) should be returned here.

When a user calls the client user search API, the server should send a federated user search request to all known servers. It would then receive the results and return them to the user.
Member commented:

This sounds really really expensive for the server.

Contributor commented:

This could benefit from #4259 🙂

Participant commented:

Indeed MSC #4259 could/should be mentioned as a possibility to build upon?

Author commented:

> This sounds really really expensive for the server.

We should probably relax the requirement to include all known servers, so that implementations can optimize if needed: `to all known servers or a subset of it`, for example?

I am not sure the impact will be that heavy, however: typing in a room with almost all servers, like Matrix HQ, will similarly broadcast events to all of them.

> This could benefit from #4259 🙂
>
> Indeed MSC #4259 could/should be mentioned as a possibility to build upon?

Could you elaborate on what you both have in mind? For now, it can help a bit to receive profile updates if we have a local search cache, but I don't really see anything else.
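
To make the cost discussion concrete, here is a rough sketch of the fan-out with a cap on the number of queried servers and a per-server timeout. Everything here is an assumption for illustration: the `query` callable stands in for the real federation request, and `max_servers` and `timeout` are hypothetical knobs, not values from the proposal.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Iterable, List

# Hypothetical signature: (server_name, search_term) -> list of result entries.
QueryFn = Callable[[str, str], Awaitable[List[Dict[str, Any]]]]


async def federated_search(
    search_term: str,
    servers: Iterable[str],
    query: QueryFn,
    timeout: float = 2.0,
    max_servers: int = 20,
) -> List[Dict[str, Any]]:
    """Query a subset of known servers concurrently and merge results by user ID."""
    targets = list(servers)[:max_servers]  # "all known servers or a subset of it"

    async def query_one(server: str) -> List[Dict[str, Any]]:
        try:
            return await asyncio.wait_for(query(server, search_term), timeout)
        except Exception:
            # A slow or failing server must not break the whole search.
            return []

    batches = await asyncio.gather(*(query_one(s) for s in targets))
    merged: Dict[str, Dict[str, Any]] = {}
    for batch in batches:
        for user in batch:
            merged.setdefault(user["user_id"], user)  # first answer wins
    return list(merged.values())
```

The cap bounds the number of outgoing requests per search regardless of how many servers the homeserver knows about; which servers make it into the subset is left open here.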

- `restricted`: visible to any user sharing a room with
- `remote` (or federated or public ?): visible to users on local and remote homeservers

If no value is provided (or it is null), the user hasn't set a preference and the server should follow the current expected behavior (visible if sharing a room in common or in public room).
Contributor @Johennes commented Feb 4, 2025:

"visible if sharing a room in common or in public room" is actually only the minimum requirement.

Author commented:

This proposal would change that, and user-defined visibility would prevail, as expected by the users :)

- introduce `search_scope` in the client API
@spaetz left a comment:

Thanks, it is such a hard problem (mainly performance and privacy). There is a reason why there is no federated email directory....
A few minor comments inline.

@@ -0,0 +1,162 @@
# MSC4258: Federated User Directory

Currently user search can only be done locally, which would at best get a list of all users known to the server.
Participant commented:

nitpick: I know what is meant, but the grammar of the second half of that sentence seems weird or at least difficult to parse.


The federation search endpoint should be rate limited.

We recommend to not answer for `search_term` with less than 3 characters like "a" or "at".
Participant commented:

so @A:matrix.org could never be found?

Author commented:

The implementation could just return the exact match but not the rest. I'll add it.
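
A minimal sketch of that behavior, assuming a simple in-memory list of users: terms shorter than the recommended 3 characters skip substring matching but still return an exact localpart match. The names `search_users`, `localpart`, and `MIN_SEARCH_TERM_LENGTH` are hypothetical.

```python
from typing import Dict, List

MIN_SEARCH_TERM_LENGTH = 3  # threshold recommended in the proposal text


def localpart(user_id: str) -> str:
    """Return the part of a Matrix user ID between '@' and the first ':'."""
    return user_id[1:].split(":", 1)[0]


def search_users(search_term: str, users: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Substring search, or exact localpart match only for short terms (sketch)."""
    term = search_term.lower()
    if len(term) < MIN_SEARCH_TERM_LENGTH:
        # Short term: no substring matching, but an exact localpart match is still returned.
        return [u for u in users if localpart(u["user_id"]).lower() == term]
    return [
        u for u in users
        if term in u["user_id"].lower() or term in u.get("display_name", "").lower()
    ]
```

Under this rule, searching for "a" would find `@A:matrix.org` but not every user whose ID or display name merely contains the letter.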

Participant commented:

Suggested change
- Currently user search can only be done locally, which would at best get a list of all users known to the server.
+ Currently user search can only be done locally, which lists - at maximum - all users known to the user's server.

"display_name": "Foo",
"m.tz": "America/New_York",
"user_id": "@foo:bar.com",
"visibility": "local",
Contributor commented:

I think we may have discussed this in another thread already but since I cannot find it: I don't think this should be exposed in the response regardless of whether it's a local or a remote query. It is a user configuration and I cannot think of a reason why the requester would need to know whether they found the user because their visibility setting was local, restricted or remote.

Author @MatMaul commented Apr 7, 2025:

This is only in the federation response; we removed it from the client one after switching to account data. It is useful for applying `restricted` visibility on the requesting server.

It could be omitted if we pass the requester in the request; the remote server would then be able to work out whether it should return the result.

We were thinking more about caching when designing the API: having the visibility lets us cache the request for all users, whereas without it we can only cache the result per user.

Perhaps we should just return `"visibility": "restricted"` and nothing otherwise as a tradeoff; I am not sure. But yes, I agree the current state is too leaky.

Contributor commented:

Oh, I see. Sorry, I had mistaken this as the CS response. 🤦‍♂️

In that case this seems acceptable to me, since the field is required for filtering by the server and is stripped out of the CS response.
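
As a sketch of the flow discussed in this thread: the requesting server filters a (cacheable) federation response per local user and strips `visibility` before building the client response. The function name and the `users_sharing_room` input are hypothetical, and two behaviors are assumptions of this sketch rather than rules from the MSC: a missing `visibility` is treated like `restricted`, and `hidden` or `local` entries are dropped if a remote server returns them anyway.

```python
from typing import Any, Dict, List, Set


def filter_remote_results(
    results: List[Dict[str, Any]],
    users_sharing_room: Set[str],
) -> List[Dict[str, Any]]:
    """Apply visibility on the requesting server, then strip the field (sketch)."""
    visible = []
    for entry in results:
        visibility = entry.get("visibility")
        if visibility == "remote":
            keep = True
        elif visibility in ("restricted", None):
            # Assumption: no preference is handled like `restricted` here.
            keep = entry["user_id"] in users_sharing_room
        else:
            # `hidden` or `local` should never reach a remote requester; drop defensively.
            keep = False
        if keep:
            # The client-server response must not expose the visibility setting.
            visible.append({k: v for k, v in entry.items() if k != "visibility"})
    return visible
```

Because the filter only needs the requesting user's shared-room set, the raw federation response can be cached once per search term and reused for every local user.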


#### New account data to control user visibility in the directory

We propose to add a new account data of type `m.user_directory` with a single `visibility` field to give the user the ability to control their visibility in the user directory.
Contributor commented:

For rooms, their directory visibility is integrated into /createRoom. Would it make sense to do something similar for users and integrate their initial visibility choice with user registration?

Author commented:

I don't have a strong opinion; let's wait for others' opinions on that, I think.

Comment on lines +61 to +64
- `hidden` : not visible to anyone
- `local` : visible only to local homeserver users
- `restricted`: visible to any user sharing a room with
- `remote`: visible to users on local and remote homeservers
Contributor commented:

For rooms, the visibility values are private and public. This could be mimicked here by using private, local, restricted and public. The terms "private" and "public" have been deemed ambiguous in the past. At the same time, however, maybe we're making things worse by adding yet another terminology. I don't have a strong opinion myself here to be honest.

Author commented:

I have a slight preference for avoiding the ambiguity, but both are fine with me.

Contributor commented:

To follow up on #4258 (comment), what I meant is something like this:

Suggested change
- If no value is provided (or it is null), the user hasn't set a preference and the server should follow the current expected behavior (visible if sharing a room in common or in public room).
+ If no value is provided (or it is null), the user hasn't set a preference and the server should follow the current expected behavior (MUST be visible if sharing a room in common or in public room, MAY still be visible in all other cases if the server chooses so).
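
For reference, the four visibility values and the unset fallback can be read as the following decision function. This is only a sketch: the function and parameter names are hypothetical, and the fallback implements just the minimum (MUST) part of the wording above, leaving out any extra visibility a server MAY choose to grant.

```python
from typing import Optional


def is_visible(
    visibility: Optional[str],
    requester_is_local: bool,
    shares_room: bool,
    target_in_public_room: bool,
) -> bool:
    """Decide whether a user appears in a requester's directory search (sketch)."""
    if visibility is None:
        # No preference set: current behavior, minimum requirement only.
        return shares_room or target_in_public_room
    if visibility == "hidden":
        return False
    if visibility == "local":
        return requester_is_local
    if visibility == "restricted":
        return shares_room
    if visibility == "remote":
        return True
    raise ValueError(f"unknown visibility value: {visibility!r}")
```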

We also propose a new `search_scope` parameter to limit the scope of a search.
Possible values are:
- `local` : only search users local to the homeserver, this must not trigger a federated search
- `restricted`: search users known to this homeserver, this must not trigger a federated search
Contributor commented:

Would the server still have to trigger federated profile queries for external users in this case?

Author commented:

I am not sure, honestly. I think it can be left as an implementation detail for now at least, and if one way or the other turns out to be clearly superior during implementation, we can specify it.
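
A small sketch of how a server might interpret the two `search_scope` values quoted above. The helper names are hypothetical, and the handling of any other or missing value (treated here as allowing a federated search) is an assumption, since the quoted hunk only shows `local` and `restricted`.

```python
from typing import Iterable, List, Optional

# Scopes that, per the quoted text, must not trigger a federated search.
NON_FEDERATED_SCOPES = frozenset({"local", "restricted"})


def may_trigger_federated_search(search_scope: Optional[str]) -> bool:
    """Assumption: any scope other than `local`/`restricted` may go over federation."""
    return search_scope not in NON_FEDERATED_SCOPES


def candidate_user_ids(
    search_scope: Optional[str],
    server_name: str,
    known_user_ids: Iterable[str],
) -> List[str]:
    """Pick the locally stored users to search for a given scope (sketch)."""
    known = list(known_user_ids)
    if search_scope == "local":
        # Only users whose ID belongs to this homeserver.
        return [u for u in known if u.split(":", 1)[1] == server_name]
    # `restricted` (and, by assumption, any other scope): all users known to this server.
    return known
```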
