
Add deployment.GetFullyQualifiedHomeserverName(t, hsName) to support custom Deployment #780

Open
wants to merge 12 commits into
base: main
from

Conversation

MadLittleMods
Collaborator

@MadLittleMods MadLittleMods commented May 16, 2025

Split off from #778 per discussion.

Spawning from a real use-case with a custom Deployment/Deployer (Element-internal).

Introduce complement.Deployment.GetFullyQualifiedHomeserverName(t, hsName) to allow the per-deployment short homeserver aliases like hs1 to be mapped to something else that's resolvable in your custom deployment's context. Example: hs1 -> hs1.shard1:8481.

This is useful for situations like the following where you have to specify the via servers in a federated join request during a test:

alice.MustJoinRoom(t, bobRoomID, []spec.ServerName{
	deployment.GetFullyQualifiedHomeserverName(t, "hs2"),
})

But why does this have to be part of the complement.Deployment interface instead of your own custom deployment?

  • Tests only have access to the generic complement.Deployment interface
  • We can't derive fully-qualified homeserver names from the existing complement.Deployment interface
  • While we could cheekily cast the generic complement.Deployment back to CustomDeployment in our own tests (and have the helper in the CustomDeployment instead), if we start using something exotic in our out-of-repo Complement tests, the suite of existing Complement tests in this repo will not be compatible.

(also see below)
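To make the trade-off concrete, here is a minimal sketch contrasting the two approaches. Everything in it is a local stand-in: `ServerName` substitutes for spec.ServerName, `Deployment` is reduced to the single method under discussion (without the test handle), and `customDeployment` and its `suffix` field are hypothetical. It only illustrates why a helper on the interface keeps tests portable while a cast does not.

```go
package main

import "fmt"

// ServerName is a local stand-in for spec.ServerName.
type ServerName string

// Deployment is a trimmed-down stand-in for complement.Deployment, reduced to
// the one method discussed here (the real method also takes a test handle).
type Deployment interface {
	GetFullyQualifiedHomeserverName(hsName string) ServerName
}

// dockerDeployment mimics the built-in behaviour: the short alias is already resolvable.
type dockerDeployment struct{}

func (dockerDeployment) GetFullyQualifiedHomeserverName(hsName string) ServerName {
	return ServerName(hsName)
}

// customDeployment is hypothetical: it appends a deployment-specific suffix.
type customDeployment struct{ suffix string }

func (c customDeployment) GetFullyQualifiedHomeserverName(hsName string) ServerName {
	return ServerName(hsName + c.suffix)
}

// viaFromInterface works with every Deployment implementation.
func viaFromInterface(d Deployment, hsName string) []ServerName {
	return []ServerName{d.GetFullyQualifiedHomeserverName(hsName)}
}

// viaFromCast only works when the deployment happens to be a customDeployment,
// so a test written this way cannot run against the built-in deployment.
func viaFromCast(d Deployment, hsName string) ([]ServerName, bool) {
	custom, ok := d.(customDeployment)
	if !ok {
		return nil, false
	}
	return []ServerName{ServerName(hsName + custom.suffix)}, true
}

func main() {
	deployments := []Deployment{dockerDeployment{}, customDeployment{suffix: ".shard1:8481"}}
	for _, d := range deployments {
		_, castOK := viaFromCast(d, "hs2")
		fmt.Println(viaFromInterface(d, "hs2"), "cast usable:", castOK)
	}
}
```

The interface-based helper returns a usable via list for both deployments, whereas the cast-based one has nothing to return for the built-in deployment.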

Motivating custom Deployment use case

complement.Deployment is an interface that can be backed by anything. For reference, custom deployments were introduced in #750. The default Deployment naming scheme in Complement is hs1, hs2, etc. It's really nice and convenient to be able to simply refer to homeservers as hs1, hs2, and so on within a deployment. Using consistent names like this also makes tests compatible with each other regardless of which Deployment is being used.

The built-in Deployment in Complement has each homeserver in a Docker container which already has network aliases like hs1, hs2, etc so no translation is needed from friendly name to resolvable address. When one homeserver needs to federate with the other, it can simply make a request to https://hs1:8448/... per spec on resolving server names.

Right now, we hard-code hs1 across the tests when we specify "via" servers in join requests, but that only works if you follow the strict single-deployment naming scheme.

bob.MustJoinRoom(t, roomID, []string{"hs1"})

But imagine a case where we have multiple Deployments and we want the homeservers to communicate with each other. If we keep using the consistent hs1, hs2 naming scheme for each Deployment, we're going to have conflicts. This is where deployment.GetFullyQualifiedHomeserverName(t, hsName) comes in handy. We can call deployment1.GetFullyQualifiedHomeserverName(t, "hs1") -> hs1.shard1 and also deployment2.GetFullyQualifiedHomeserverName(t, "hs1") -> hs1.shard2 to get their unique resolvable addresses in the network.

Additionally, the helper removes the constraint of needing the network to strictly resolve the hs1, hs2 hostnames to their respective homeservers. Whenever you need to refer to another homeserver, use deployment.GetFullyQualifiedHomeserverName(t, hsName) to take care of the nuances of the environment that the given Deployment creates.
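As a rough sketch of the multi-deployment case described above: two deployments both contain an hs1, and each maps it to a distinct name. The `shardDeployment` type, its fields, and the `panicT` test handle are hypothetical; `ServerName` and `testLike` are local stand-ins for spec.ServerName and ct.TestLike so the example compiles without Complement.

```go
package main

import "fmt"

// ServerName is a local stand-in for spec.ServerName.
type ServerName string

// testLike is a minimal stand-in for ct.TestLike.
type testLike interface {
	Fatalf(format string, args ...interface{})
}

// shardDeployment is a hypothetical custom deployment whose homeservers are
// reachable as "<alias>.<shard>".
type shardDeployment struct {
	shard       string
	homeservers map[string]bool
}

// GetFullyQualifiedHomeserverName maps a short alias to a name that is unique
// across shards, failing the test for unknown aliases.
func (d *shardDeployment) GetFullyQualifiedHomeserverName(t testLike, hsName string) ServerName {
	if !d.homeservers[hsName] {
		t.Fatalf("homeserver %q not found in %s", hsName, d.shard)
	}
	return ServerName(fmt.Sprintf("%s.%s", hsName, d.shard))
}

// panicT is a throwaway test handle that panics instead of failing a test.
type panicT struct{}

func (panicT) Fatalf(format string, args ...interface{}) {
	panic(fmt.Sprintf(format, args...))
}

func main() {
	t := panicT{}
	deployment1 := &shardDeployment{shard: "shard1", homeservers: map[string]bool{"hs1": true, "hs2": true}}
	deployment2 := &shardDeployment{shard: "shard2", homeservers: map[string]bool{"hs1": true, "hs2": true}}

	// The same short alias resolves to a different name per deployment.
	fmt.Println(deployment1.GetFullyQualifiedHomeserverName(t, "hs1"))
	fmt.Println(deployment2.GetFullyQualifiedHomeserverName(t, "hs1"))
}
```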

Todo

  • Update tests to use GetFullyQualifiedHomeserverName(...)
    • Join room (MustJoinRoom, JoinRoom)
    • Knock room (mustKnockOnRoomSynced, knockOnRoomWithStatus)
    • srv.MustJoinRoom, srv.MustLeaveRoom, srv.MustSendTransaction
    • FederationClient -> fedClient.MakeJoin, fedClient.SendJoin, etc
    • fclient, fclient.NewFederationRequest
    • m.space.child via
    • m.space.parent via
    • m.room.join_rules restricted via
    • gomatrixserverlib.EDU Destination
  • Potentially update the built-in Deployment implementation to support multiple deployments at the same time, tracked by this discussion below

Pull Request Checklist

Example:
```
alice.MustJoinRoom(t, bobRoomID, []spec.ServerName{
	shardDeployment2.GetFullyQualifiedHomeserverName(t, "hs1"),
})
```
@MadLittleMods MadLittleMods force-pushed the madlittlemods/deployment-fqdn-helper branch from 0e5756a to e5ff236 May 16, 2025 15:59
@MadLittleMods MadLittleMods changed the title from "Add deployment.GetFullyQualifiedHomeserverName(hsName)" to "Add deployment.GetFullyQualifiedHomeserverName(hsName) to support custom Deployment" May 16, 2025
@MadLittleMods MadLittleMods changed the title from "Add deployment.GetFullyQualifiedHomeserverName(hsName) to support custom Deployment" to "Add deployment.GetFullyQualifiedHomeserverName(t, hsName) to support custom Deployment" May 16, 2025
Comment on lines 142 to +147
 // MustJoinRoom joins the room ID or alias given, else fails the test. Returns the room ID.
-func (c *CSAPI) MustJoinRoom(t ct.TestLike, roomIDOrAlias string, serverNames []string) string {
+//
+// Args:
+// - `serverNames`: The list of servers to attempt to join the room through.
+// These should be resolvable addresses within the deployment network.
+func (c *CSAPI) MustJoinRoom(t ct.TestLike, roomIDOrAlias string, serverNames []spec.ServerName) string {
Collaborator Author

@MadLittleMods MadLittleMods May 16, 2025

Basically, wherever we expect people to use deployment.GetFullyQualifiedHomeserverName(t, hsName), I've updated these function signatures to accept spec.ServerName instead of just plain strings.

I also think this is more semantically correct for these places, because the value needs to be a resolvable homeserver address in the federation.

@@ -117,7 +117,7 @@ func MakeRespMakeKnock(s *Server, room *ServerRoom, userID string) (resp fclient
// the current server is returned to the joining server.
func SendJoinRequestsHandler(s *Server, w http.ResponseWriter, req *http.Request, expectPartialState bool, omitServersInRoom bool) {
fedReq, errResp := fclient.VerifyHTTPRequest(
req, time.Now(), spec.ServerName(s.serverName), nil, s.keyRing,
Collaborator Author

Since we've converted some of these types to spec.ServerName, we no longer need the conversion.

Comment on lines +521 to +522
// TODO: It feels like `ServersInRoom` should be `[]spec.ServerName` instead of `[]string`
ServersInRoom: serversInRoomStrings,
Collaborator Author

This feels like a gomatrixserverlib problem.

(not going to fix in this PR)
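The name `serversInRoomStrings` suggests that the typed server names are converted back to plain strings before being assigned to `ServersInRoom`. A minimal sketch of such a conversion, assuming a local `ServerName` stand-in for spec.ServerName and a hypothetical `serverNamesToStrings` helper (the PR's actual code is not shown here):

```go
package main

import "fmt"

// ServerName is a local stand-in for spec.ServerName.
type ServerName string

// serverNamesToStrings is a hypothetical helper that converts typed server
// names into the plain strings expected by a []string field.
func serverNamesToStrings(names []ServerName) []string {
	out := make([]string, len(names))
	for i, name := range names {
		out[i] = string(name)
	}
	return out
}

func main() {
	serversInRoom := []ServerName{"hs1", "hs2.shard1:8481"}
	serversInRoomStrings := serverNamesToStrings(serversInRoom)
	fmt.Println(serversInRoomStrings)
}
```

If `ServersInRoom` were typed as `[]spec.ServerName` upstream, this conversion step would no longer be needed.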

@@ -63,7 +64,7 @@ func TestOutboundFederationSend(t *testing.T) {
roomAlias := srv.MakeAliasMapping("flibble", serverRoom.RoomID)

// the local homeserver joins the room
-alice.MustJoinRoom(t, roomAlias, []string{deployment.GetConfig().HostnameRunningComplement})
+alice.MustJoinRoom(t, roomAlias, []spec.ServerName{srv.ServerName()})
Collaborator Author

As far as I can tell, this was just a mistake from before. It's unclear how it worked, since the bare hostname is missing the randomly assigned port.
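To illustrate the difference between a bare hostname and a server name that carries a port, here is a small sketch using the standard library. The `hasExplicitPort` helper and the hostnames are made up for illustration and are not the values used by the test.

```go
package main

import (
	"fmt"
	"net"
)

// hasExplicitPort reports whether a server name of the form "host:port"
// carries a non-empty port.
func hasExplicitPort(serverName string) bool {
	_, port, err := net.SplitHostPort(serverName)
	return err == nil && port != ""
}

func main() {
	// Illustrative values only.
	for _, name := range []string{"complement.example", "complement.example:35871"} {
		fmt.Printf("%s has explicit port: %v\n", name, hasExplicitPort(name))
	}
}
```

A name without a port leaves it to the homeserver's server-name resolution to find one, which would not land on a randomly assigned port by itself.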

Comment on lines +24 to +30
// Returns the resolvable server name (host) of a homeserver given its short alias
// (e.g., "hs1", "hs2").
//
// In the case of the standard Docker deployment, this will be the same `hs1`, `hs2`
// but may be different for other custom deployments (ex.
// `shardDeployment1.GetFullyQualifiedHomeserverName(t, "hs1")` -> `hs1.shard1:8081`).
GetFullyQualifiedHomeserverName(t ct.TestLike, hsName string) spec.ServerName
Collaborator Author

The main change of this PR is here. It's a breaking change to the complement.Deployment interface.

The rest of the diff is essentially just updating to use this new utility.

Comment on lines +61 to +69
func (d *Deployment) GetFullyQualifiedHomeserverName(t ct.TestLike, hsName string) spec.ServerName {
	_, ok := d.HS[hsName]
	if !ok {
		ct.Fatalf(t, "Deployment.GetFullyQualifiedHomeserverName - HS name '%s' not found", hsName)
	}
	// We have network aliases for each Docker container that will resolve the `hsName` to
	// the container.
	return spec.ServerName(hsName)
}
Collaborator Author

This is the implementation for the new GetFullyQualifiedHomeserverName method that we added to the complement.Deployment interface.

@@ -27,7 +28,9 @@ func TestPresence(t *testing.T) {

// to share presence alice and bob must be in a shared room
roomID := alice.MustCreateRoom(t, map[string]interface{}{"preset": "public_chat"})
-bob.MustJoinRoom(t, roomID, []string{"hs1"})
+bob.MustJoinRoom(t, roomID, []spec.ServerName{
+	deployment.GetFullyQualifiedHomeserverName(t, "hs1"),
Collaborator Author

@MadLittleMods MadLittleMods May 16, 2025

For the reviewer: I've tried to be thorough in updating everything on this list (from the PR description). This would be the main thing to think about: are there other spots where we need to use GetFullyQualifiedHomeserverName(...) instead of the hard-coded hs1 values?

  • Update tests to use GetFullyQualifiedHomeserverName(...)
    • Join room (MustJoinRoom, JoinRoom)
    • Knock room (mustKnockOnRoomSynced, knockOnRoomWithStatus)
    • srv.MustJoinRoom, srv.MustLeaveRoom, srv.MustSendTransaction
    • FederationClient -> fedClient.MakeJoin, fedClient.SendJoin, etc
    • fclient, fclient.NewFederationRequest
    • m.space.child via
    • m.space.parent via
    • m.room.join_rules restricted via
    • gomatrixserverlib.EDU Destination

I've also reviewed the diff itself to ensure that I didn't accidentally swap hs1 for hs2 somewhere.

// the container.
return spec.ServerName(hsName)
}

Collaborator Author

Currently, I don't think the built-in Complement Deployment implementation supports multiple Deployments at the same time (hs1, hs2 would conflict between them). Since one of the goals of this PR is to unlock that functionality for custom Deployments, it could make some sense to also refactor and update that here as well.

I'd rather leave it as-is until we need it, or at least do this in a follow-up PR.

See the PR description for more context on multiple Deployments.

@MadLittleMods MadLittleMods marked this pull request as ready for review May 16, 2025 21:37
@MadLittleMods MadLittleMods requested review from kegsay and a team as code owners May 16, 2025 21:37
@MadLittleMods MadLittleMods removed request for a team May 16, 2025 21:37