Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions changelog.d/19062.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Fix a bug introduced in 1.111.0 where failed attempts to download authenticated remote media would not be handled correctly.
2 changes: 1 addition & 1 deletion synapse/http/matrixfederationclient.py
Original file line number Diff line number Diff line change
Expand Up @@ -1719,6 +1719,7 @@ async def federation_get_file(
response, output_stream, boundary, expected_size + 1
Copy link
Contributor

@MadLittleMods MadLittleMods Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have tests for this kind of timeout? (just wondering)

It looks like this was previously functional and the only problem was the unexpected logs/error handling so a high-level integration/end-to-end test wouldn't have really prevented this.

It looks like we already have other tests in tests/http/test_matrixfederationclient.py that ensure we raise a RequestSendFailed in certain scenarios. We could add one of those 🤷

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the pointer to the test file. I've written up a test (that fails on develop) in 1f20fe7.

)
deferred.addTimeout(self.default_timeout_seconds, self.reactor)
multipart_response = await make_deferred_yieldable(deferred)
Copy link
Contributor

@MadLittleMods MadLittleMods Oct 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be inside the async with self.remote_download_linearizer.queue(ip_address):?

Maybe. We do it in get_file which is probably what federation_get_file was based off of:

async with self.remote_download_linearizer.queue(ip_address):
# add a byte of headroom to max size as function errs at >=
d = read_body_with_max_size(response, output_stream, expected_size + 1)
d.addTimeout(self.default_timeout_seconds, self.reactor)
length = await make_deferred_yieldable(d)

And although this isn't explained anywhere, I guess the point of self.remote_download_linearizer is to ensure we only download 6 pieces of remote media at a time for some reason (maybe resource exhaustion, just a bunch of assumptions). So it makes sense that the download part is under the lock.

Copy link
Member Author

@anoadragon453 anoadragon453 Oct 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue yes. This code:

# add a byte of headroom to max size as `_MultipartParserProtocol.dataReceived` errs at >=
deferred = read_multipart_response(
    response, output_stream, boundary, expected_size + 1
)
deferred.addTimeout(self.default_timeout_seconds, self.reactor)

merely sets up the download, whereas:

multipart_response = await make_deferred_yieldable(deferred)

is where we actually wait for the download to complete. So I would expect the lineariser to cover the latter.

I can't speak to why we limit the number of downloads. Potentially the limit the number of open file descriptors? But 6 seems quite low for only that reason. 🤷

except BodyExceededMaxSize:
msg = "Requested file is too large > %r bytes" % (expected_size,)
logger.warning(
Expand Down Expand Up @@ -1755,7 +1756,6 @@ async def federation_get_file(
)
raise

multipart_response = await make_deferred_yieldable(deferred)
if not multipart_response.url:
assert multipart_response.length is not None
length = multipart_response.length
Expand Down
59 changes: 59 additions & 0 deletions tests/http/test_matrixfederationclient.py
Original file line number Diff line number Diff line change
Expand Up @@ -415,6 +415,65 @@ def test_authed_media_redirect_response(self) -> None:
self.assertEqual(length, len(data))
self.assertEqual(output_stream.getvalue(), data)

@override_config(
{
"federation": {
# Set the timeout to a deterministic value, in case the defaults
# change.
"client_timeout": "10s",
}
}
)
def test_authed_media_timeout_reading_body(self) -> None:
"""
If the HTTP request is connected, but gets no response before being
timed out, it'll give a RequestSendFailed with can_retry.

Regression test for https://github.com/element-hq/synapse/issues/19061
"""
limiter = Ratelimiter(
store=self.hs.get_datastores().main,
clock=self.clock,
cfg=RatelimitSettings(key="", per_second=0.17, burst_count=1048576),
)

output_stream = io.BytesIO()

d = defer.ensureDeferred(
# timeout is set by `client_timeout`, which we override above.
self.cl.federation_get_file(
"testserv:8008", "path", output_stream, limiter, "127.0.0.1", 10000
)
)

self.pump()

conn = Mock()
clients = self.reactor.tcpClients
client = clients[0][2].buildProtocol(None)
client.makeConnection(conn)

# Deferred does not have a result
self.assertNoResult(d)

# Send it the HTTP response
client.dataReceived(
b"HTTP/1.1 200 OK\r\n"
b"Server: Fake\r\n"
# Set a large content length, prompting the federation client to
# wait to receive the rest of the body.
b"Content-Length: 1000\r\n"
b"Content-Type: multipart/mixed; boundary=6067d4698f8d40a0a794ea7d7379d53a\r\n\r\n"
)

# Push by enough to time it out
self.reactor.advance(10.5)
f = self.failureResultOf(d)

self.assertIsInstance(f.value, RequestSendFailed)
self.assertTrue(f.value.can_retry)
self.assertIsInstance(f.value.inner_exception, defer.TimeoutError)

@parameterized.expand(["get_json", "post_json", "delete_json", "put_json"])
def test_timeout_reading_body(self, method_name: str) -> None:
"""
Expand Down
Loading