Implement HappyEyeballs for NettyConnectionPool (#1996) by 987Nabil · Pull Request #3876 · zio/zio-http

987Nabil · 2026-01-01T11:06:35Z

fixes #1996
/claim #1996

netlify · 2026-01-01T11:06:40Z

✅ Deploy Preview for zio-http ready!

Name	Link
🔨 Latest commit	`54df420`
🔍 Latest deploy log	https://app.netlify.com/projects/zio-http/deploys/69b4274b68e6770008eb1af3
😎 Deploy Preview	https://deploy-preview-3876--zio-http.netlify.app

kyri-petrou · 2026-01-19T01:59:47Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+          connectToAddress(
+            addresses.head,
+            channelFactory,
+            eventLoopGroup,
+            location,
+            initializer,
+            connectionTimeout,
+            localAddress,
+          ).onExit {
+            case e: Exit.Success[JChannel] => successful.update(channels => channels :+ e.value)
+            case _: Exit.Failure[_]        => lastFailed.offer(()).unit
+          },


Maybe put all of this in a function and reuse it to make the code a bit more readable:

    connect0 = connectToAddress(
      _,
      channelFactory,
      eventLoopGroup,
      location,
      initializer,
      connectionTimeout,
      localAddress,
    ).onExit {
      case e: Exit.Success[JChannel] => successful.update(channels => channels :+ e.value)
      case _: Exit.Failure[_]        => lastFailed.offer(()).unit
    }

kyri-petrou · 2026-01-19T02:01:27Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+          case ch: Some[JChannel] => ZIO.succeed(ch.value)
+          case None               => ZIO.fail(new RuntimeException("All connection attempts failed"))
+        }
+        _          <- ZIO.foreachDiscard(channels.tail)(ch => ZIO.ignore(ch.close()))


There's no guarantee that the channel we selected via the collectFirst will be at the head of the list (since the first one might fail the c.isOpen condition)
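
A minimal sketch of the problem described above, using plain Scala collections. `FakeChannel`, `losersByPosition` and `losersByIdentity` are hypothetical names introduced only for illustration; they are not part of zio-http or Netty. It shows why closing `channels.tail` can close the selected channel, and how filtering by reference identity avoids that.

```scala
object LoserSelection {
  // Hypothetical stand-in for a Netty channel; only the isOpen flag matters here.
  final class FakeChannel(val name: String, val isOpen: Boolean)

  // Position-based selection: assumes the chosen channel is always the head.
  def losersByPosition(channels: List[FakeChannel]): List[FakeChannel] =
    channels.drop(1)

  // Identity-based selection: everything except the chosen instance.
  def losersByIdentity(channels: List[FakeChannel], chosen: FakeChannel): List[FakeChannel] =
    channels.filterNot(_ eq chosen)

  def main(args: Array[String]): Unit = {
    val closed   = new FakeChannel("a", isOpen = false)
    val open     = new FakeChannel("b", isOpen = true)
    val channels = List(closed, open)

    // The first open channel is "b", which is NOT at the head of the list.
    val chosen = channels.collectFirst { case c if c.isOpen => c }.get

    println(losersByPosition(channels).map(_.name))         // would close the chosen channel
    println(losersByIdentity(channels, chosen).map(_.name)) // closes only the other one
  }
}
```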

kyri-petrou · 2026-01-19T02:04:59Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+            case _: Exit.Failure[_]        => lastFailed.offer(()).unit
+          },
+          addresses.tail.zipWithIndex.map { case (address, index) =>
+            ZIO.sleep(HappyEyeballsDelay * index.toDouble).raceFirst(lastFailed.take).ignore *>


I'm not sure if the usage of a Queue here is intended or if it's a bug.

With a Queue, only one of the fibers will be "interrupted" via raceFirst(lastFailed.take) because only a single fiber can poll the item from the queue. Is this a bug? If this is the intended operation, then why do we want to "interrupt" a single connection attempt only?
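
The single-consumer behaviour described above can be sketched with JDK primitives. This is an analogy only: `LinkedBlockingQueue` and `CountDownLatch` stand in for a queue-style and a broadcast-style signal and are not the ZIO types used in the PR. The helper names are hypothetical.

```scala
import java.util.concurrent.{CountDownLatch, LinkedBlockingQueue, TimeUnit}

object SignalSemantics {
  // Queue-style signal: one offered item is consumed by exactly one waiter.
  def queueDeliveries(waiters: Int): Int = {
    val queue = new LinkedBlockingQueue[String]()
    queue.offer("failed")
    // Each waiter polls once without blocking; only the first poll finds the item.
    (1 to waiters).count(_ => queue.poll() != null)
  }

  // Broadcast-style signal: a single countDown is observed by every waiter.
  def latchDeliveries(waiters: Int): Int = {
    val latch = new CountDownLatch(1)
    latch.countDown()
    // await returns true immediately because the count is already zero.
    (1 to waiters).count(_ => latch.await(0, TimeUnit.MILLISECONDS))
  }

  def main(args: Array[String]): Unit = {
    println(queueDeliveries(3)) // 1
    println(latchDeliveries(3)) // 3
  }
}
```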

kyri-petrou · 2026-01-19T02:05:17Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+          ).onExit {
+            case e: Exit.Success[JChannel] => successful.update(channels => channels :+ e.value)
+            case _: Exit.Failure[_]        => lastFailed.offer(()).unit
+          },


Don't we need raceFirst(lastFailed.take).ignore here as well?

kyri-petrou · 2026-01-19T02:23:40Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+            connectionTimeout,
+            localAddress,
+          ).onExit {
+            case e: Exit.Success[JChannel] => successful.update(channels => channels :+ e.value)


Previously I described how there's an issue here but perhaps my comment wasn't clear enough. What I meant was that since the release of the connections is managed by the Scope, you might end up in a case that the connection is acquired but not released until the Scope is closed, even if that connection lost the race.

Hope this reproducer helps to understand the issue:

    import zio.*

    final class Foo(val i: Int) {
      def acquired() = println(s"acquired: $i")
      def released() = println(s"released: $i")
    }

    object Main extends ZIOAppDefault {
      val sleepRandom = ZIO.randomWith(_.nextIntBounded(100).flatMap(i => ZIO.sleep(i.millis)))

      def resource(i: Int) = {
        val res = ZIO.acquireRelease(ZIO.succeed {
          val foo = new Foo(i)
          foo.acquired()
          foo
        })(f => ZIO.succeed(f.released()))

        (res.map(_.i) <* sleepRandom <* ZIO.debug(s"initialized: $i")).onInterrupt(ZIO.debug(s"interrupted: $i"))
      }

      def run = {
        val rs = (1 to 4).map(resource).toList
        val f = for {
          won <- ZIO.raceAll(rs.head, rs.tail)
          _   <- ZIO.sleep(10.millis)
          _   <- ZIO.debug(s"doing some stuff with resource: $won")
          _   <- ZIO.sleep(5.seconds)
          _   <- ZIO.debug("Finished work")
        } yield ()
        ZIO.scoped(f)
      }
    }

This prints the following which is really problematic, because released: 2 should be printed before the "Finished work" message.

    acquired: 2
    acquired: 1
    initialized: 1
    interrupted: 2
    doing some stuff with resource: 1
    Finished work
    released: 1
    released: 2

Copilot

Pull request overview

This PR implements the Happy Eyeballs algorithm (RFC 8305) for NettyConnectionPool to improve connection reliability and performance by racing IPv6 and IPv4 connection attempts with staggered delays.

Changes:

Replaced sequential connection attempts with the Happy Eyeballs algorithm that races multiple address families
Added address sorting logic to prioritize IPv6 while alternating between address families
Refactored connection logic to support both managed and unmanaged channel creation
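
As a rough illustration of the address ordering described above (IPv6 first, then alternating families), here is a minimal sketch in plain Scala. `interleaveByFamily` is a hypothetical helper written for this example; it is not the PR's `sortAddresses`, whose exact rules are not shown in this thread.

```scala
import java.net.{Inet6Address, InetAddress}

object AddressInterleaving {
  // Hypothetical helper: IPv6 first, then alternate between families,
  // keeping the original relative order within each family.
  def interleaveByFamily(addresses: List[InetAddress]): List[InetAddress] = {
    val (v6, v4) = addresses.partition(_.isInstanceOf[Inet6Address])
    // zipAll pads the shorter list with None so leftover addresses are kept.
    v6.map(Option(_))
      .zipAll(v4.map(Option(_)), None, None)
      .flatMap { case (six, four) => six.toList ++ four.toList }
  }

  def main(args: Array[String]): Unit = {
    // Literal IPs from the documentation ranges, so no DNS lookup happens.
    val input = List("192.0.2.1", "192.0.2.2", "2001:db8::1").map(InetAddress.getByName)
    println(interleaveByFamily(input).map(_.getHostAddress))
  }
}
```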

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File	Description
NettyConnectionPool.scala	Implements Happy Eyeballs connection algorithm with address sorting, concurrent connection racing, and proper channel lifecycle management
NettyConnectionPoolSpec.scala	Adds comprehensive test suite for address sorting logic covering various IPv4/IPv6 combinations


Copilot · 2026-01-26T17:02:53Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+            _         <- ZIO.raceAll(
+              attemptConnect(addresses.head),
+              addresses.tail.zipWithIndex.map { case (address, index) =>
+                ZIO.sleep(HappyEyeballsDelay * (index + 1).toDouble).raceFirst(failedSub.take).ignore *>


The delay calculation uses (index + 1).toDouble, which means the first tail address waits for 250ms * 1 = 250ms (index 0 becomes 1), then 500ms for the second, etc. This should likely be index.toDouble so the first tail address waits 0ms (starting at the same time as the head), the second waits 250ms, and so on. The current implementation introduces an extra 250ms delay per attempt.

Suggested change

- ZIO.sleep(HappyEyeballsDelay * (index + 1).toDouble).raceFirst(failedSub.take).ignore *>
+ ZIO.sleep(HappyEyeballsDelay * index.toDouble).raceFirst(failedSub.take).ignore *>
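
The arithmetic for the two multipliers can be checked in isolation. The sketch below assumes the 250ms delay quoted in the comment above and uses `scala.concurrent.duration` rather than ZIO durations; `tailDelays` is a hypothetical helper. Note that with `index` as the multiplier, the first tail address gets no delay at all relative to the head.

```scala
import scala.concurrent.duration._

object StaggerDelays {
  // 250ms is the HappyEyeballsDelay value quoted in the review comment.
  val HappyEyeballsDelay: FiniteDuration = 250.millis

  // Hypothetical helper: delay before each tail attempt for a given index offset.
  def tailDelays(tailSize: Int, offset: Int): List[FiniteDuration] =
    List.tabulate(tailSize)(index => HappyEyeballsDelay * (index + offset).toLong)

  def main(args: Array[String]): Unit = {
    println(tailDelays(3, offset = 1).map(_.toMillis)) // List(250, 500, 750)
    println(tailDelays(3, offset = 0).map(_.toMillis)) // List(0, 250, 500)
  }
}
```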

Copilot · 2026-01-26T17:02:54Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+            _         <- ZIO.raceAll(
+              attemptConnect(addresses.head),
+              addresses.tail.zipWithIndex.map { case (address, index) =>
+                ZIO.sleep(HappyEyeballsDelay * (index + 1).toDouble).raceFirst(failedSub.take).ignore *>
+                  attemptConnect(address)
+              },
+            )


The raceAll terminates as soon as the first connection succeeds, but ongoing connection attempts continue running in the background. These lingering fibers could establish connections that are then immediately closed at lines 284-286, creating unnecessary network overhead. Consider interrupting pending connection attempts when the first one succeeds by using fiber-based racing with explicit interruption.

Suggested change

- _ <- ZIO.raceAll(
-   attemptConnect(addresses.head),
-   addresses.tail.zipWithIndex.map { case (address, index) =>
-     ZIO.sleep(HappyEyeballsDelay * (index + 1).toDouble).raceFirst(failedSub.take).ignore *>
-       attemptConnect(address)
-   },
- )
+ // Build all connection attempt effects (first immediately, others with Happy Eyeballs delays)
+ attempts = attemptConnect(addresses.head) ::
+   addresses.tail.zipWithIndex.map { case (address, index) =>
+     ZIO
+       .sleep(HappyEyeballsDelay * (index + 1).toDouble)
+       .raceFirst(failedSub.take)
+       .ignore *>
+       attemptConnect(address)
+   }
+ // Fork each attempt into its own fiber so we can explicitly interrupt the losers
+ fibers <- ZIO.foreach(attempts)(_.fork)
+ // Race on joining the fibers; once one completes, interrupt all others
+ _ <- ZIO
+   .raceAll(
+     fibers.head.join,
+     fibers.tail.map(_.join),
+   )
+   .onExit(_ => ZIO.foreachDiscard(fibers)(_.interrupt.ignore))

Copilot · 2026-01-26T17:02:54Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+
+        channel <- channels.find(_.isOpen) match {
+          case Some(ch) => ZIO.succeed(ch)
+          case None     => ZIO.fail(new RuntimeException("All connection attempts failed"))


The error message 'All connection attempts failed' doesn't provide useful debugging information. Consider including details such as the number of addresses attempted and the target host to help diagnose connection failures.

Suggested change

-          case None     => ZIO.fail(new RuntimeException("All connection attempts failed"))
+          case None =>
+            val attemptedCount = addresses.size
+            val targetDesc     =
+              addresses.headOption
+                .map(addr => s"${addr.getHostName}/${addr.getHostAddress}")
+                .getOrElse("unknown target")
+            ZIO.fail(
+              new RuntimeException(
+                s"All connection attempts failed for $targetDesc (attempted $attemptedCount address(es))",
+              ),
+            )

Copilot · 2026-01-26T17:02:54Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+  )(implicit trace: Trace): ZIO[Scope, Throwable, JChannel] = {
+
+    if (resolvedHosts.isEmpty) {
+      ZIO.fail(new RuntimeException("No addresses to connect to"))


The error message 'No addresses to connect to' lacks context. Consider including the hostname or location that failed to resolve to help with debugging DNS resolution issues.

Suggested change

- ZIO.fail(new RuntimeException("No addresses to connect to"))
+ ZIO.fail(new RuntimeException(s"No addresses to connect to for location: $location"))

kyri-petrou · 2026-01-30T11:17:04Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+        }
+
+        _ <- ZIO.foreachDiscard(channels.filterNot(_ eq channel)) { ch =>
+          ZIO.succeed(ch.close()).ignore


Suggested change

- ZIO.succeed(ch.close()).ignore
+ ZIO.ignore(ch.close())

Done — applied ZIO.ignore(ch.close()) as suggested.

kyri-petrou · 2026-01-30T11:19:50Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+        )
+
+      for {
+        failedHub  <- Hub.dropping[Unit](requestedCapacity = 1)


I might have missed the previous response for this, but isn't a Promise better suited here? Since we only care about fulfilling it once, I think it's a better fit, wdyt?

Good call — switched to Promise.make[Nothing, Unit]. The one-shot semantics are actually what we want here: per RFC 8305 Section 5, a failure should trigger the next staggered attempt to start immediately, but should NOT cancel or affect other running attempts. Promise.succeed(()).unit is idempotent, so subsequent failures are no-ops. This also let me remove the ZIO.scoped wrapper that was only needed for Hub.subscribe.
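
The "first completion wins, later ones are no-ops" behaviour mentioned above can be sketched with the standard library. This uses `scala.concurrent.Promise` purely as a stand-in; it is not ZIO's `Promise`, and `completions` is a hypothetical helper.

```scala
import scala.concurrent.Promise

object OneShotSignal {
  // Hypothetical helper: report, for each simulated failure, whether it
  // was the one that actually completed the shared promise.
  def completions(failures: Int): List[Boolean] = {
    val firstFailure = Promise[Unit]()
    // trySuccess returns true only for the call that completes the promise.
    List.fill(failures)(firstFailure.trySuccess(()))
  }

  def main(args: Array[String]): Unit =
    println(completions(3)) // List(true, false, false)
}
```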

kyri-petrou · 2026-01-30T11:21:40Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+
+      NettyFutureExecutor
+        .executed(channelFuture)
+        .onInterrupt(ZIO.succeed(channelFuture.cancel(true)) *> ZIO.succeed(ch.close()).ignore)


Suggested change

- .onInterrupt(ZIO.succeed(channelFuture.cancel(true)) *> ZIO.succeed(ch.close()).ignore)
+ .onInterrupt(ZIO.ignore { channelFuture.cancel(true); ch.close() })

Done — applied ZIO.ignore { channelFuture.cancel(true); ch.close() } as suggested.

kyri-petrou · 2026-01-30T11:25:22Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+            _         <- ZIO.raceAll(
+              attemptConnect(addresses.head),
+              addresses.tail.zipWithIndex.map { case (address, index) =>
+                ZIO.sleep(HappyEyeballsDelay * (index + 1).toDouble).raceFirst(failedSub.take).ignore *>


Is this the right behaviour according to the spec? In the case that a single connection attempt fails, should all the other ones be cancelled?

Good question — per RFC 8305 Section 5: "Starting a new connection attempt does not affect previous attempts" and "Once one of the connection attempts succeeds, all other connections attempts that have not yet succeeded SHOULD be canceled." So failures should NOT cancel other attempts. The previous Hub-based approach was incorrect because Hub broadcasts to ALL subscribers simultaneously, meaning a single failure would unblock ALL waiting delays at once. Switched to Promise which has one-shot semantics — only the first failure unblocks the next waiting delay, subsequent failures are no-ops. The raceAll handles cancelling losers on success.

kyri-petrou · 2026-01-30T11:31:47Z

zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnectionPool.scala

+    initializer: ChannelInitializer[JChannel],
+    connectionTimeout: Option[Duration],
+    localAddress: Option[InetSocketAddress],
+  )(implicit trace: Trace): ZIO[Scope, Throwable, JChannel] = {


I feel like we need to control interruption more closely here (using uninterruptibleMask) and allow it only during NettyFutureExecutor.executed(channelFuture). Otherwise we're risking interruption creeping in between effect evaluations and leaking connections

Good point — added ZIO.uninterruptibleMask { restore => ... } wrapping the entire connectToAddressUnmanaged body, with restore() applied only to NettyFutureExecutor.executed(channelFuture). This follows the same pattern used in ZioNettyConnectionPool.get and ensures the channel creation + onInterrupt handler registration happen in an uninterruptible region, preventing leaks.

- Extract connect0 helper for code readability
- Fix channels.tail bug: filter by identity, not position
- Use Hub instead of Queue for broadcast failure notification
- Fix resource leak: manually manage connections in race, close losers immediately
- Add connectToAddressUnmanaged for race scenarios
- Fix delay calculation for staggered attempts
- Add comprehensive tests for sortAddresses


987Nabil requested review from jdegoes, kyri-petrou and vigoo as code owners January 1, 2026 11:06

algora-pbc bot added the 🙋 Bounty claim label Jan 1, 2026

987Nabil force-pushed the happy-eyeballs branch from 8bc08ea to fcddfc8 Compare January 17, 2026 07:53

kyri-petrou reviewed Jan 19, 2026


987Nabil requested a review from Copilot January 26, 2026 17:01

Copilot AI reviewed Jan 26, 2026


kyri-petrou reviewed Jan 30, 2026


987Nabil mentioned this pull request Mar 3, 2026

NoRouteToHostException when making a request to https url from a zio-http Client using Scala 3 #2280

Closed

987Nabil and others added 9 commits March 13, 2026 15:36

Implement HappyEyeballs for NettyConnectionPool (zio#1996)

4b80e8e

Update zio-http/jvm/src/main/scala/zio/http/netty/client/NettyConnect…

143392c

…ionPool.scala

Apply suggestions from code review

12f22e1

Co-authored-by: kyri-petrou <67301607+kyri-petrou@users.noreply.github.com>

Protect against race conditions during connection retrieval (zio#1996)

469bca5

fix(netty): use ZIO.ignore style per review

6b8b263

fix(netty): use Promise for failure signaling in Happy Eyeballs per R…

56b2c2f

…FC 8305

fix(netty): add uninterruptibleMask to connectToAddressUnmanaged to p…

a5454d5

…revent connection leaks

style: format code

54df420

987Nabil force-pushed the happy-eyeballs branch from 9864b7f to 54df420 Compare March 13, 2026 15:03

