Failover mechanism is not working when connection is reset by peer on Initiator #402

Open
esanchezros opened this issue Jun 24, 2021 · 1 comment

@esanchezros
Contributor

esanchezros commented Jun 24, 2021

Describe the bug
We have an initiator configured with 2 acceptors, and it connects to them via an stunnel service running locally:

SocketConnectHost=localhost
SocketConnectPort=44445
SocketConnectHost1=localhost
SocketConnectPort1=44446

If the first acceptor is offline, the initiator keeps retrying the first acceptor and never moves on to the failover ones.
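To make the intended ordering concrete, here is a minimal sketch of how the numbered settings above could be read as an ordered failover list. `FailoverEndpoints` is a hypothetical helper written for illustration, not part of QuickFIX/J, and it assumes the settings are available as a plain `Map` and that the numbering is contiguous (no suffix, then 1, 2, ...).

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not a QuickFIX/J class. */
final class FailoverEndpoints {
    private FailoverEndpoints() {}

    /**
     * Reads SocketConnectHost/SocketConnectPort first, then the numbered
     * pairs (SocketConnectHost1/SocketConnectPort1, ...) and stops at the
     * first index where either key is missing.
     */
    static List<InetSocketAddress> parse(Map<String, String> settings) {
        List<InetSocketAddress> endpoints = new ArrayList<>();
        for (int i = 0; ; i++) {
            String suffix = (i == 0) ? "" : Integer.toString(i);
            String host = settings.get("SocketConnectHost" + suffix);
            String port = settings.get("SocketConnectPort" + suffix);
            if (host == null || port == null) {
                break;
            }
            // createUnresolved skips the DNS lookup while parsing.
            endpoints.add(InetSocketAddress.createUnresolved(host, Integer.parseInt(port)));
        }
        return endpoints;
    }
}
```

With the four settings from this report, `parse` returns two entries in order: localhost:44445, then localhost:44446.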

To Reproduce

  • Configure the initiator with 2 acceptors using any IP address and port that accepts TCP traffic.
  • Start the initiator.
  • A connection is attempted with the first acceptor and a logon attempt is made. The connection then times out and is disconnected (see the logs).
.l.a.c.f.a.Application                   : Adding logon details. Username TargetCompID
quickfixj.msg.outgoing                   : FIXT.1.1:SenderCompID/TargetCompID->MAIN: 8=FIXT.1.1|9=112|35=A|34=459|49=SenderCompID|50=TargetCompID|52=20210624-09:29:29.788|56=MAIN|98=0|108=30|553=TargetCompID|554=xxxxxxxx|1137=9|10=158|
quickfixj.event                          : FIXT.1.1:SenderCompID/TargetCompID->MAIN: Initiated logon request
quickfixj.errorEvent                     : FIXT.1.1:SenderCompID/TargetCompID->MAIN: Disconnecting: Socket exception (localhost/127.0.0.1:44445): java.io.IOException: Connection reset by peer
l.a.c.f.m.MonitoringSessionStateListener : onDisconnect: session FIXT.1.1:SenderCompID/TargetCompID->MAIN[in:461,out:460] disconnected
l.a.c.f.m.MonitoringSessionStateListener : onLogout: session FIXT.1.1:SenderCompID/TargetCompID->MAIN[in:461,out:460] logged out
l.a.c.f.m.MonitoringSessionStateListener : onConnect: session FIXT.1.1:SenderCompID/TargetCompID->MAIN[in:461,out:460] connected
quickfixj.event                          : FIXT.1.1:SenderCompID/TargetCompID->MAIN: MINA session created: local=/127.0.0.1:49158, class org.apache.mina.transport.socket.nio.NioSocketSession, remote=localhost/127.0.0.1:44445
.l.a.c.f.a.Application                   : Adding logon details. Username TargetCompID
quickfixj.msg.outgoing                   : FIXT.1.1:SenderCompID/TargetCompID->MAIN: 8=FIXT.1.1|9=112|35=A|34=460|49=SenderCompID|50=TargetCompID|52=20210624-09:29:59.785|56=MAIN|98=0|108=30|553=TargetCompID|554=xxxxxxxx|1137=9|10=150|
quickfixj.event                          : FIXT.1.1:SenderCompID/TargetCompID->MAIN: Initiated logon request
quickfixj.errorEvent                     : FIXT.1.1:SenderCompID/TargetCompID->MAIN: Disconnecting: Socket exception (localhost/127.0.0.1:44445): java.io.IOException: Connection reset by peer
l.a.c.f.m.MonitoringSessionStateListener : onDisconnect: session FIXT.1.1:SenderCompID/TargetCompID->MAIN[in:461,out:461] disconnected
l.a.c.f.m.MonitoringSessionStateListener : onLogout: session FIXT.1.1:SenderCompID/TargetCompID->MAIN[in:461,out:461] logged out

Expected behavior
I would expect the initiator to switch over to the failover acceptors when a socket connection failure happens. The failover mechanism does work as expected when the IP address/hostname is not resolvable.
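The expected behavior can be sketched as a round-robin that advances on any I/O failure, not only on a failed address lookup. In the logs above the MINA session is created before the reset arrives, so the sketch treats "connect and log on" as one step. `RoundRobinFailover` and its `Connector` hook are hypothetical names used for illustration; this is not how QuickFIX/J implements failover.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the expected rotation; not QuickFIX/J code. */
final class RoundRobinFailover {

    /** Hypothetical hook standing in for "open the socket and log on". */
    interface Connector {
        void connectAndLogon(String endpoint) throws IOException;
    }

    private final List<String> endpoints;
    private int next = 0;

    RoundRobinFailover(List<String> endpoints) {
        if (endpoints.isEmpty()) {
            throw new IllegalArgumentException("at least one endpoint is required");
        }
        this.endpoints = new ArrayList<>(endpoints);
    }

    /**
     * Tries each endpoint at most once, starting after the last one tried.
     * Returns the endpoint that accepted the logon, or null if all failed.
     */
    String connectOnce(Connector connector) {
        for (int attempt = 0; attempt < endpoints.size(); attempt++) {
            String endpoint = endpoints.get(next);
            next = (next + 1) % endpoints.size();
            try {
                connector.connectAndLogon(endpoint);
                return endpoint;
            } catch (IOException e) {
                // Any I/O failure, including "Connection reset by peer",
                // moves on to the next configured endpoint.
            }
        }
        return null;
    }
}
```

If the connector throws an `IOException` for localhost:44445 and succeeds for localhost:44446, `connectOnce` returns the second endpoint instead of retrying the first.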

System information:

  • OS: Linux
  • Java 11
  • QFJ Version 2.3.0

Additional context
Is there a way to trigger the failover to the next acceptor programmatically?

@chrjohn
Member

chrjohn commented Jul 9, 2021

Hi @esanchezros, thanks for the report and sorry for the delay.
Unfortunately there is no way to trigger the failover programmatically. This would probably be a sensible enhancement, since the current failover mechanism is not customizable. There were some plans to make it more customizable (e.g. by implementing a custom strategy), but up to now there has been no time to do so.

Cheers,
Chris.

P.S.: #250 is also related. Some months ago there was a discussion about this topic on the mailing list: https://sourceforge.net/p/quickfixj/mailman/quickfixj-users/thread/CAFn6DsGzB%3DeFSPt_B0LAZtG_jq%3Dr9HBTSW17aBv90Jrdo4801Q%40mail.gmail.com/#msg37249310
