IPaddr2: add support to kill dangling IP connections #2076
…ing an ip address
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2076/1/input
heartbeat/IPaddr2
Outdated
    local ss_line=""
    local ss_out_loglevel="info"

    local ipaddr="$1"
This should be the first line of the function.
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2076/2/input
Also fix ss_output variable missing '$'
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2076/3/input
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/resource-agents/job/resource-agents-pipeline/job/PR-2076/4/input
What is the use case for this? Usually the service (which is also run by the cluster) would try to close its connections cleanly before the IPaddr2 resource is stopped, provided you have them in a group or use constraints to start the service after the IP resource. If this is because of connections in TIME_WAIT state in
Correction: it doesn't go against the RFC, as it sends an RST packet to both endpoints, but it seems to be mostly needed to clean up "dead" connections from running processes, which doesn't seem to match the usual use cases for this agent.
The use case is this (and I think there are several cases where this patch can be very useful).
We have two nodes running PostgreSQL servers with streaming replication: one is a hot standby serving read-only queries, and the other serves read/write queries. In front of those servers we run pgbouncer, one process on each node, with the same configuration on both nodes.
When we want to move the IP address for read-only queries from one server to the other, we don't want to stop either pgbouncer or PostgreSQL. We could stop the read-only pgbouncer before moving the IP address, but the problem appears when both IP addresses are on the same server (with only one pgbouncer handling all connections) and we want to move the read-only IP address to the other server: we don't want to stop the pgbouncer that handles both read/write and read-only queries. In that case, if we move the IP address, connections to that address are left dangling on the server and also on the clients.
We have tried to improve Linux's behavior by tuning TCP parameters such as tcp_keepalive, but when we remove the IP address from the server it doesn't work as expected; it depends on the server process. With this patch we are sure that all connections to the IP address are closed before it is deleted, and clients react quickly to the connection reset.
Also, thinking about it another way: there are other use cases where this can be very useful.
I know it's unusual to need this behavior, but I'm also sure this option is going to be very useful for many use cases.
I have seen some cases where deleting an IP address while there are established connections leaves those connections dangling.
This patch tries to avoid those cases.
First it kills connections before deleting the IP address, to inform the clients, and then again after deleting the IP address, to kill connections that may have been initiated between the first kill and the IP deletion.
Another approach could be to kill connections in the function "delete_interface", before "addr delete".
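The kill, delete, kill ordering described above could be sketched roughly as follows. This is only an illustration, not the code from this PR: the function names `kill_ip_connections` and `remove_ip_with_kill` are hypothetical, the sketch only targets TCP sockets, and it assumes an iproute2 `ss` that supports `-K` (IPv6 addresses may need different filter quoting).

```shell
# Hypothetical helper: ask the kernel to destroy TCP sockets whose local
# address is $1. Assumes `ss -K` is available; sockets the kernel cannot
# close are silently skipped by ss.
kill_ip_connections() {
	ipaddr="$1"
	ss -K -n -t src "$ipaddr" >/dev/null 2>&1
	return 0
}

# Hypothetical wrapper showing the ordering only.
# Arguments: IP address, prefix length, interface name.
remove_ip_with_kill() {
	ipaddr="$1"
	prefix="$2"
	iface="$3"

	# First pass: reset established connections so clients notice quickly.
	kill_ip_connections "$ipaddr"

	# Remove the address from the interface.
	ip addr del "$ipaddr/$prefix" dev "$iface" || return 1

	# Second pass: catch connections opened between the first pass
	# and the address deletion.
	kill_ip_connections "$ipaddr"
}
```

A call such as `remove_ip_with_kill 192.0.2.10 24 eth0` (example values) would need root privileges. The second pass is what distinguishes this ordering from simply killing connections inside "delete_interface" before "addr delete".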
To be able to use "ss -K", you need a kernel with config option CONFIG_INET_DIAG_DESTROY set.
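One way to check for that kernel option before relying on "ss -K" might look like the sketch below. The helper names are hypothetical, and the config file locations (`/boot/config-<release>` and `/proc/config.gz`) are assumptions that hold on many distributions but not all; when neither is readable the result is reported as unknown.

```shell
# Hypothetical helper: reads kernel config text on stdin and succeeds
# if CONFIG_INET_DIAG_DESTROY is enabled.
config_has_diag_destroy() {
	grep -q '^CONFIG_INET_DIAG_DESTROY=y'
}

# Hypothetical helper: returns 0 if enabled, 1 if not enabled,
# 2 if no kernel config could be found (assumed locations only).
kernel_supports_ss_kill() {
	cfg="/boot/config-$(uname -r)"
	if [ -r "$cfg" ]; then
		config_has_diag_destroy < "$cfg"
	elif [ -r /proc/config.gz ]; then
		gzip -dc /proc/config.gz | config_has_diag_destroy
	else
		return 2
	fi
}
```

An agent could use a check like this to log a warning instead of assuming the connections were killed, since "ss -K" silently skips sockets the kernel does not support closing.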