PROTON-2931: epoll proactor thread races using async c-ares name reso…#444
PROTON-2931: epoll proactor thread races using async c-ares name reso…#444cliffjansen wants to merge 1 commit intoapache:mainfrom
Conversation
astitcher
left a comment
There was a problem hiding this comment.
Some small , but important issues to address, and some cosmetic, but imo significant issues to address.
| static void connection_done_cb(void *user_data, struct addrinfo *ai, int gai_error) { | ||
| pconnection_t *pc = (pconnection_t *)user_data; | ||
| lock(&pc->task.mutex); | ||
| pc->name_lookup_pending = false; |
There was a problem hiding this comment.
If this is within the lock, maybe it should be in the _lh function not here? Especially as setting the flag is done in a _lh function.
| bool notify = false; | ||
| if (gai_error) { | ||
|
|
||
| if (pconnection_rclosed(pc) && pconnection_wclosed(pc)) { |
There was a problem hiding this comment.
Is it possible to only have one of these set in this condition?
| if (pn_condition_is_set(pn_transport_condition(tp))) | ||
| return false; | ||
| } | ||
| return !pc->queued_disconnect && !pni_task_wake_pending(&pc->task); |
There was a problem hiding this comment.
This is pretty obscure compared to the previous return value (which would essentially be true at his point I think if there wer no errors from starting the lookup), perhaps an explanation in a comment is in order.
| /* Called on initial connect, and if connection fails to try another address */ | ||
| /* May be called within the praw_connection task or from an external name_lookup task */ | ||
| static void praw_connection_maybe_connect_lh(praw_connection_t *prc) { | ||
| int err = 0; |
There was a problem hiding this comment.
Move this to just before the while loop where it belongs, also no need to set it to 0 here - it is initialised first thing in the loop.
| static void raw_connection_done_cb(void *user_data, struct addrinfo *ai, int gai_error) { | ||
| praw_connection_t *prc = (praw_connection_t *)user_data; | ||
| lock(&prc->task.mutex); | ||
| prc->name_lookup_pending = false; |
There was a problem hiding this comment.
As above shouldn't this be inside the _lh function? Even though there is no _lh function anymore for lookup start here.
| shutdown(prc->psocket.epoll_io.fd, SHUT_RDWR); | ||
| return; | ||
| } | ||
| if (prc->name_lookup_pending) { |
There was a problem hiding this comment.
What about the protecting lock?
| const char *host; | ||
| const char *port; | ||
| pn_proactor_t *p = prc->task.proactor; | ||
| bool notify = false; |
There was a problem hiding this comment.
Move this down to where it's used - I think inside the if (!rc) block
| const char *host; | ||
| const char *port; |
There was a problem hiding this comment.
Actually these lines should also be just before they are used in (new) line 253
| if (wake_event) { | ||
| lock(&rc->task.mutex); | ||
| t->working = true; | ||
| return &rc->batch; |
There was a problem hiding this comment.
It looks wrong to return without unlocking here - if it actually correct then a comment explaining why is needed.
fixes router_locust zombie connections