Skip to content

PROTON-2931: epoll proactor thread races using async c-ares name reso…#444

Open
cliffjansen wants to merge 1 commit intoapache:mainfrom
cliffjansen:pn2931_2
Open

PROTON-2931: epoll proactor thread races using async c-ares name reso…#444
cliffjansen wants to merge 1 commit intoapache:mainfrom
cliffjansen:pn2931_2

Conversation

@cliffjansen
Copy link
Copy Markdown
Contributor

fixes router_locust zombie connections

Copy link
Copy Markdown
Member

@astitcher astitcher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small , but important issues to address, and some cosmetic, but imo significant issues to address.

Comment thread c/src/proactor/epoll.c
static void connection_done_cb(void *user_data, struct addrinfo *ai, int gai_error) {
pconnection_t *pc = (pconnection_t *)user_data;
lock(&pc->task.mutex);
pc->name_lookup_pending = false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is within the lock, maybe it should be in the _lh function not here? Especially as setting the flag is done in a _lh function.

Comment thread c/src/proactor/epoll.c
bool notify = false;
if (gai_error) {

if (pconnection_rclosed(pc) && pconnection_wclosed(pc)) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to only have one of these set in this condition?

Comment thread c/src/proactor/epoll.c
if (pn_condition_is_set(pn_transport_condition(tp)))
return false;
}
return !pc->queued_disconnect && !pni_task_wake_pending(&pc->task);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty obscure compared to the previous return value (which would essentially be true at his point I think if there wer no errors from starting the lookup), perhaps an explanation in a comment is in order.

/* Called on initial connect, and if connection fails to try another address */
/* May be called within the praw_connection task or from an external name_lookup task */
static void praw_connection_maybe_connect_lh(praw_connection_t *prc) {
int err = 0;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to just before the while loop where it belongs, also no need to set it to 0 here - it is initialised first thing in the loop.

static void raw_connection_done_cb(void *user_data, struct addrinfo *ai, int gai_error) {
praw_connection_t *prc = (praw_connection_t *)user_data;
lock(&prc->task.mutex);
prc->name_lookup_pending = false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above shouldn't this be inside the _lh function? Even though there is no _lh function anymore for lookup start here.

shutdown(prc->psocket.epoll_io.fd, SHUT_RDWR);
return;
}
if (prc->name_lookup_pending) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the protecting lock?

const char *host;
const char *port;
pn_proactor_t *p = prc->task.proactor;
bool notify = false;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this down to where it's used - I think inside the if (!rc) block

Comment on lines 246 to 247
const char *host;
const char *port;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually these lines should also be just before they are used in (new) line 253

if (wake_event) {
lock(&rc->task.mutex);
t->working = true;
return &rc->batch;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks wrong to return without unlocking here - if it actually correct then a comment explaining why is needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants