ws4py remarks #20
Do you have any idea why you are seeing all those connection timeouts? I wonder why that happens - perhaps it is the TCP accept() backlog default of 128 which is causing trouble here. If the server falls behind on the backlog even once in a while, connection timeouts increase wildly.
I'll admit, I've never loaded ws4py that much so these are only guesses, especially since I don't usually run the gevent implementation but rather the CherryPy/good ol' threads server. However, you are probably right, the backlog likely fills up quickly and I would definitely increase it. Skimming through gevent's code, the backlog seems to default to 50 on a stream server. I would really need to profile ws4py to understand where it spends most of its time. I know that the (un)masking is actually heavy on the process all things considered, but here the data sent is so tiny it shouldn't hurt the results. Looking at the reports I've linked above, I'd also be very interested to see the benchmark executed with PyPy. I have no doubt ws4py could do better if I could find the time to work on it more.
For Python it can't be GC unless it is the cycle detector blocking the VM. So my guess is that the system can't keep up with the load, overflows the backlog queue, and then stuff begins timing out. Increasing the queue will stop timeouts, but it will then also make latencies worse all over the place.
I would really like to figure out why these servers drop connections like they do, if for little else than to determine what the optimal configuration for each platform is.
You might want to start increasing the socket backlog. In your ws4py runner, just add backlog=XYZ to the WebSocketServer(...) call:

    server = WebSocketServer(('', 8000), backlog=128, websocket_class=EchoServer)
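For reference, a minimal runnable sketch of that suggestion, assuming the gevent-based server API ws4py shipped at the time; EchoWebSocket is ws4py's stock echo handler, standing in for the benchmark's EchoServer class:

    from ws4py.server.geventserver import WebSocketServer
    from ws4py.websocket import EchoWebSocket

    if __name__ == '__main__':
        # backlog= is handed to the underlying listen socket; note the kernel
        # still caps it at net.core.somaxconn (see the next comment).
        server = WebSocketServer(('', 8000), backlog=128,
                                 websocket_class=EchoWebSocket)
        server.serve_forever()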
I really think you should increase the net.core.somaxconn parameter of your setup; this could be the cause of the timeouts. It would also be nice to check your syslog to verify it isn't sending TCP SYN cookies; syncookies can cause timeouts and disconnections in benchmarks like this.
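A quick sanity check on a Linux host (the procfs path below is Linux-specific): the kernel silently truncates any listen() backlog above net.core.somaxconn, so it is worth reading the cap before benchmarking:

    # Read the kernel's cap on listen() backlogs from procfs.
    with open('/proc/sys/net/core/somaxconn') as f:
        print('net.core.somaxconn =', int(f.read()))
    # To raise it for the benchmark (as root): sysctl -w net.core.somaxconn=1024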
@perone Excellent, I'll try that soon. It's not hard to get the timeouts on the other platforms; they occur very early in the test. I'll create a gist of the syslog for you to look at as well.
I get this in the syslog on the server:
@jlouis What puzzles me about the timeouts is that Erlang seems to be immune to them while the others are not. In fact, in the most recent benchmark, Go hit 10,000 clients as well. I haven't had a chance to summarize the event data yet, but the meminfo files have the connection counts for each server: https://github.com/ericmoritz/wsdemo/tree/eleveldb-logging/results Wouldn't an untuned TCP stack affect all the servers equally?
You can set the backlog when you open the listen socket, which is one thing to bear in mind. Another point is that if your Erlang code has plenty of available processes waiting in the accept state, then there is no backlog introduced at all, since there is always an accepting process to pair the incoming connection with. I bet your code spawns a new accepting process and that this process calls gen_tcp:accept(LSock) fairly quickly, thus establishing a 0-backlog scenario. Say you start with 1000 of these processes; then your backlog is, practically, nonexistent until all of those acceptors are busy. This also gives a plausible explanation as to why the behaviour is different. But you should really check my hypothesis by reading the code :)
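For illustration only, a rough Python analogue of that acceptor-pool idea (hypothetical code, not taken from the benchmark): several threads block in accept() on the same listening socket, so incoming connections are paired off immediately instead of queuing in the kernel backlog.

    import socket
    import threading

    def acceptor(lsock):
        while True:
            conn, addr = lsock.accept()  # paired off immediately, no queueing
            conn.close()                 # a real server would hand off here

    lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    lsock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    lsock.bind(('', 8000))
    lsock.listen(128)

    # A pool of pre-spawned acceptors mimics Erlang's pool of processes
    # sitting in gen_tcp:accept/1.
    for _ in range(50):
        threading.Thread(target=acceptor, args=(lsock,), daemon=True).start()

    threading.Event().wait()  # keep the main thread alive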
Keep in mind that the kernel is not triggering this timeout; it is the TCP connect in my Erlang client. I set the connection timeout to 2 seconds to determine whether the server was unavailable. I had no way to tell if a small number of successful clients was due to an error on my part or the server becoming unavailable. There is no timeout once the TCP connection has been accepted.
Ah! So it is a question of semantics then. The problem, to the best of my knowledge, is that the server can't keep up within the 2-second timeframe. This means it answers at some value above 2000 ms, but at that point the client has already registered the connection as lost. If we graph the kernel density of response times, we can see whether that is the case.
Do you have enough data to graph the kernel density?
More than enough! The problem is that I have too much :)
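A minimal sketch of the kernel-density plot proposed above, assuming the response times have been exported as one value in milliseconds per line in a text file (the file name is a placeholder):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.stats import gaussian_kde

    times = np.loadtxt('response_times_ms.txt')  # hypothetical export
    kde = gaussian_kde(times)
    xs = np.linspace(times.min(), times.max(), 500)

    plt.plot(xs, kde(xs))
    plt.axvline(2000, linestyle='--')  # the client's 2 s connect timeout
    plt.xlabel('response time (ms)')
    plt.ylabel('density')
    plt.savefig('response_kde.png')

A visible mass at or beyond the 2000 ms line would support the hypothesis that connections were answered, just too late.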
@Lawouach I'm trying to run ws4py using PyPy. How did you run it? Did you use CherryPy or gevent?
Yes I did. Gevent doesn't run on PyPy IIRC. I used CherryPy 3.2.2 and PyPy 1.8.
Someone just submitted code to run Tornado under PyPy. If you're still curious I'll write an implementation using ws4py and CherryPy. I also wonder if anyone has written a WebSocket server or an HTTP server using PyPy's native greenlets module. Perhaps gunicorn?
Not that I'm aware of, but that'd be interesting indeed. Regarding CherryPy and ws4py, you may simply use this code: https://github.com/Lawouach/WebSocket-for-Python/blob/master/test/autobahn_test_servers.py#L4 That worked just fine with CP 3.2.2 and PyPy 1.8 (I didn't try with more recent releases). You may want to remove the two lines about logging (l28/29), which are not relevant to the test. You may also want to add the following setting to cherrypy.config.update(...):

    'server.thread_pool': 128
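For convenience, a minimal sketch of that setup, following ws4py's documented CherryPy integration; the host, port, mount path, and thread-pool size are assumptions for illustration:

    import cherrypy
    from ws4py.server.cherrypyserver import WebSocketPlugin, WebSocketTool
    from ws4py.websocket import EchoWebSocket

    cherrypy.config.update({'server.socket_host': '0.0.0.0',
                            'server.socket_port': 8000,
                            'server.thread_pool': 128})
    WebSocketPlugin(cherrypy.engine).subscribe()
    cherrypy.tools.websocket = WebSocketTool()

    class Root(object):
        @cherrypy.expose
        def index(self):
            pass  # the tool upgrades the request to a WebSocket before this runs

    cherrypy.quickstart(Root(), '/', config={
        '/': {'tools.websocket.on': True,
              'tools.websocket.handler_cls': EchoWebSocket}})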
I guess it'd be better if I could submit a pull request for it, but I won't have the time before tomorrow or even this weekend unfortunately :/
I'll write up a simple server and submit a pull request that you can take a glance at. I wrote one yesterday based on your echo server but I think I deleted it. I know it didn't take very long.
Hi there, ws4py's author here.
Thanks for the benchmarks; though there can always be some concerns over their design, environment and execution, I find them useful and interesting nonetheless.
Just a couple of remarks for posterity:
Thanks,