Shared memory performance loss compared with standalone iceoryx #2155
Hi @OscarMrZ, thank you for the detailed information. I know it took a while to respond, but that's because it took a while to get everything in place to reproduce it. (I am using macOS, so I can't trivially reproduce it.) At least I think this counts as a reproduction 😀. Just from the general shape and numbers, it has to be (de)serialization. The flame graph we got fits with that. This is with Cyclone 0.9; the current master should definitely do better, and I think 0.10 will also be better. Now that we have everything set up, the next step is to have a look at whether it got fixed already or whether there's a bug in there. More to come.
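As a rough illustration of why a (de)serialization step would dominate at this payload size, here is a minimal Python sketch. It is not Cyclone DDS or iceoryx code; the function names `take_by_copy` and `take_by_view` are hypothetical. It only contrasts handing the reader a full copy of an 8 MiB buffer with handing it a view of the same memory, which is roughly the difference between a copying path and a loaned sample.

```python
import time

# Hypothetical stand-in for one 8 MiB sample sitting in shared memory.
PAYLOAD = bytearray(8 * 1024 * 1024)


def take_by_copy(buf):
    """Copying path: the reader gets its own copy, so cost grows with payload size."""
    return bytes(buf)


def take_by_view(buf):
    """Loan-like path: the reader gets a view of the same memory, no bytes are copied."""
    return memoryview(buf)


def average_seconds(fn, buf, repeats=20):
    """Average wall-clock time of fn(buf) over a number of repeats."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(buf)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    print(f"copy: {average_seconds(take_by_copy, PAYLOAD) * 1e3:.3f} ms per take")
    print(f"view: {average_seconds(take_by_view, PAYLOAD) * 1e3:.3f} ms per take")
```

The absolute numbers depend on the machine, so none are quoted here. The point is only that the copying path does work proportional to the payload for every subscriber, while the view does not.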
We tested, and I was overly optimistic about the 0.10 version: it still has the old code for taking samples that doesn't properly support loans. In The API is the same in principle, but we deprecated If you replace the
it'll work. Not super pretty, it is only a quick hack. (And to be honest, I hacked the quick hack we used without trying out the hacked hack first, so ... 🤞) The Cyclone DDS RMW layer for ROS 2 needs updating for
Hello @eboasson, thank you for this amazing response! I'll try your patch for the performance tool and try to replicate. That release you mention from master would definitely be amazing; I'll play around.
Hello everyone,
I’ve been testing the performance of CycloneDDS with shared memory (SHM) enabled to demonstrate the performance improvements it should provide.
Using the APEX performance test package and this dockerfile they provide, I ran the following test:
This test measures the average latency between a publisher sending an 8 MB payload at 30 Hz and an increasing number of subscribers, for the three transports available in Cyclone: UDP (copy), SHM, and SHM with loaned samples. The QoS settings are the ones specified in the YAML.
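For a sense of scale, the data rate implied by this setup can be worked out with a short Python sketch. It assumes that "8MB" means 8 × 1024 × 1024 bytes and that a copying transport moves the full payload once per subscriber; `copies_per_delivery` is a hypothetical knob for extra serialize/deserialize copies, not a measured value.

```python
# Back-of-envelope data rates for an 8 MB payload published at 30 Hz.
# Assumption: 8 MB is taken as 8 * 1024 * 1024 bytes.
PAYLOAD_BYTES = 8 * 1024 * 1024
RATE_HZ = 30


def copied_bytes_per_second(num_subscribers, copies_per_delivery=1):
    """Bytes per second that must be copied if every delivery copies the payload."""
    return PAYLOAD_BYTES * RATE_HZ * num_subscribers * copies_per_delivery


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        mib_per_s = copied_bytes_per_second(n) / (1024 * 1024)
        print(f"{n} subscriber(s): {mib_per_s:.0f} MiB/s copied")
```

With one subscriber and one copy per delivery this comes to 240 MiB/s, and it grows linearly with the number of subscribers. With loaned samples the payload itself would ideally not be copied at all, which is why a flat latency curve is the expectation there.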
These are the results:
I anticipated a substantial performance boost with SHM enabled, regardless of message size or the number of subscribers. However, from the previous plot we can see that:
For reference, these are the results of the same test with standalone iceoryx, using this dockerfile as a testing environment. In this case, only the latency for LOANED transport was measured as the other two are not supported.
Both experiments were run with the RouDi mempools config provided here.
Notice that while the latency seems to increase, it is on the order of hundredths of a millisecond and the differences are minimal. While I could understand the latency increasing slightly due to the increased management needed for the subscriber queues, the difference in magnitude escapes my understanding.
To the best of my knowledge, when enabling shared memory in Cyclone, the behavior should be very similar to the one in the second graph, as it uses iceoryx behind the scenes. Are you guys aware of any reason this could not be the case?
Many thanks in advance!