-
Notifications
You must be signed in to change notification settings - Fork 418
Gracefully handle time jumps #2255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gracefully handle time jumps #2255
Conversation
f9a4f4d to
95a4b44
Compare
dpotman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! A few minor things:
- the test code for the C++ API is using
ddsrt_cond_waitfor, so that needs an update (or a wrapper for this function in DDSRT?) sync.cfor FreeRTOS still hasddsrt_cond_waitfor
And at some places (e.g. waitset, xeventq, writer throttling) it may be helpful to add a comment with the rationale for using mtime_t or etime_t (although technically it makes no difference in many cases)
src/ddsrt/CMakeLists.txt
Outdated
| add_library(ddsrt INTERFACE) | ||
| if(ENABLE_MIMALLOC) | ||
| target_link_libraries(ddsrt INTERFACE mimalloc) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a change from #2258 accidentally was included in src/ddsrt/CMakeLists.txt?
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
This is a IDL file used for testing these combinations, so disabling the warning is reasonable. Signed-off-by: Erik Boasson <[email protected]>
The only call site used an actual size larger than sizeof(char *) so this only resulted in truncation in an error path. Signed-off-by: Erik Boasson <[email protected]>
This splits out the generic ddsrt_cond_t into 4 different ones: - ddsrt_cond_t: no associated clock, can't be used with timeouts - ddsrt_cond_wctime_t: associated with ddsrt_wctime_t - ddsrt_cond_mtime_t: associated with ddsrt_mtime_t - ddsrt_cond_etime_t: associated with ddsrt_etime_t And removes the relative timeouts from the internal interface (this actually simplifies the code in several cases). For reference: - wctime is wall-clock time, a.k.a. CLOCK_REALTIME, system time, default clock, ... - mtime_t is a monotonic clock, ticking only when the machine is running (not suspended) - etime_t is a monotonic clock, ticking also when the machine is suspended Not all platforms provide all three, and perhaps on some platforms all three are present but not actually used by Cyclone DDS. Platforms like macOS where a condition variable can't be associated with a clock necessarily have to fake it by doing an approximate conversion of the timeout to the system time. Note that the generic dds_time_t has the same clock source as ddsrt_wctime_t. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Some platforms (QNX is an example) have low-resolution clocks that one can use for timeouts, and high-resolution clocks that cannot be used for that. This introduces a separate "high resolution" time that can be used only for computing time elapsed between two events. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
There is currently an assumption in the Cyclone serializers that an enum maps to an (unsigned) int and than an int is 32-bits. Because of that, printing an enum with a format string of PRIu32 will work fine, but there should be a cast to uint32_t nonetheless. Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
Signed-off-by: Erik Boasson <[email protected]>
95a4b44 to
3265dd8
Compare
noxpardalis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new changes look good to me and address @dpotman’s points. All tests also run fine locally.
I also took a gander at the associated PR for the C++ repo: eclipse-cyclonedds/cyclonedds-cxx#546 and it looks fine. I can do a full review once this is merged and the noise from the CI failures on the C++ repo disappear on a re-run.
Thank you! |
The main commit in this PR allows blocking with timeouts based on a monotonic clock rather than the system clock. This should address the various problems with time jumps people have been facing. I haven't actually tested it with time jumps yet because of some practicalities, but perhaps some kind souls are willing to give it a go as it is.
Edit: I accidentally pushed some more onto this branch, but it fits well enough with the subject: some platforms have low-resolution clocks that interact with the scheduler but can provide a high-resolution clock that is unrelated to scheduling. Using the latter for benchmarking seems sensible.
The other commits are just housekeeping and fixing some trivialities.
Fixes #1995 ros2/rmw_cyclonedds#276