Skip to content

Conversation

@eboasson
Copy link
Contributor

@eboasson eboasson commented Jul 29, 2025

The main commit in this PR allows blocking with timeouts based on a monotonic clock rather than the system clock. This should address the various problems with time jumps people have been facing. I haven't actually tested it with time jumps yet because of some practicalities, but perhaps some kind souls are willing to give it a go as it is.

Edit: I accidentally pushed some more onto this branch, but it fits well enough with the subject: some platforms have low-resolution clocks that interact with the scheduler but can provide a high-resolution clock that is unrelated to scheduling. Using the latter for benchmarking seems sensible.

The other commits are just housekeeping and fixing some trivialities.

Fixes #1995 ros2/rmw_cyclonedds#276

Copy link
Contributor

@dpotman dpotman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! A few minor things:

  • the test code for the C++ API is using ddsrt_cond_waitfor, so that needs an update (or a wrapper for this function in DDSRT?)
  • sync.c for FreeRTOS still has ddsrt_cond_waitfor

And at some places (e.g. waitset, xeventq, writer throttling) it may be helpful to add a comment with the rationale for using mtime_t or etime_t (although technically it makes no difference in many cases)

add_library(ddsrt INTERFACE)
if(ENABLE_MIMALLOC)
target_link_libraries(ddsrt INTERFACE mimalloc)
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a change from #2258 accidentally was included in src/ddsrt/CMakeLists.txt?

eboasson added 14 commits August 1, 2025 13:29
This is a IDL file used for testing these combinations, so disabling the warning is
reasonable.

Signed-off-by: Erik Boasson <[email protected]>
The only call site used an actual size larger than sizeof(char *) so this only resulted in
truncation in an error path.

Signed-off-by: Erik Boasson <[email protected]>
This splits out the generic ddsrt_cond_t into 4 different ones:

- ddsrt_cond_t: no associated clock, can't be used with timeouts
- ddsrt_cond_wctime_t: associated with ddsrt_wctime_t
- ddsrt_cond_mtime_t: associated with ddsrt_mtime_t
- ddsrt_cond_etime_t: associated with ddsrt_etime_t

And removes the relative timeouts from the internal interface (this actually simplifies
the code in several cases). For reference:

- wctime is wall-clock time, a.k.a. CLOCK_REALTIME, system time, default clock, ...
- mtime_t is a monotonic clock, ticking only when the machine is running (not suspended)
- etime_t is a monotonic clock, ticking also when the machine is suspended

Not all platforms provide all three, and perhaps on some platforms all three are present
but not actually used by Cyclone DDS. Platforms like macOS where a condition variable
can't be associated with a clock necessarily have to fake it by doing an approximate
conversion of the timeout to the system time.

Note that the generic dds_time_t has the same clock source as ddsrt_wctime_t.

Signed-off-by: Erik Boasson <[email protected]>
Some platforms (QNX is an example) have low-resolution clocks that one can use for
timeouts, and high-resolution clocks that cannot be used for that. This introduces a
separate "high resolution" time that can be used only for computing time elapsed between
two events.

Signed-off-by: Erik Boasson <[email protected]>
There is currently an assumption in the Cyclone serializers that an enum maps to
an (unsigned) int and than an int is 32-bits. Because of that, printing an enum with a
format string of PRIu32 will work fine, but there should be a cast to uint32_t
nonetheless.

Signed-off-by: Erik Boasson <[email protected]>
Copy link
Contributor

@noxpardalis noxpardalis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new changes look good to me and address @dpotman’s points. All tests also run fine locally.

I also took a gander at the associated PR for the C++ repo: eclipse-cyclonedds/cyclonedds-cxx#546 and it looks fine. I can do a full review once this is merged and the noise from the CI failures on the C++ repo disappear on a re-run.

@eboasson
Copy link
Contributor Author

eboasson commented Aug 5, 2025

The new changes look good to me and address @dpotman’s points. All tests also run fine locally.

I also took a gander at the associated PR for the C++ repo: eclipse-cyclonedds/cyclonedds-cxx#546 and it looks fine. I can do a full review once this is merged and the noise from the CI failures on the C++ repo disappear on a re-run.

Thank you!

@eboasson eboasson merged commit 49c313a into eclipse-cyclonedds:master Aug 5, 2025
14 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Event thread is freezed when time is changed to past

3 participants