Optimize QoS to improve responsiveness of reliable endpoints #26

Merged 2 commits on Aug 1, 2023
12 changes: 12 additions & 0 deletions README.md
@@ -151,6 +151,7 @@ variables.
- [RMW_CONNEXT_CYCLONE_COMPATIBILITY_MODE](#RMW_CONNEXT_CYCLONE_COMPATIBILITY_MODE)
- [RMW_CONNEXT_DISABLE_LARGE_DATA_OPTIMIZATIONS](#RMW_CONNEXT_DISABLE_LARGE_DATA_OPTIMIZATIONS)
- [RMW_CONNEXT_DISABLE_FAST_ENDPOINT_DISCOVERY](#RMW_CONNEXT_DISABLE_FAST_ENDPOINT_DISCOVERY)
- [RMW_CONNEXT_DISABLE_RELIABILITY_OPTIMIZATIONS](#RMW_CONNEXT_DISABLE_RELIABILITY_OPTIMIZATIONS)
- [RMW_CONNEXT_ENDPOINT_QOS_OVERRIDE_POLICY](#RMW_CONNEXT_ENDPOINT_QOS_OVERRIDE_POLICY)
- [RMW_CONNEXT_INITIAL_PEERS](#RMW_CONNEXT_INITIAL_PEERS)
- [RMW_CONNEXT_LEGACY_RMW_COMPATIBILITY_MODE](#RMW_CONNEXT_LEGACY_RMW_COMPATIBILITY_MODE)
@@ -207,6 +208,17 @@ Variable `RMW_CONNEXT_DISABLE_FAST_ENDPOINT_DISCOVERY` may be used to disable
these automatic optimizations, and to leave the DomainParticipant's QoS to
its defaults.

### RMW_CONNEXT_DISABLE_RELIABILITY_OPTIMIZATIONS

By default, `rmw_connextdds` will modify the QoS of each reliable DataWriter
and DataReader to improve the responsiveness of the RTPS [reliability protocol](https://community.rti.com/static/documentation/connext-dds/6.0.1/doc/manuals/connext_dds/html_files/RTI_ConnextDDS_CoreLibraries_UsersManual/Content/UsersManual/Using_QosPolicies_to_Tune_the_Reliable_P.htm?tocpath=Part%203%3A%20Advanced%20Concepts%7C11.%20Reliable%20Communications%7C11.3%20Using%20QosPolicies%20to%20Tune%20the%20Reliable%20Protocol%7C_____0#reliable_1394042328_776265).

For example, the ["heartbeat period"](https://community.rti.com/static/documentation/connext-dds/6.0.1/doc/manuals/connext_dds/html_files/RTI_ConnextDDS_CoreLibraries_UsersManual/Content/UsersManual/Controlling_Heartbeats_and_Retries.htm#reliable_1394042328_785637)
is sped up from 3 seconds to 200 milliseconds.

These optimizations may be disabled by setting environment variable
`RMW_CONNEXT_DISABLE_RELIABILITY_OPTIMIZATIONS` to any non-empty value.
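
The check added to `rmw_api_connextdds_init()` in this change keeps the optimizations enabled only while the variable's value is empty, so any non-empty value (including `0` or `false`) disables them. The following is a minimal sketch of that rule, not the project's code: the helper names are hypothetical, and it uses `std::getenv` (which returns a null pointer for an unset variable) in place of `rcutils_get_env`.

```cpp
#include <cstdlib>

// Hypothetical helper, not part of rmw_connextdds: mirrors the rule
// `'\0' == value[0]` used by the init code in this change. An unset
// variable (null pointer from std::getenv) is treated like an empty one.
bool reliability_optimizations_enabled(const char * env_value)
{
  return nullptr == env_value || '\0' == env_value[0];
}

// Hypothetical convenience wrapper that reads the variable from the
// process environment.
bool reliability_optimizations_enabled_from_environment()
{
  return reliability_optimizations_enabled(
    std::getenv("RMW_CONNEXT_DISABLE_RELIABILITY_OPTIMIZATIONS"));
}
```

Under this rule, exporting `RMW_CONNEXT_DISABLE_RELIABILITY_OPTIMIZATIONS=0` still disables the optimizations, because only the emptiness of the value is tested.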

### RMW_CONNEXT_ENDPOINT_QOS_OVERRIDE_POLICY

When this variable is not set or set to `always`, the QoS settings specified in
3 changes: 3 additions & 0 deletions rmw_connextdds_common/include/rmw_connextdds/context.hpp
@@ -100,6 +100,9 @@ struct rmw_context_impl_s
#if RMW_CONNEXT_DEFAULT_LARGE_DATA_OPTIMIZATIONS
  bool optimize_large_data{true};
#endif /* RMW_CONNEXT_DEFAULT_LARGE_DATA_OPTIMIZATIONS */
#if RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS
  bool optimize_reliability{true};
#endif /* RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS */

  enum class participant_qos_override_policy_t
  {
12 changes: 12 additions & 0 deletions rmw_connextdds_common/include/rmw_connextdds/static_config.hpp
@@ -85,6 +85,11 @@
  "RMW_CONNEXT_DISABLE_LARGE_DATA_OPTIMIZATIONS"
#endif /* RMW_CONNEXT_ENV_DISABLE_LARGE_DATA_OPTIMIZATIONS */

#ifndef RMW_CONNEXT_ENV_DISABLE_RELIABILITY_OPTIMIZATIONS
#define RMW_CONNEXT_ENV_DISABLE_RELIABILITY_OPTIMIZATIONS \
  "RMW_CONNEXT_DISABLE_RELIABILITY_OPTIMIZATIONS"
#endif /* RMW_CONNEXT_ENV_DISABLE_RELIABILITY_OPTIMIZATIONS */

// TODO(security-wg): These are intended to be temporary, and need to be
// refactored into a proper abstraction.
#ifndef RMW_CONNEXT_ENV_SECURITY_LOG_FILE
@@ -226,6 +231,13 @@
#define RMW_CONNEXT_TYPE_OBJECT_MAX_SERIALIZED_SIZE 65000
#endif /* RMW_CONNEXT_TYPE_OBJECT_MAX_SERIALIZED_SIZE */

/******************************************************************************
* Customize the RTPS reliability protocol to speed up its responsiveness.
******************************************************************************/
#ifndef RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS
#define RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS 1
#endif /* RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS */

/******************************************************************************
* Automatically tune DataWriterQos to better handle reliable "large data".
******************************************************************************/
19 changes: 19 additions & 0 deletions rmw_connextdds_common/src/common/rmw_context.cpp
@@ -1281,6 +1281,25 @@ rmw_api_connextdds_init(
    RMW_CONNEXT_LOG_DEBUG_A("initial DDS peers: %s", initial_peers)
  }

#if RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS
  // Check if we should disable the optimizations for the RTPS reliability protocol
  const char * disable_optimize_reliability_env = nullptr;
  lookup_rc = rcutils_get_env(
    RMW_CONNEXT_ENV_DISABLE_RELIABILITY_OPTIMIZATIONS,
    &disable_optimize_reliability_env);

  if (nullptr != lookup_rc || nullptr == disable_optimize_reliability_env) {
    RMW_CONNEXT_LOG_ERROR_A_SET(
      "failed to lookup from environment: "
      "var=%s, "
      "rc=%s ",
      RMW_CONNEXT_ENV_DISABLE_RELIABILITY_OPTIMIZATIONS,
      lookup_rc)
    return RMW_RET_ERROR;
  }
  ctx->optimize_reliability = '\0' == disable_optimize_reliability_env[0];
#endif /* RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS */

  if (nullptr == RMW_Connext_gv_DomainParticipantFactory) {
    RMW_CONNEXT_ASSERT(1 == RMW_Connext_gv_ContextCount)
    RMW_CONNEXT_LOG_DEBUG("initializing DDS DomainParticipantFactory")
37 changes: 37 additions & 0 deletions rmw_connextdds_common/src/ndds/dds_api_ndds.cpp
@@ -501,6 +501,32 @@ rmw_connextdds_get_datawriter_qos(
    qos->publish_mode.kind = DDS_ASYNCHRONOUS_PUBLISH_MODE_QOS;
  }

#if RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS
  // The default settings for the RTPS reliability protocol are not very
  // responsive, and they cause some unit tests to fail. These optimizations
  // have been derived from profile `Optimization.ReliabilityProtocol.Common`
  // available in Connext 6+. `Generic.StrictReliable` is the equivalent
  // profile in 5.3.1. Changes are limited to `DDS_RtpsReliableWriterProtocol_t`.
  if (ctx->optimize_reliability) {
    // All write() calls will block (for at most max_blocking_time) once the send_window
    // is filled with samples that haven't yet been acknowledged by all active readers.
    qos->protocol.rtps_reliable_writer.min_send_window_size = 40;
    qos->protocol.rtps_reliable_writer.max_send_window_size = 40;  // fixed size window
    qos->protocol.rtps_reliable_writer.heartbeats_per_max_samples = 10;  // 1 every 4
    qos->protocol.rtps_reliable_writer.heartbeat_period = {0, 200000000};  // 200ms
    qos->protocol.rtps_reliable_writer.late_joiner_heartbeat_period = {0, 20000000};  // 20ms
    qos->protocol.rtps_reliable_writer.fast_heartbeat_period = {0, 20000000};  // 20ms
    qos->protocol.rtps_reliable_writer.max_heartbeat_retries = 500;  // 10s @ 50hz
    // Force the writer to reply immediately to ACKNACKs received from a reader.
    qos->protocol.rtps_reliable_writer.max_nack_response_delay = DDS_DURATION_ZERO;
    // When the number of unack'd samples reaches the high_watermark the fast_heartbeat_period
    // is used. When the number dips below the low_watermark, the heartbeat_period is used.
    // These numbers are tied to the send_window size.
    qos->protocol.rtps_reliable_writer.high_watermark = 25;
    qos->protocol.rtps_reliable_writer.low_watermark = 10;
  }
#endif /* RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS */

#if RMW_CONNEXT_DEFAULT_LARGE_DATA_OPTIMIZATIONS
  // Unless disabled, optimize the DataWriter's reliability protocol to
  // better handle large data samples. These are *bounded* types whose
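
The inline comments on the DataWriter settings (`200ms`, `20ms`, `1 every 4`, `10s @ 50hz`) can be checked with a little arithmetic. The sketch below is illustrative only: `Duration` is a hypothetical stand-in for `DDS_Duration_t`, the constant and function names are made up for this example, and it assumes (as the `1 every 4` comment suggests) that the piggyback-heartbeat ratio is taken against the fixed send window of 40 samples.

```cpp
#include <cstdint>

// Hypothetical stand-in for DDS_Duration_t: whole seconds plus nanoseconds.
struct Duration
{
  std::int32_t sec;
  std::uint32_t nanosec;
};

constexpr std::int64_t to_milliseconds(Duration d)
{
  return static_cast<std::int64_t>(d.sec) * 1000 + d.nanosec / 1000000;
}

// Values copied from the writer QoS set in this change.
constexpr Duration kHeartbeatPeriod{0, 200000000};
constexpr Duration kFastHeartbeatPeriod{0, 20000000};
constexpr std::int64_t kMaxHeartbeatRetries = 500;
constexpr std::int64_t kSendWindowSize = 40;
constexpr std::int64_t kHeartbeatsPerMaxSamples = 10;

// Total time covered by the heartbeat retries at the fast heartbeat rate.
constexpr std::int64_t retry_budget_ms()
{
  return kMaxHeartbeatRetries * to_milliseconds(kFastHeartbeatPeriod);
}

// Number of samples written between two piggyback heartbeats, assuming the
// ratio applies to the send window size.
constexpr std::int64_t samples_per_piggyback_heartbeat()
{
  return kSendWindowSize / kHeartbeatsPerMaxSamples;
}

static_assert(to_milliseconds(kHeartbeatPeriod) == 200, "periodic heartbeat: 200 ms");
static_assert(to_milliseconds(kFastHeartbeatPeriod) == 20, "fast heartbeat: 20 ms (50 Hz)");
static_assert(retry_budget_ms() == 10000, "500 retries at 20 ms: 10 s");
static_assert(samples_per_piggyback_heartbeat() == 4, "one heartbeat every 4 samples");
```

The `static_assert` lines make the compiler confirm each figure quoted in the comments, so the sketch fails to build if a value and its comment drift apart.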
@@ -587,6 +613,17 @@ rmw_connextdds_get_datareader_qos(
    }
  }

#if RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS
  // The default settings for the RTPS reliability protocol are not very
  // responsive, and they cause some unit tests to fail. These optimizations
  // are dual to those applied in rmw_connextdds_get_datawriter_qos().
  // Changes are limited to `DDS_RtpsReliableReaderProtocol_t`.
  if (ctx->optimize_reliability) {
    qos->protocol.rtps_reliable_reader.min_heartbeat_response_delay = DDS_DURATION_ZERO;
    qos->protocol.rtps_reliable_reader.max_heartbeat_response_delay = DDS_DURATION_ZERO;
  }
#endif /* RMW_CONNEXT_DEFAULT_RELIABILITY_OPTIMIZATIONS */

#if RMW_CONNEXT_DEFAULT_LARGE_DATA_OPTIMIZATIONS
  // Unless disabled, optimize the DataReader's reliability protocol to
  // better handle large data samples. These are *bounded* types whose
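
The watermark comments in the DataWriter QoS changes describe a hysteresis: the writer switches to `fast_heartbeat_period` once the unacknowledged count reaches `high_watermark`, and returns to `heartbeat_period` only after the count drops below `low_watermark`. The class below is a simplified model of that behavior for illustration. It is not Connext code, `HeartbeatRateSelector` is a hypothetical name, and the exact boundary handling (`>=` for the high mark, `<` for the low mark) is an assumption read off the wording of the comments.

```cpp
#include <cstdint>

// Hypothetical model of the watermark hysteresis described in the writer QoS
// comments. It only tracks which heartbeat period would be in effect.
class HeartbeatRateSelector
{
public:
  HeartbeatRateSelector(
    int low_watermark, int high_watermark,
    std::int64_t normal_period_ms, std::int64_t fast_period_ms)
  : low_watermark_(low_watermark),
    high_watermark_(high_watermark),
    normal_period_ms_(normal_period_ms),
    fast_period_ms_(fast_period_ms)
  {}

  // Feed the current number of unacknowledged samples and get back the
  // heartbeat period (in milliseconds) selected by the model.
  std::int64_t update(int unacked_samples)
  {
    if (unacked_samples >= high_watermark_) {
      fast_ = true;       // reached the high watermark: speed up
    } else if (unacked_samples < low_watermark_) {
      fast_ = false;      // dipped below the low watermark: slow down
    }
    // Between the two watermarks the previous mode is kept.
    return fast_ ? fast_period_ms_ : normal_period_ms_;
  }

private:
  int low_watermark_;
  int high_watermark_;
  std::int64_t normal_period_ms_;
  std::int64_t fast_period_ms_;
  bool fast_{false};
};
```

With the values from this change, `HeartbeatRateSelector selector(10, 25, 200, 20);` keeps returning 200 until the count reaches 25, then returns 20 until the count falls to 9 or fewer. The gap between the two watermarks prevents the writer from flapping between rates when the backlog hovers around a single threshold.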