rptest: add skip_in_cdt opt-out for CDT runs#30975
Conversation
There was a problem hiding this comment.
Pull request overview
Adds an opt-out mechanism for ducktape tests to be skipped specifically in CDT (real cloud) runs, keeping local/docker CI behavior unchanged while still reporting skipped tests as IGNORE in CDT.
Changes:
- Introduces
in_cdt(),skip_in_cdt(method/class decorator), andskip_file_in_cdt(module-level helper) inmode_checks.py. - Adds unit tests validating method-, class-, and file-scope skipping behavior and ensuring
reasonis not forwarded to ducktape’signore.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/rptest/utils/mode_checks.py | Adds CDT detection and skip helpers that apply ducktape IGNORE marks only when running in real-cloud CDT. |
| tests/rptest/utils/skip_in_cdt_test.py | Adds unit tests for method/class/file scope CDT skipping and verifies ignore mark semantics. |
ReviewNice, focused utility — gating everything on Main concern: class/file scope misses inherited test methods
Minor notes
Overall the approach is sound; the inherited-test-method gap is the one item I'd want addressed (or at least loudly documented) before this lands, since it can silently fail to skip. |
c2a250c to
d99bbc5
Compare
|
/cdt |
|
/cdt |
|
/cdt |
2379a47 to
d99bbc5
Compare
|
/cdt |
d99bbc5 to
097f0ad
Compare
Review:
|
097f0ad to
908a606
Compare
CI test resultstest results on build#86689
test results on build#86695
|
CDT runs every ducktape test on real cloud infrastructure, and a whole-run timeout fails the entire run. Some tests (e.g. exhaustive broker HTTP-API tests) add runtime there without adding cloud-infra coverage. skip_in_cdt opts out a test method or a whole class; skip_file_in_cdt opts out every test class in a module. All are gated on CLOUD_PROVIDER != "docker", so they no-op locally and in dockerized CI and the default behavior is unchanged. In CDT they attach ducktape's ignore mark, so opted-out tests report as IGNORE rather than vanishing from the run. reason is documentation only; it is never passed to ducktape's ignore, which would otherwise treat it as a parametrization matcher and silently un-skip the test. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
The exhaustive schema-registry HTTP-API tests are the largest single contributor to a CDT run (hundreds of test instances) and exercise no cloud infrastructure -- the dockerized CI run already covers them, so running them again on real cloud only adds wall-clock. skip_file_in_cdt marks each module class's own test methods. The test methods live on in-module base classes and the concrete classes inherit them, so marking the bases covers the whole suite. No-op locally and in dockerized CI. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
908a606 to
3e76378
Compare
| Because that mark lives on the function object itself, a subclass of the same | ||
| base in ANOTHER module would also be skipped in CDT (safe as long as opted-out | ||
| bases share no test methods with non-opted-out suites). A class here that |
There was a problem hiding this comment.
The docstring correctly notes this footgun, but there is an asymmetry worth calling out because it is a genuine silent-failure risk for future work:
skip_file_in_cdtwarns when a class in this module inherits test methods from another module (_cross_module_cdt_test_methods).- There is no guard for the reverse: a base defined here whose methods get marked, then subclassed by a non-opted-out suite in another module. Those inherited tests will silently vanish from CDT with no warning anywhere, because the IGNORE mark rides on the shared function object.
Today this is safe only by accident — SchemaRegistryEndpoints/AuditLogTestBase define no test methods, so the only marked bases with tests (SchemaRegistryTestMethods, etc.) are subclassed exclusively in-module. But if someone later writes class Foo(SchemaRegistryTestMethods) in a new file (e.g. a scale/CDT-only suite), it will be silently skipped in CDT with zero signal.
That reverse direction is fundamentally undetectable at this module's import time, which is exactly why it's dangerous. Consider a follow-up guard that doesn't depend on function-object marks — e.g. an opt-out registry checked by the ducktape loader, or asserting at collection that no marked base is subclassed by an unmarked class. At minimum, please make the "share no test methods with non-opted-out suites" invariant explicit at the call sites, since it's load-bearing.
| # coverage (the dockerized CI run already exercises it end to end). The test | ||
| # methods live on the base classes above; skip_file_in_cdt marks each module | ||
| # class's own methods, so the thin concrete subclasses inherit the mark. | ||
| skip_file_in_cdt( |
There was a problem hiding this comment.
A module-scope opt-out also pulls in SchemaRegistryTransportStressTest (and its *RpcTransport*/*KafkaClientTransport* subclasses), which does concurrent SR read/write while transferring leadership of _schemas to check resilience under real network/timing. That is arguably one of the SR tests where real EC2 (vs. loopback docker) network + scheduling behavior does add signal, unlike the pure HTTP-API surface the PR body describes.
Is opting the transport-stress tests out of CDT intentional? If so, a one-line note in the accounting justifying it would help; if not, consider a method-scope @skip_in_cdt on the HTTP-API methods instead of a whole-file opt-out, so the resilience tests keep their real-hardware coverage.
| if args: | ||
| # bare usage: @skip_in_cdt | ||
| return decorate(args[0]) | ||
| # called usage: @skip_in_cdt(reason=...) | ||
| return decorate |
Review summaryNicely scoped change with a thoughtfully-designed mechanism and good test coverage. The What I verified
Main points (both inline)
Neither is blocking — no correctness bug in the current diff, no security/perf concerns. The two items are about protecting the invariant against future edits and confirming the coverage tradeoff. |
| Because that mark lives on the function object itself, a subclass of the same | ||
| base in ANOTHER module would also be skipped in CDT (safe as long as opted-out | ||
| bases share no test methods with non-opted-out suites). A class here that |
There was a problem hiding this comment.
The docstring correctly notes this footgun, but there is an asymmetry worth calling out because it is a genuine silent-failure risk for future work:
skip_file_in_cdtwarns when a class in this module inherits test methods from another module (_cross_module_cdt_test_methods).- There is no guard for the reverse: a base defined here whose methods get marked, then subclassed by a non-opted-out suite in another module. Those inherited tests will silently vanish from CDT with no warning anywhere, because the IGNORE mark rides on the shared function object.
Today this is safe only by accident — SchemaRegistryEndpoints/AuditLogTestBase define no test methods, so the only marked bases with tests (SchemaRegistryTestMethods, etc.) are subclassed exclusively in-module. But if someone later writes class Foo(SchemaRegistryTestMethods) in a new file (e.g. a scale/CDT-only suite), it will be silently skipped in CDT with zero signal.
That reverse direction is fundamentally undetectable at this module's import time, which is exactly why it's dangerous. Consider a follow-up guard that doesn't depend on function-object marks — e.g. an opt-out registry checked by the ducktape loader, or asserting at collection that no marked base is subclassed by an unmarked class. At minimum, please make the "share no test methods with non-opted-out suites" invariant explicit at the call sites, since it's load-bearing.
| # coverage (the dockerized CI run already exercises it end to end). The test | ||
| # methods live on the base classes above; skip_file_in_cdt marks each module | ||
| # class's own methods, so the thin concrete subclasses inherit the mark. | ||
| skip_file_in_cdt( |
There was a problem hiding this comment.
A module-scope opt-out also pulls in SchemaRegistryTransportStressTest (and its *RpcTransport*/*KafkaClientTransport* subclasses), which does concurrent SR read/write while transferring leadership of _schemas to check resilience under real network/timing. That is arguably one of the SR tests where real EC2 (vs. loopback docker) network + scheduling behavior does add signal, unlike the pure HTTP-API surface the PR body describes.
Is opting the transport-stress tests out of CDT intentional? If so, a one-line note in the accounting justifying it would help; if not, consider a method-scope @skip_in_cdt on the HTTP-API methods instead of a whole-file opt-out, so the resilience tests keep their real-hardware coverage.
CDT runs every ducktape test on real cloud infrastructure, and a whole-run timeout fails the entire run. Some tests (e.g. exhaustive broker HTTP-API tests) add runtime there without adding cloud-infra coverage.
skip_in_cdt opts out a test method or a whole class; skip_file_in_cdt opts out every test class in a module. All are gated on CLOUD_PROVIDER != "docker", so they no-op locally and in dockerized CI and the default behavior is unchanged. In CDT they attach ducktape's ignore mark, so opted-out tests report as IGNORE rather than vanishing from the run.
reason is documentation only; it is never passed to ducktape's ignore, which would otherwise treat it as a parametrization matcher and silently un-skip the test.
Accounting
Backports Required
Release Notes