chore: use shared containers for integration tests #924
base: main
Conversation
pub rest_catalog: RestCatalog,
pub catalog_config: RestCatalogConfig,
Since the tests now share this fixture but are all executed in different runtimes, they can't share the same client, as it can get dropped prematurely when a test ends, resulting in `dispatch task is gone: runtime dropped the dispatch task`.
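To make that concrete, here's a rough sketch of the shape this pushes the fixture towards: only the (cloneable) `RestCatalogConfig` is shared, and each test builds its own `RestCatalog` inside its own runtime. The struct, test body, and `get_shared_containers()` helper below are illustrative, not the PR's exact code:

```rust
use iceberg_catalog_rest::{RestCatalog, RestCatalogConfig};

// Illustrative fixture: only the config is shared, never a live client,
// so no HTTP dispatch task can outlive the runtime that created it.
pub struct TestFixture {
    pub catalog_config: RestCatalogConfig,
}

#[tokio::test]
async fn some_catalog_test() {
    let fixture = get_shared_containers();
    // A fresh client per test, tied to this test's own runtime.
    let rest_catalog = RestCatalog::new(fixture.catalog_config.clone());
    // ... exercise `rest_catalog` here ...
    drop(rest_catalog); // dropped inside the same runtime, so no dangling dispatch task
}
```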
.name("t1".to_string()) | ||
.name("t2".to_string()) |
Making the write tests each use a different table allows them to also use the shared docker stack.
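In other words (the test and table names below are illustrative, not the PR's exact code), each write test pins its own table, so both can hit the shared containers at the same time:

```rust
use iceberg::TableIdent;

// Two write tests, two tables: running them concurrently against the
// shared containers can never touch the same data.
#[tokio::test]
async fn test_append_data_file() {
    let _t1 = TableIdent::from_strs(["apple", "ios", "t1"]).unwrap();
    // ... create table `t1` and append data files to it ...
}

#[tokio::test]
async fn test_append_partitioned_data_file() {
    let _t2 = TableIdent::from_strs(["apple", "ios", "t2"]).unwrap();
    // ... create table `t2` and append data files to it ...
}
```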
Generally LGTM. Since we're using a shared catalog, are there any concerns with side effects?
In pyiceberg we use a table identifier fixture to generate table names so they don't conflict.
Good point, I could also extract the namespace as a shared fixture among the tests (in the PR I just ignore the result of the namespace creation).

EDIT: I've extracted two common fixtures now: the apple-ios namespace and the foo-bar-baz schema.
By extension make them use the same set of docker containers.
Not sure if I understand it correctly:
- Before this PR, tests are compiled into separate binaries. In theory they can run concurrently (e.g. with `cargo nextest`), but plain `cargo test` will run them serially.
- In this PR, we put them into one binary, and they are then run concurrently via `#[tokio::test]`.
- We use shared containers to also save the time of spinning containers up and down, but we need to take care of potential conflicts.
let ns = Namespace::with_properties(
    NamespaceIdent::from_strs(["apple", "ios"]).unwrap(),
    HashMap::from([
        ("owner".to_string(), "ray".to_string()),
        ("community".to_string(), "apache".to_string()),
    ]),
);
let fixture = get_shared_containers();
let rest_catalog = RestCatalog::new(fixture.catalog_config.clone());
let ns = apple_ios_ns().await;
What about creating a per-test randomized namespace instead? I don't see why we would want a shared namespace. I guess before this PR we just didn't consider the concurrency and conflict problem.
Yup, that makes sense, I'll change it.

EDIT: Done, the randomness is introduced via `uuid`s.
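For reference, per-test randomization along those lines could look roughly like this with the `uuid` crate; the helper name is made up for illustration, while `NamespaceIdent::from_strs` is the constructor these tests already use:

```rust
use iceberg::NamespaceIdent;
use uuid::Uuid;

// Illustrative helper: every call yields a unique namespace, so tests running
// concurrently against the shared catalog can never collide on names.
fn random_ns_ident() -> NamespaceIdent {
    NamespaceIdent::from_strs([
        "apple".to_string(),
        format!("ios_{}", Uuid::new_v4().simple()),
    ])
    .unwrap()
}
```

Each test can then create (and clean up) its own namespace without affecting any other test.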
Correct; at present the tests can't be run concurrently because they depend on separate docker container sets. (Also, even if they didn't, plain `cargo test` would still run the separate test binaries serially.)
Yep, that's it in a nutshell. Using the shared container set is the biggest time-saver (eliminating the docker build-start-stop overhead for each test). Having made them use shared containers, the next simple improvement is to compile them all into one test binary (thus eliminating multiple compilations), which is done by having a single top-level `shared.rs` file.
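For illustration, that layout could look like the following; only `shared.rs` and the `shared_tests` module name come from this PR, while the path and per-topic file names are assumptions:

```rust
// tests/shared.rs: the only top-level integration-test file, so cargo
// compiles exactly one test binary for all of these tests.
mod shared_tests; // the actual tests live in tests/shared_tests/*.rs

// tests/shared_tests/mod.rs would then declare the per-topic files, e.g.:
// mod namespace_tests;
// mod append_data_file_test;
```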
Code changes and the idea LGTM 👍 Maybe we could rename the PR title to "use shared container for integration tests" instead as it's the biggest time-saver?
Thanks! Generally LGTM.
Good question; a brief glance suggests it may have shaved ~5 minutes off the CI run. The difference would certainly grow as more and more integration tests are added.

EDIT: It's actually closer to 10 minutes, since the integration tests are run twice (once in the `unit` workflow).
That's amazing! Thanks for looking into that, I like faster CI :)
Closes #923
This PR makes integration tests re-use the same set of docker containers, hence allowing them to be run concurrently:

- The tests are moved into a `shared_tests` module, with a single top-level `shared.rs` file, thus requiring only one compilation step (as opposed to compiling every test file individually).
- `set_test_fixture` is made sync, since there's no async code in it.
- The fixture setup is wrapped in a `OnceLock`, so that it is invoked only once for the resulting (shared) test binary; it also has a corresponding destructor to spin down the containers after the tests run (see the sketch at the end of this description).
- Tests that need their own containers should be added alongside `shared.rs` (top-level); otherwise they should be created in the `shared_tests` module (e.g. with `#[serial]`).

Finally, with this change my integration test runs go from taking ~3.5 minutes down to 30 seconds.
EDIT: The CI `unit` workflow run duration also seems ~10 min shorter now.
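To make the `OnceLock`-plus-destructor idea concrete, here is a rough sketch. Names follow this thread where they appear; it assumes `set_test_fixture()` returns the fixture, and it uses the `ctor` crate's `#[dtor]` as one possible way to get the teardown hook; the fixture type and its `stop_containers` method are hypothetical:

```rust
use std::sync::OnceLock;

static SHARED_FIXTURE: OnceLock<TestFixture> = OnceLock::new();

// Sync, lazily-initialized setup: the first test to call this spins the docker
// containers up; every later call in the same test binary reuses them.
pub fn get_shared_containers() -> &'static TestFixture {
    SHARED_FIXTURE.get_or_init(set_test_fixture)
}

// One possible "corresponding destructor": the `ctor` crate's #[dtor] runs after
// all tests in this binary have finished, so the containers come down exactly once.
#[ctor::dtor]
fn shutdown_shared_containers() {
    if let Some(fixture) = SHARED_FIXTURE.get() {
        fixture.stop_containers(); // hypothetical teardown method
    }
}
```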