Remove ASVBase from the finalize staged data benchmarks #2825
base: master
Conversation
We remove `real_finalize_staged_data.py` and instead parameterize this over two storages. We also rip out the `AsvBase` base here.
```python
copytree(FinalizeStagedData.ARCTIC_DIR_ORIGINAL, FinalizeStagedData.ARCTIC_DIR, dirs_exist_ok=True)
del self.ac
self.logger.info(f"SETUP_CACHE TIME: {time.time() - start}")
return lib_for_storage
```
How are the `lib_for_storage` library objects transitioned between the `setup_cache` process and the setup/benchmark processes? I assume they are pickled. Does pickling and unpickling a `Library` object work for all storages?
Yeah, they're pickled. We have quite careful support for pickling in the `NativeVersionStore`, as people rely on it for multi-processing. It certainly works for LMDB and Amazon, which are the only ones we use at the moment, and if we find storages where this does not work, that would be a bug worth fixing in its own right.
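As a hedged illustration of that round trip (the LMDB URI and library name below are made up, not from this PR): ASV pickles the object returned by `setup_cache` and unpickles it in the process that runs the benchmark, roughly:

```python
import pickle

from arcticdb import Arctic

ac = Arctic("lmdb:///tmp/benchmark_db")
lib = ac.get_library("finalize_staged_data", create_if_missing=True)

blob = pickle.dumps(lib)        # happens once, after setup_cache returns
lib_again = pickle.loads(blob)  # happens in the process running the benchmark
assert lib_again.name == lib.name
```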
Force-pushed from 696096b to 429590c.
```python
raise RuntimeError(f"storage {storage} not implemented for benchmark {__file__}")
```

```python
if lib_name in ac:
    ac.delete_library(lib_name)
```
Nit: `delete_library` is a no-op if the library doesn't exist, so we can just drop the `if`.
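A sketch of the suggested simplification, assuming (per the comment above) that `delete_library` silently ignores a missing library:

```python
# Before: guard against a missing library
if lib_name in ac:
    ac.delete_library(lib_name)

# After: delete_library is already a no-op when the library doesn't exist
ac.delete_library(lib_name)
```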
```diff
  number = 1
  rounds = 1
- repeat = 5
+ repeat = 1
```
I'm a bit wary of designing benchmarks in a way that relies on a single benchmark execution, i.e. doing all of the setup in `setup_cache`, which will never be repeated.
It is possible that in the future we will want to repeat the benchmark multiple times to decrease the variance.
I'm OK with leaving `repeat=1` for now if the benchmark is not flaky, but I think we should `_prepopulate_library` inside `setup` to allow increasing the repeats in the future.
Also, would you mind adding comments explaining these, like:

```python
number = 1  # Not safe to increase, each benchmark run relies on setup
repeat = 1  # Safe to increase, `setup` does required staging to correctly repeat benchmark multiple times
```
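A minimal sketch of that shape, assuming ASV's usual contract that `setup_cache` runs once per benchmark and its pickled return value is passed as the first argument to `setup` and the timed method. `_prepopulate_library` and `_symbol_name` follow names used in this thread; `_create_library_for_storage`, the bodies, and the parameter values are illustrative, not the PR's actual code:

```python
class FinalizeStagedData:
    number = 1  # Not safe to increase: each benchmark run consumes the staged data
    rounds = 1
    repeat = 1  # Safe to increase: `setup` re-stages data for every repeat

    params = [500, 1000]        # illustrative chunk counts
    param_names = ["num_chunks"]

    def setup_cache(self):
        # Expensive one-off work only: ASV runs this once, pickles the return
        # value, and hands it to setup() and the timed method.
        return _create_library_for_storage()  # hypothetical helper

    def setup(self, lib, num_chunks):
        # Staging here rather than in setup_cache is what makes raising
        # `repeat` safe: every repeat starts from freshly staged data.
        _prepopulate_library(lib, num_chunks)

    def time_finalize_staged_data(self, lib, num_chunks):
        lib.finalize_staged_data(_symbol_name(num_chunks))
```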
```python
list_of_chunks = [10_000] * param
```

```python
for suffix in ("-time", "-mem"):
    symbol = _symbol_name(param) + suffix
```
Why not add a required `suffix` argument to `_symbol_name`, since we just add it every time?
We don't add it every time, e.g. `self.symbol = _symbol_name(num_chunks)`.
In that case we still add the suffix in the benchmarks themselves, e.g. `self.symbol + "-time"` in several places.
I don't have a strong preference, but it feels confusing to have `self.symbol` not be an actual symbol in the library but rather a symbol prefix. What do you think about either:
- renaming `self.symbol` to `self.symbol_prefix`
- adding a `suffix` argument to `_symbol_name` and constructing the symbol in the benchmarks themselves (sketched below)
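A hedged sketch of the second option; only `_symbol_name`, `num_chunks`, and the `-time`/`-mem` suffixes come from the thread, and the name stem is made up:

```python
def _symbol_name(num_chunks: int, suffix: str) -> str:
    # Requiring the suffix means every return value is an actual library
    # symbol rather than a prefix. The "staged" stem is illustrative.
    return f"staged_{num_chunks}{suffix}"

# In the benchmarks themselves:
time_symbol = _symbol_name(10, "-time")
mem_symbol = _symbol_name(10, "-mem")
```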
Force-pushed from 429590c to 8d7ce6c.
…taged_data.py to it and fix the benchmark so that it actually does something.
…ng, so that all storages are present in `benchmarks.json`
…. This matches the old behaviour.
Force-pushed from 8d7ce6c to d3736fd.
…ges to enable. Suite overwrite will be irrelevant.
```python
assert len(self.lib.list_symbols()) == 0  # check we are in a clean state
initial_timestamp = TimestampNumber(0, self.df_generator.TIME_UNIT)
```

```python
class FinalizeStagedDataWiderDataframeX3(FinalizeStagedData):
```
This hasn't run for ages. I'm just going to remove it.
Start to replace the current real storage testing framework.

There are many layers of abstraction in the current code, such as the `TestLibraryManager` and `AsvBase`. There is also a notion of "persistent" storage: long-running data stored on S3 and only manually updated, for read benchmarks to work against. This is complicated to work with and fragile. For example, it is easy to forget to update "persistent" benchmarking libraries when we update benchmark parameters, so they silently stay stuck on the wrong parameters. The datasets we write for benchmarking are actually fairly small, but they are not being written efficiently at the moment (for example, they are not using our batch APIs).

This PR shows a new design, using `finalize_staged_data` as a starting point.

- Remove the separate `finalize_staged_data` and `real_finalize_staged_data` modules. Parameterize a single one by storage instead. This should be safer, since the exact benchmark code will at least run against LMDB on each PR.
- Storages are toggled with environment variables (e.g. `ARCTICDB_STORAGE_AWS_S3=0`, `ARCTICDB_STORAGE_LMDB=1`).
- `benchmarks.json` includes all the storages so that adhoc real storage runs can execute.

This PR doesn't clean up data written to S3 yet; that will be easier to implement once all the real storage benchmarks are ported over. My plan for that is for the CI to create a temporary bucket and drop it at the end of the benchmarking run.
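As a rough illustration of the environment-variable gating described above (the helper and its defaults are assumptions, not the PR's actual code):

```python
import os

def storage_enabled(storage: str, default: str = "0") -> bool:
    # e.g. ARCTICDB_STORAGE_LMDB=1 enables LMDB; ARCTICDB_STORAGE_AWS_S3=0 disables S3.
    return os.environ.get(f"ARCTICDB_STORAGE_{storage}", default) == "1"

# Benchmarks would then be parameterized only over the enabled storages.
STORAGES = [s for s in ("LMDB", "AWS_S3") if storage_enabled(s)]
```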
Manual run against real storage: https://github.com/man-group/ArcticDB/actions/runs/20850427753/job/59904037259
Manual run against LMDB: https://github.com/man-group/ArcticDB/actions/runs/20850444779