ASV performance improvements #2834
base: master
Conversation
Force-pushed f9b32c3 to e19ce1a.
… rows is pointless.
…tions information about the setup_cache time is clearer
Remove unused start_time = time.time()
To start with this took 63s on my machine. After the change for string generation, this took 8s. After the change for date generation, this took 3s.
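Speedups of this size in data generation typically come from vectorizing it: sampling from a pre-built pool of strings instead of generating one Python-level string per row, and building the date index with a single `pd.date_range` call. A minimal sketch of that idea — the function names, pool size, and string length are illustrative, not the PR's actual code:

```python
import random
import string

import numpy as np
import pandas as pd


def slow_strings(n):
    # Baseline: one Python-level random string per row.
    return ["".join(random.choices(string.ascii_letters, k=10)) for _ in range(n)]


def fast_strings(n, pool_size=1000):
    # Build a small pool of strings once, then sample indices with NumPy;
    # sampling the pool is far cheaper than generating n fresh strings.
    rng = np.random.default_rng(0)
    pool = np.array(
        ["".join(random.choices(string.ascii_letters, k=10)) for _ in range(pool_size)]
    )
    return pool[rng.integers(0, pool_size, size=n)]


def fast_dates(n):
    # Dates: one vectorized pd.date_range call instead of per-row arithmetic.
    return pd.date_range("2020-01-01", periods=n, freq="s")
```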
…y run for each "rows" parameter, even though they do not use it.
… ModificationFunctions otherwise they run for each "rows" parameter, even though they do not use it.
Force-pushed e19ce1a to 8c66918.
param_names = PARAM_NAMES
CONNECTION_STRING = "lmdb://basic_functions"
DATE_RANGE = DATE_RANGE
Why do we need this?
self.fresh_lib._nvs.compact_incomplete(f"sym", False, False)

class ShortWideWrite:
Is there a reason to keep ShortWideWrite and UltraShortWideWrite separate, rather than having a parametrized class like ShortWideRead?
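For reference, a single parametrized class along the lines the comment suggests might look like this — the class name matches the diff, but the helper, column counts, and benchmark body are hypothetical:

```python
import numpy as np
import pandas as pd


def make_wide_dataframe(rows, cols):
    # Hypothetical helper: a wide frame of random floats.
    rng = np.random.default_rng(0)
    return pd.DataFrame(rng.random((rows, cols)))


class ShortWideWrite:
    # One class covering both widths via ASV's params mechanism,
    # rather than separate ShortWideWrite / UltraShortWideWrite classes.
    params = [100, 1000]   # column counts are illustrative
    param_names = ["cols"]

    def setup(self, cols):
        self.df = make_wide_dataframe(rows=10, cols=cols)

    def time_write(self, cols):
        # In the real benchmark this would be the ArcticDB write call,
        # e.g. self.lib.write("sym", self.df).
        pass
```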
warmup_time = 0
timeout = 6000
CONNECTION_STRING = "lmdb://batch_basic_functions?map_size=20GB"
sample_time = 0.1
What was the process of coming up with this?
os.remove(self.path)

def create_dict(self, size):
    ten_char_strings = [random_string(10) for _ in range(1000)]
I doubt it will matter much but maybe it'll be a good idea to add some seeds for all the random stuff below
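Seeding as the comment suggests is a one-off at module or setup level; a minimal sketch, where the seed value and helper name are illustrative:

```python
import random

import numpy as np

SEED = 42  # illustrative value


def seed_everything(seed=SEED):
    # Seed both the stdlib and NumPy generators so that data built in
    # setup/setup_cache is identical on every benchmark run.
    random.seed(seed)
    np.random.seed(seed)


seed_everything()
first = (random.random(), np.random.rand())
seed_everything()
second = (random.random(), np.random.rand())
assert first == second  # reseeding reproduces the same "random" data
```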
self.lib.append("sym", self.df_append_single)

def time_append_large(self, lad: LargeAppendDataModify, rows):
    large: pd.DataFrame = lad.df_append_large[rows].pop(0)
lad seems to be used only here. Does it make sense to make a separate benchmark class for it?
rounds = 1
number = 1  # We do a single run between setup and teardown because we e.g. can't delete a symbol twice
repeat = 2
warmup_time = 0
I wonder why it's included in some tests but not in others.
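For context, these ASV class attributes control how each benchmark is sampled; a sketch annotating what each one does, with the values from the diff above (the class name is hypothetical):

```python
class DeleteSymbolBenchmark:
    # ASV timing attributes (per-class; defaults apply when omitted):
    rounds = 1       # rounds of the whole (setup, samples, teardown) cycle
    number = 1       # function calls per sample; must be 1 when the timed
                     # operation is destructive, e.g. deleting a symbol
    repeat = 2       # samples collected per round
    warmup_time = 0  # seconds spent warming up before sampling

    def time_delete(self):
        # Placeholder for the timed operation.
        pass
```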
class IterateVersionChain:
    timeout = 6000
    timeout = 1000
I see this has been tweaked a few times. Given that its backend is LMDB, so the storage is less likely to fail, should this be relaxed to prioritize outputting results at the end?
No strong opinion on this though.
Note that this includes one non-test change I came across while working on this: a fix for the malformed logging when we write empty dataframes. This is e7ecd73.
The other changes speed up or otherwise improve our ASV execution.
This has reduced the benchmarking time from about 2h20m down to roughly 1h40m.
Please review commit-by-commit.