Skip to content

CA-408552: Improve bootstrom performance by save db ops #6541

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

liulinC
Copy link
Collaborator

@liulinC liulinC commented Jun 18, 2025

events_from_xenopsd thread is critical as it sync up VM status in case of bootstorm, this thread is flood as lots of events comes from xenopsd waiting for process.

During processing of the events, VM/VDI/VBD update_allowed_operations will be called to refresh the allowed operations.

However, for each ops (start/suspend,etc) for the same object(VM), the object info is always the same no matter what the ops is. Thus, it is not necessary to query the object information over and over again.

Disclosure is used to resovle the issue. Query once and the disclosure will just remember the query result.

The performance test for starting 500 VM on 4 hosts improve around 10% performance for both XS8 and XS9

@liulinC
Copy link
Collaborator Author

liulinC commented Jun 18, 2025

image
Test passed on ring3 bst.

@liulinC liulinC force-pushed the private/linl/perfpr branch from 7d4f2a8 to 1d43b77 Compare June 18, 2025 09:12
Copy link
Contributor

@edwintorok edwintorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is difficult to review on github due to all the moved code.

But the following settings in .config/git/config locally makes reviewing easier, this will showed code that was moved but otherwise unchanged in grey (and ignores indentation changes):

[diff]
    algorithm = histogram
    colorMoved = dimmed_zebra
    colorMovedWS = allow-indentation-change

Copy link
Contributor

@edwintorok edwintorok left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be easier to review if this was split into multiple commits, e.g. one commit that just introduces the 'fun op' and reindents code, then another that introduces names for things we want to cache ahead of fun op (but doesn't move the code yet) (e.g. 'vdi_is_ha_state_or_redolog'), and then a final commit that just moves the cached values that don't depend on op outside of fun op.

@liulinC
Copy link
Collaborator Author

liulinC commented Jun 18, 2025

It might be easier to review if this was split into multiple commits, e.g. one commit that just introduces the 'fun op' and reindents code, then another that introduces names for things we want to cache ahead of fun op (but doesn't move the code yet) (e.g. 'vdi_is_ha_state_or_redolog'), and then a final commit that just moves the cached values that don't depend on op outside of fun op.

I thought it was a simple update, as I only moved codes to the beginning with low risk.
But if that is preferred, I can update the commits later.

Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is best-reviewed with the setting to hide whitespace on the github UI (or review it locally if you have a fancy diffing tool that can ignore that as well)

liulinC added 3 commits June 20, 2025 03:29
events_from_xenopsd thread is critical as it sync up VM status
in case of bootstorm, this thread is flood as lots of events comes
from xenopsd waiting for process.

During processing of the events, VM/VDI/VBD update_allowed_operations
will be called to refresh the allowed operations.

However, for each ops (start/suspend,etc) for the same object(VM),
the object info is always the same no matter what the ops is.
Thus, it is not necessary to query the object information over and
over again.

Disclosure is used to resovle the issue. Query once and the disclosure
will just remember the query result.

The performance test for starting 500 VM on 4 hosts improve around 10%
performance for both XS8 and XS9

This commit just introduce disclosure and caller call the disclosure
instead of the original function

Signed-off-by: Lin Liu <[email protected]>
Define db ops into variables and keep them inside returned function
for code review

Signed-off-by: Lin Liu <[email protected]>
Move ops unrelated db operation outside of returned function

Signed-off-by: Lin Liu <[email protected]>
@liulinC liulinC force-pushed the private/linl/perfpr branch from 1d43b77 to e1a57fa Compare June 20, 2025 07:39
@liulinC
Copy link
Collaborator Author

liulinC commented Jun 20, 2025

It might be easier to review if this was split into multiple commits, e.g. one commit that just introduces the 'fun op' and reindents code, then another that introduces names for things we want to cache ahead of fun op (but doesn't move the code yet) (e.g. 'vdi_is_ha_state_or_redolog'), and then a final commit that just moves the cached values that don't depend on op outside of fun op.

Just follow the suggestion to split the commits, @edwintorok Please take a look, thanks.

Ok ()
in
match op with
| `forget ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be encoded using when condition ->; it's a matter or taste. I like when because it avoids nesting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same taste.
But would just leave it as it was, to de-risk regression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants