[sled-diagnostics] use JoinSet for multiple commands #8151
base: main
Conversation
Created using spr 1.3.6-beta.1
one thought, I'm not sure whether it makes a meaningful difference given that it seems much of the issue is just "fork sucks when you have a big resident set"...
sled-diagnostics/src/lib.rs
Outdated
let permit = Arc::clone(&self.semaphore)
    .acquire_owned()
    .await
    .expect("semaphore acquire");
let _abort_handle = self.set.spawn(async move {
So, this will not even spawn the tasks until there's capacity in the concurrency limit. What happens if you change it to:
- let permit = Arc::clone(&self.semaphore)
-     .acquire_owned()
-     .await
-     .expect("semaphore acquire");
- let _abort_handle = self.set.spawn(async move {
+ let semaphore = self.semaphore.clone();
+ let _abort_handle = self.set.spawn(async move {
+     let permit = semaphore.acquire().await.expect("semaphore acquire");
This way, all the tasks are spawned immediately, and the task adding commands to the set can do so synchronously (changing `add_command` to a normal `fn`) and then just wait for them to all come back. Right now, the task that adds commands has to get woken up a bunch of times to add the next one to the set; I wonder whether changing this to spawn everything synchronously would make a meaningful difference in performance...
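Putting the two snippets together, a minimal sketch of the shape being suggested might look like the following. This is illustrative only: the struct fields and the `add_command`/`join_all` signatures are assumptions, not the crate's actual API.

```rust
use std::sync::Arc;
use tokio::{sync::Semaphore, task::JoinSet};

/// Hypothetical stand-in for `MultipleCommands`: a `JoinSet` plus a shared
/// `Semaphore` that bounds how many tasks do work at once.
struct MultipleCommands<T> {
    set: JoinSet<T>,
    semaphore: Arc<Semaphore>,
}

impl<T: 'static + Send> MultipleCommands<T> {
    fn new(max_parallelism: usize) -> Self {
        Self {
            set: JoinSet::new(),
            semaphore: Arc::new(Semaphore::new(max_parallelism)),
        }
    }

    /// Because the permit is acquired *inside* the spawned task, this can be
    /// a plain (non-async) fn: every task is spawned immediately, and the
    /// semaphore only gates when each task starts doing real work.
    fn add_command<F>(&mut self, command: F)
    where
        F: std::future::Future<Output = T> + Send + 'static,
    {
        let semaphore = Arc::clone(&self.semaphore);
        self.set.spawn(async move {
            let _permit = semaphore.acquire().await.expect("semaphore acquire");
            command.await
        });
    }

    /// Drain the set, collecting every command's output.
    async fn join_all(mut self) -> Vec<T> {
        let mut results = Vec::new();
        while let Some(res) = self.set.join_next().await {
            results.push(res.expect("task panicked"));
        }
        results
    }
}
```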
With this change I see the following...

stable rust + 1 GB RSS: took 24.416534527s
beta rust + 1 GB RSS: took 3.829577938s

Over chat I told you that with beta rust and the `JoinSet` we saw `took 3.674288649s`, so the difference is negligible, but overall I think your suggestion is the better design. I am going to push a commit to this branch.
ah well, it was worth a shot, i guess! thanks for giving it a spin!
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Test failure filed as #8170
Modulo fixing CI, this looks good to me. I left some non-blocking thoughts.
@@ -29,6 +31,41 @@ pub use crate::queries::{
};
use queries::*;

/// Max number of commands to run in parallel
const MAX_PARALLELISM: usize = 50;
How did we arrive at this number?
impl<T: 'static + Send> MultipleCommands<T> {
    fn new() -> MultipleCommands<T> {
        let semaphore = Arc::new(Semaphore::new(MAX_PARALLELISM));
On the subject of the concurrency limit, it seems to me that the functions in this crate that run multiple commands fall into two categories: those that run a fairly small, fixed number of commands, like `ipadm_info` and `dladm_info`, and those that run a command or set of commands against every Oxide process PID (`pargs_oxide_processes` etc.).

For the functions that run commands against every Oxide process PID, the concurrency limit is certainly useful, as there may be basically any number of PIDs. But something like `ipadm_info` will always spawn exactly 3 processes, which is below the concurrency limit, and all this faffing around with a `Semaphore` is unnecessary.

I kind of wonder if the class of functions that spawn a fixed set of commands should eschew the use of `MultipleCommands` and just construct a `JoinSet` and spawn their 3 or 5 tasks or whatever. In practice, any overhead from the semaphore acquire/release/drop and such is probably insignificant compared to "actually spawning a child process", so this probably doesn't actually matter, but we could avoid doing it... up to you.
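For illustration, the simpler fixed-command shape floated above could look something like this. It is a sketch only: the subcommand names are stand-ins, not the actual invocations `ipadm_info` makes, and the spawned closures just fake output where the real code would run child processes.

```rust
use tokio::task::JoinSet;

/// Sketch of a fixed-size variant: a handful of known commands, no semaphore.
async fn ipadm_info_sketch() -> Vec<String> {
    let mut set = JoinSet::new();
    for subcommand in ["show-if", "show-addr", "show-prop"] {
        // Stand-in for spawning the actual `ipadm` child process.
        set.spawn(async move { format!("ipadm {subcommand} output") });
    }

    let mut results = Vec::new();
    while let Some(res) = set.join_next().await {
        results.push(res.expect("task panicked"));
    }
    results
}
```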
@@ -4,7 +4,8 @@

//! Diagnostics for an Oxide sled that exposes common support commands.

use futures::{StreamExt, stream::FuturesUnordered};
Looks like the rebase picked up new code that uses `FuturesUnordered`: https://github.com/oxidecomputer/omicron/actions/runs/15056224877/job/42322633598?pr=8151#step:11:767
FWIW, I pulled this code into #8174, and added a test that we don't exceed our specified "parallelism limit" there.
I can wait to merge this then, thanks for pulling this into a standalone crate. Let's just use your new crate when it lands instead of duplicating the code.
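For reference, one shape such a parallelism-limit test could take (illustrative only, not the actual test added in #8174): each task bumps an in-flight counter while it holds a permit and asserts the counter never exceeds the limit.

```rust
use std::sync::{
    Arc,
    atomic::{AtomicUsize, Ordering},
};
use tokio::{sync::Semaphore, task::JoinSet};

#[tokio::test]
async fn parallelism_stays_under_limit() {
    const LIMIT: usize = 4;
    let semaphore = Arc::new(Semaphore::new(LIMIT));
    let in_flight = Arc::new(AtomicUsize::new(0));

    let mut set = JoinSet::new();
    for _ in 0..100 {
        let semaphore = Arc::clone(&semaphore);
        let in_flight = Arc::clone(&in_flight);
        set.spawn(async move {
            let _permit = semaphore.acquire().await.expect("semaphore acquire");
            // While the permit is held, the in-flight count must stay <= LIMIT.
            let now = in_flight.fetch_add(1, Ordering::SeqCst) + 1;
            assert!(now <= LIMIT, "exceeded parallelism limit: {now}");
            tokio::task::yield_now().await;
            in_flight.fetch_sub(1, Ordering::SeqCst);
        });
    }
    while let Some(res) = set.join_next().await {
        res.expect("task panicked");
    }
}
```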
As part of the #8166 investigation we decided that we should move from `FuturesUnordered` to a `JoinSet` to gain parallelism.
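For context, a minimal sketch of the difference (the function names and `String` output type are illustrative, assuming the `futures` and `tokio` crates): `FuturesUnordered` polls every future from the single task that owns it, so the commands only run concurrently on one thread, while `JoinSet::spawn` hands each command to the runtime as its own task, so they can run in parallel across worker threads.

```rust
use std::future::Future;

use futures::{StreamExt, stream::FuturesUnordered};
use tokio::task::JoinSet;

/// All futures are polled from this one task: concurrent, but never running
/// on more than one thread at a time.
async fn run_with_futures_unordered(
    cmds: Vec<impl Future<Output = String>>,
) -> Vec<String> {
    cmds.into_iter().collect::<FuturesUnordered<_>>().collect().await
}

/// Each command becomes its own tokio task, so commands can make progress in
/// parallel across the runtime's worker threads.
async fn run_with_join_set(
    cmds: Vec<impl Future<Output = String> + Send + 'static>,
) -> Vec<String> {
    let mut set = JoinSet::new();
    for cmd in cmds {
        set.spawn(cmd);
    }
    let mut out = Vec::new();
    while let Some(res) = set.join_next().await {
        out.push(res.expect("task panicked"));
    }
    out
}
```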