Is your feature request related to a problem or challenge?
I've had to write a couple of ExecutionPlanVisitors recently (see below). When I started, I looked for documentation on them but couldn't find any. I think it would help newcomers to see a couple of examples of ExecutionPlanVisitor in the docs.
Describe the solution you'd like
A section in the docs with some example implementations of ExecutionPlanVisitor
Describe alternatives you've considered
No response
Additional context
These were the ExecutionPlanVisitors I made. I would be happy to add docs around these.
// Imports this snippet relies on. The `ParquetExec` path is an assumption:
// it has moved between DataFusion releases, so adjust it to your version.
use std::fmt::Write;

use datafusion::error::DataFusionError;
use datafusion::physical_plan::file_format::ParquetExec;
use datafusion::physical_plan::{displayable, ExecutionPlan, ExecutionPlanVisitor};

#[derive(Debug)]
struct ParquetVisitor;

impl ExecutionPlanVisitor for ParquetVisitor {
    type Error = DataFusionError;

    fn pre_visit(&mut self, plan: &dyn ExecutionPlan) -> Result<bool, Self::Error> {
        // Get the one-line representation of the ExecutionPlan, something like this:
        // ParquetExec: file_groups=[...], ...
        let mut buf = String::new();
        write!(&mut buf, "{}", displayable(plan).one_line()).map_err(|e| {
            DataFusionError::Internal(format!("Error while collecting metrics: {e}"))
        })?;

        // Trim everything from the first colon onwards.
        // This is a hack to extract a human-readable representation of the ExecutionPlan's type.
        // We would prefer it if `ExecutionPlan` had a `name` method, but this will do,
        // since every physical operator seems to follow this convention.
        // If a node doesn't, we just skip collecting its metrics, and no harm is done.
        let plan_type = match buf.split_once(':') {
            None => {
                println!("execution plan has unexpected display format: {buf}");
                return Ok(true);
            }
            Some((name, _)) => name.to_string(),
        };

        // Only ParquetExec nodes are of interest; every other node is skipped.
        if let Some(parquet_exec) = plan.as_any().downcast_ref::<ParquetExec>() {
            // A node that has not recorded any metrics has nothing to report.
            let metrics = match parquet_exec.metrics() {
                None => return Ok(true),
                Some(metrics) => metrics,
            };
            let bytes_scanned = metrics.sum_by_name("bytes_scanned");
            println!("{plan_type}: Parquet bytes scanned: {bytes_scanned:?}");
        }

        Ok(true)
    }
}
I have two examples in mind: one that gets information from Parquet files (I could probably combine the two I have) and one that tracks a metric across all nodes (maybe output_rows).
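To make the second idea more concrete, here is a rough, self-contained sketch of a visitor that accumulates output_rows across every node. It deliberately does not use DataFusion: `PlanNode`, `PlanVisitor`, `walk`, `OutputRowsVisitor`, and `sample_plan` are hypothetical stand-ins that only mirror the shape of the `pre_visit` method above, so the traversal logic can be read and run in isolation. In a real version the visitor would presumably implement ExecutionPlanVisitor and read the value from the node's metrics instead of a struct field. Stopping the descent when `pre_visit` returns `false` is also an assumption of this sketch, not a statement about how DataFusion treats that flag.

```rust
use std::convert::Infallible;

/// Hypothetical stand-in for a physical plan node (not a DataFusion type):
/// a name, an optional `output_rows` value, and child nodes.
struct PlanNode {
    name: &'static str,
    output_rows: Option<usize>,
    children: Vec<PlanNode>,
}

/// Hypothetical stand-in trait with the same shape as the `pre_visit`
/// method used by `ParquetVisitor`.
trait PlanVisitor {
    type Error;
    fn pre_visit(&mut self, plan: &PlanNode) -> Result<bool, Self::Error>;
}

/// Depth-first walk: visit a node, then each of its children.
/// Assumption of this sketch: a `false` return skips that node's children.
fn walk<V: PlanVisitor>(plan: &PlanNode, visitor: &mut V) -> Result<(), V::Error> {
    if visitor.pre_visit(plan)? {
        for child in &plan.children {
            walk(child, visitor)?;
        }
    }
    Ok(())
}

/// Accumulates `output_rows` over every node that reports a value.
#[derive(Debug, Default)]
struct OutputRowsVisitor {
    total_rows: usize,
    per_node: Vec<(String, usize)>,
}

impl PlanVisitor for OutputRowsVisitor {
    type Error = Infallible;

    fn pre_visit(&mut self, plan: &PlanNode) -> Result<bool, Self::Error> {
        // Nodes without a recorded value are skipped, as in the Parquet example.
        if let Some(rows) = plan.output_rows {
            self.total_rows += rows;
            self.per_node.push((plan.name.to_string(), rows));
        }
        Ok(true)
    }
}

/// Small made-up plan, used only to exercise the visitor.
fn sample_plan() -> PlanNode {
    PlanNode {
        name: "ProjectionExec",
        output_rows: Some(10),
        children: vec![PlanNode {
            name: "CoalesceBatchesExec",
            output_rows: None,
            children: vec![PlanNode {
                name: "FilterExec",
                output_rows: Some(10),
                children: vec![PlanNode {
                    name: "ParquetExec",
                    output_rows: Some(100),
                    children: vec![],
                }],
            }],
        }],
    }
}
```

To use it, create the visitor with `OutputRowsVisitor::default()`, pass it to `walk(&sample_plan(), &mut visitor)`, and then read `visitor.total_rows` or `visitor.per_node`. Keeping the running total in the visitor itself, rather than printing from `pre_visit`, lets the caller decide how to report the result.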