Description:
Integrate Linux perf_event hardware counters into the pipeline executor to provide CPU microarchitecture-level insights for each plan node in EXPLAIN PERF output.
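For illustration only, a minimal sketch of how a single hardware counter could be read around a unit of work using the Linux `perf_event_open` syscall. The helper names (`open_hw_counter`, `start_counter`, `stop_counter`, `instructions_per_cycle`) are hypothetical and not part of any existing executor API. How counters would be attached to individual plan nodes is left open here.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: open one hardware counter for the calling thread.
 * `config` is a PERF_COUNT_HW_* constant, e.g. PERF_COUNT_HW_INSTRUCTIONS,
 * PERF_COUNT_HW_CPU_CYCLES, PERF_COUNT_HW_CACHE_MISSES or
 * PERF_COUNT_HW_BRANCH_MISSES.
 * Returns a file descriptor, or -1 if the kernel refuses (for example due to
 * perf_event_paranoid settings or a VM without an exposed PMU). */
int open_hw_counter(uint64_t config)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.disabled = 1;        /* start stopped; enabled explicitly below */
    attr.exclude_kernel = 1;  /* count user-space work only */
    attr.exclude_hv = 1;
    /* pid = 0, cpu = -1: this thread on any CPU; no group leader, no flags. */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Reset the counter to zero and start counting. No-op for an invalid fd. */
void start_counter(int fd)
{
    if (fd < 0)
        return;
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
}

/* Stop counting and read the value. Returns 0 on success, -1 on failure.
 * With the default read_format (0), read() yields a single 64-bit count. */
int stop_counter(int fd, uint64_t *value)
{
    if (fd < 0)
        return -1;
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    return read(fd, value, sizeof(*value)) == (ssize_t)sizeof(*value) ? 0 : -1;
}

/* Derived metric: instructions per cycle; 0.0 when no cycles were counted. */
double instructions_per_cycle(uint64_t instructions, uint64_t cycles)
{
    return cycles == 0 ? 0.0 : (double)instructions / (double)cycles;
}
```

A caller would open one descriptor per event, call `start_counter` before the work and `stop_counter` after it, then `close` the descriptor. Because opening a counter can fail on restricted systems, any integration would presumably need to degrade gracefully when `open_hw_counter` returns -1.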
Motivation:
EXPLAIN PERF currently provides flamegraph-based CPU profiling (via sampling), which shows where time is spent but not why. Hardware performance counters reveal the root cause (whether a processor is bottlenecked by memory latency, branch mispredictions, or instruction throughput), enabling targeted optimization without external profiling tools.