server: enable collecting CPU profiles at a lower rate to limit the CPU overhead of the profiling #75801

Open
@knz

Description

We'd like to dump periodic CPU profiles on every node, or at least capture one when CPU usage spikes. That is, we'd like to reuse logic similar to what we already use to collect heap dumps and goroutine dumps (#75799).
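
To make the "capture on spikes" idea concrete, here is a minimal sketch of a trigger that fires when CPU usage crosses a threshold, but no more often than a minimum interval. It is not the existing heap dump or goroutine dump logic: the `spikeTrigger` type, the 80% threshold, and the one-minute interval are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// spikeTrigger is a hypothetical helper that decides when a CPU profile
// should be captured. The field values used below are illustrative only.
type spikeTrigger struct {
	thresholdPercent float64       // CPU usage at or above this counts as a spike
	minInterval      time.Duration // minimum spacing between two profiles
	lastProfile      time.Time     // zero value means no profile taken yet
}

// shouldProfile reports whether a profile should be taken for this sample,
// and records the time if so.
func (s *spikeTrigger) shouldProfile(now time.Time, cpuPercent float64) bool {
	if cpuPercent < s.thresholdPercent {
		return false
	}
	if !s.lastProfile.IsZero() && now.Sub(s.lastProfile) < s.minInterval {
		return false // a profile was taken too recently
	}
	s.lastProfile = now
	return true
}

// simulate feeds evenly spaced synthetic CPU samples through a trigger
// and returns the decision made for each sample.
func simulate(samples []float64, step time.Duration) []bool {
	trig := &spikeTrigger{thresholdPercent: 80, minInterval: time.Minute}
	start := time.Unix(0, 0)
	out := make([]bool, 0, len(samples))
	for i, cpu := range samples {
		now := start.Add(time.Duration(i) * step)
		out = append(out, trig.shouldProfile(now, cpu))
	}
	return out
}

func main() {
	// Samples every 30s: the spike at 60s is skipped because a profile
	// was taken 30s earlier.
	fmt.Println(simulate([]float64{10, 90, 95, 20, 92}, 30*time.Second))
	// prints "[false true false false true]"
}
```

The rate limit matters here because each profile adds overhead at exactly the moment the node is already under load.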

Unfortunately, the default pprof profile rate (100Hz) causes a noticeable (1-2%) performance dip.
Since we usually need profiles precisely when the CPU is overloaded, the additional cost of profiling is unwelcome.

So we'd like to explore a way to collect profiles at a lower sampling rate, to lower the overhead.

Sadly, pprof.StartCPUProfile(), which we currently use, hardcodes the rate at 100Hz.

We haven't yet found another way to do this short of forking pprof.
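
One possible workaround, offered as a sketch and not a vetted fix: call `runtime.SetCPUProfileRate` with the desired rate before `pprof.StartCPUProfile`. This assumes the runtime ignores the second rate request (the 100Hz one made inside `StartCPUProfile`) while a profile is already active, and prints a warning to stderr instead. That behavior is an implementation detail, not a documented guarantee, so it would need checking against the Go version in use. The helper names `profileAtRate` and `collectProfile` are hypothetical.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"io"
	"runtime"
	"runtime/pprof"
	"time"
)

// profileAtRate is a hypothetical helper, not an existing pprof API.
// It tries to collect a CPU profile for duration d at roughly hz samples
// per second.
func profileAtRate(w io.Writer, hz int, d time.Duration) error {
	if hz <= 0 {
		return errors.New("sampling rate must be positive")
	}
	// Turn the runtime profiler on at the desired rate first.
	runtime.SetCPUProfileRate(hz)
	// StartCPUProfile requests 100Hz internally. Assumption: because the
	// profiler is already on, the runtime keeps our rate and only logs a
	// warning on stderr.
	if err := pprof.StartCPUProfile(w); err != nil {
		// Most likely another pprof profile is already running; leave it alone.
		return err
	}
	time.Sleep(d)
	pprof.StopCPUProfile() // flushes the profile to w and turns profiling off
	return nil
}

// collectProfile is a convenience wrapper that returns the profile bytes.
func collectProfile(hz int, d time.Duration) ([]byte, error) {
	var buf bytes.Buffer
	if err := profileAtRate(&buf, hz, d); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := collectProfile(10, 200*time.Millisecond)
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Printf("collected %d bytes of profile data\n", len(data))
}
```

The trade-off is the stderr warning on every profile and a dependency on runtime internals, which is why a supported way to set the rate would still be preferable.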

Jira issue: CRDB-12842

Metadata

Assignees

No one assigned

    Labels

    A-cli-server: CLI commands that pertain to CockroachDB server processes
    C-enhancement: Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
    T-observability
