High CPU Spike in NATS Server when Leafnode Clients reach CPU Saturation [v2.12.2] #7859

@yoonseo-han


Observed behavior

We are running a NATS cluster in which multiple Leafnodes subscribe to all subjects, producing high fan-out traffic. When the client-side server (the host running the Leafnodes) reaches CPU saturation (100% utilization or more), CPU usage on the connected NATS Core server stays stable for a while and then suddenly spikes to ~99%.

Despite the CPU spike, the NATS server continues to function, with no immediate performance degradation or message loss. Profiling during the spike shows a large increase in time spent in runtime.memclrNoHeapPointers and in functions from klauspost/compress/s2.
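
For anyone trying to capture a comparable profile, the sketch below shows one way to pull a CPU profile from the server over HTTP. It assumes the server was started with its profiling port enabled (the prof_port configuration option) and that the standard Go pprof endpoint is reachable on that port. The address 127.0.0.1, port 65432, the 30-second window, and the output file name are illustrative values, not settings taken from this report.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// profileURL builds the standard net/http/pprof CPU profile URL.
// host and port are assumptions: they must match the server's profiling port.
func profileURL(host string, port, seconds int) string {
	return fmt.Sprintf("http://%s:%d/debug/pprof/profile?seconds=%d", host, port, seconds)
}

// fetchProfile downloads the profile at url and writes it to path.
// timeout must be longer than the sampling window requested in the URL.
func fetchProfile(url, path string, timeout time.Duration) error {
	client := &http.Client{Timeout: timeout}
	resp, err := client.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	// Hypothetical endpoint; adjust to the server under test.
	url := profileURL("127.0.0.1", 65432, 30)
	if err := fetchProfile(url, "nats-cpu.pprof", 45*time.Second); err != nil {
		fmt.Fprintln(os.Stderr, "profile fetch failed:", err)
		return
	}
	fmt.Println("wrote nats-cpu.pprof")
}
```

The resulting file can be opened with go tool pprof. Taking one profile before the spike and one during it makes the growth in the memory-clearing and S2 compression frames easier to compare.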

Expected behavior

We expected CPU saturation on the client side to produce backpressure, but not to drive the NATS Core server's CPU usage to nearly 100%.

Server and client version

NATS Server: v2.12.2

Host environment

NATS Core: 3-node cluster on AWS instances.
Client Host: Single m6i.large instance running multiple Leafnode instances.

Steps to reproduce

  1. Deploy a NATS Core cluster.
  2. Launch multiple Leafnode instances on a single client host, all subscribing to a high-throughput stream (simulating a full market data feed).
  3. Induce CPU saturation on the client host by increasing the number of Leafnodes or adding artificial CPU load.
  4. Observe the NATS Core server CPU utilization after the client host hits 100% CPU.

Labels

defect: Suspected defect such as a bug or regression
