Description
Observed behavior
We are running a NATS cluster in which multiple Leafnodes subscribe to all subjects, producing high fan-out traffic. When the client-side host running the Leafnodes reaches CPU saturation (100%+ utilization), CPU usage on the connected NATS Core server stays stable for a while and then suddenly spikes to ~99%.
Despite the CPU spike, the NATS server keeps functioning, with no immediate performance degradation or message loss. Profiling during the spike shows a large increase in runtime.memclrNoHeapPointers and in functions from the klauspost/compress/s2 package.
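For anyone who wants to capture comparable profiles, here is a minimal sketch of how a CPU profile could be pulled from a core server before and during the spike. It assumes the server was started with the `prof_port` option enabled, which exposes Go's standard pprof HTTP endpoints. The host, port, and file names are placeholders, not values from our setup.

```python
import urllib.request


def cpu_profile_url(host: str, prof_port: int, seconds: int = 30) -> str:
    """Build the standard Go pprof CPU profile URL for a server with prof_port set."""
    return f"http://{host}:{prof_port}/debug/pprof/profile?seconds={seconds}"


def fetch_cpu_profile(host: str, prof_port: int, out_path: str, seconds: int = 30) -> str:
    """Download a CPU profile and save it to out_path.

    The request blocks for roughly `seconds` while the server samples,
    so the timeout is set a little longer than the sampling window.
    """
    url = cpu_profile_url(host, prof_port, seconds)
    with urllib.request.urlopen(url, timeout=seconds + 15) as resp:
        data = resp.read()
    with open(out_path, "wb") as f:
        f.write(data)
    return out_path
```

For example, calling `fetch_cpu_profile("127.0.0.1", 65432, "before.pprof")` while CPU is stable and again as `after.pprof` during the spike gives two files that can be compared with `go tool pprof -top -diff_base=before.pprof after.pprof`.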
Expected behavior
We expected CPU saturation on the client side to produce backpressure, but not to drive the NATS Core server's CPU to nearly 100%.
Server and client version
NATS Server: v2.12.2
Host environment
NATS Core: 3-node cluster on AWS instances.
Client Host: Single m6i.large instance running multiple Leafnode instances.
Steps to reproduce
- Deploy a NATS Core cluster.
- Launch multiple Leafnode instances on a single client host, all subscribing to a high-throughput stream (simulating a full market data feed).
- Induce CPU saturation on the client host by increasing the number of Leafnodes or adding artificial CPU load.
- Observe the NATS Core server CPU utilization after the client host hits 100% CPU.
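The second step above can be sketched roughly as follows: a small script that generates one minimal leafnode config per instance so that several can be started on the same host. The hub URL, port numbers, and server names are hypothetical placeholders, and the config is deliberately bare (no credentials, TLS, or compression settings), so it is an illustration rather than our exact setup.

```python
from pathlib import Path

# Assumed values for illustration only; none of these come from the report.
HUB_URL = "nats://hub.example.internal:7422"  # hypothetical leafnode listener on the core cluster
BASE_CLIENT_PORT = 4300                       # hypothetical first local client port


def render_leaf_config(index: int, hub_url: str = HUB_URL,
                       base_port: int = BASE_CLIENT_PORT) -> str:
    """Return a minimal nats-server config for leafnode number `index`."""
    return (
        f"server_name: leaf-{index}\n"
        f"port: {base_port + index}\n"  # each instance needs its own client port
        "leafnodes {\n"
        "  remotes = [\n"
        f"    {{ url: \"{hub_url}\" }}\n"
        "  ]\n"
        "}\n"
    )


def write_leaf_configs(count: int, out_dir: str) -> list[Path]:
    """Write `count` config files into out_dir and return their paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(count):
        path = out / f"leaf-{i}.conf"
        path.write_text(render_leaf_config(i))
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Print one rendered config as a quick sanity check.
    print(render_leaf_config(0))
```

Calling `write_leaf_configs(8, "leaf-configs")` and then starting each instance with `nats-server -c leaf-configs/leaf-N.conf` gives eight leafnodes on one host. Raising the count is one way to push the client host toward CPU saturation.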