73 changes: 73 additions & 0 deletions website/content/en/docs/concepts/nodepools.md
@@ -35,6 +35,10 @@ kind: NodePool
metadata:
name: default
spec:
# Optional: Number of nodes to maintain for static capacity
# When set, the NodePool operates in static mode, maintaining a fixed node count
replicas: 5

# Template section that describes how to template out NodeClaim resources that Karpenter will provision
# Karpenter will consider this template to be the minimum requirements needed to provision a Node using this NodePool
# It will overlay this NodePool with Pods that need to schedule to further constrain the NodeClaims
@@ -145,10 +149,14 @@ spec:
limits:
cpu: "1000"
memory: 1000Gi
# For static NodePools, limits.nodes constrains the maximum node count during scaling/drift
# Note: Supported only for static NodePools
nodes: 10

# Priority given to the NodePool when the scheduler considers which NodePool
# to select. Higher weights indicate higher priority when comparing NodePools.
# Specifying no weight is equivalent to specifying a weight of 0.
# Note: weight cannot be set when replicas is specified
weight: 10
status:
conditions:
@@ -158,11 +166,28 @@ status:
lastTransitionTime: "2024-02-02T19:54:34Z"
reason: NodeClaimNotLaunched
message: "NodeClaim hasn't succeeded launch"
# Current node count for the NodePool
nodes: 5
resources:
cpu: "20"
memory: "8192Mi"
ephemeral-storage: "100Gi"
```
## spec.replicas

Optional field that enables static capacity mode. When specified, the NodePool maintains a fixed number of nodes regardless of pod demand.

**Static NodePool Constraints:**
- The mode cannot be changed once set (a NodePool cannot switch between static and dynamic modes)
- Consolidation settings are ignored when `replicas` is specified
- Only `limits.nodes` is allowed in the limits section
- The `weight` field cannot be set
- Nodes are not considered consolidation candidates
- Scale operations bypass disruption budgets but respect PodDisruptionBudgets
- Karpenter-driven actions (e.g., drift) respect disruption budgets and scheduling safety

**Scaling:** Use `kubectl scale nodepool <name> --replicas=<count>` to change the replica count, as shown below.
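
For example, a minimal sketch (assuming a static NodePool named `default`):

```bash
# Change the desired node count of the static NodePool "default" to 7
kubectl scale nodepool default --replicas=7
```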

## metadata.name
The name of the NodePool.

@@ -204,6 +229,13 @@ These well-known labels may be specified at the NodePool level, or in a workload

For example, an instance type may be specified using a nodeSelector in a pod spec. If the instance type requested is not included in the NodePool list and the NodePool has instance type requirements, Karpenter will not create a node or schedule the pod.

**Static NodePool**

The requirements of a static NodePool behave identically to those of a dynamic NodePool: they define the constraints for all NodeClaims launched under that NodePool.

The NodeClaim requirements are derived directly from the NodeClaimTemplate on the NodePool and are evaluated once per NodeClaim at creation, meaning the selection is based solely on what the template allows. As a result, even though all NodeClaims come from the same static NodePool, they may still end up as different instance types (shapes/flavors) depending on availability, since that decision happens during the cloud provider's Create() call.
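
As a rough sketch of this behavior (the instance types listed are placeholders, not recommendations), a static NodePool whose template allows several instance types may end up with a mix of them across its NodeClaims:

```yaml
# Abbreviated template for illustration; each NodeClaim is created from this
# template independently, so individual nodes may launch as any of the listed
# instance types, depending on what the cloud provider can fulfill at Create() time.
spec:
  replicas: 5
  template:
    spec:
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["c5.large", "m5.large", "r5.large"]
```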

### Well-Known Labels

#### Instance Types
@@ -229,6 +261,12 @@ Karpenter can be configured to create nodes in a particular zone. Note that the
[Learn more about Availability Zone
IDs.](https://docs.aws.amazon.com/ram/latest/userguide/working-with-az-ids.html)

**Static NodePool**

The topology requirement field is the source of truth for topology decisions. Users who want to spread nodes across zones can do so explicitly by:
- Specifying multiple zones in the `topology.kubernetes.io/zone` requirement, or
- Creating multiple static NodePools, each pinned to a specific AZ, as sketched below.
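
For the second approach, an abbreviated sketch (names, zones, and replica counts are placeholders; only the zone-related fields are shown):

```yaml
# Two static NodePools, each pinned to a single availability zone
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: static-us-west-2a
spec:
  replicas: 3
  template:
    spec:
      requirements:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-west-2a"]
---
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: static-us-west-2b
spec:
  replicas: 3
  template:
    spec:
      requirements:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-west-2b"]
```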

#### Architecture

- key: `kubernetes.io/arch`
@@ -373,12 +411,15 @@ The NodePool spec includes a limits section (`spec.limits`), which constrains th

If the `NodePool.spec.limits` section is unspecified, it means that there is no default limitation on resource allocation. In this case, the maximum resource consumption is governed by the quotas set by your cloud provider. If a limit has been exceeded, node provisioning is prevented until some nodes have been terminated.

**For Static NodePools:** Only `limits.nodes` is supported. This field constrains the maximum number of nodes during scaling operations or drift replacement. Note that `limits.nodes` is supported only for static NodePools.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
name: default
spec:
template:
spec:
requirements:
@@ -389,6 +430,8 @@ spec:
cpu: 1000
memory: 1000Gi
nvidia.com/gpu: 2
# For static NodePools, only the nodes limit is allowed
nodes: 20
```

{{% alert title="Note" color="primary" %}}
@@ -431,6 +474,9 @@ NodePools have the following status conditions:

If a NodePool is not ready, it will not be considered for scheduling.

## status.nodes
This field shows the current number of nodes managed by the NodePool.
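
For example, the field can be read directly (assuming a NodePool named `default`):

```bash
# Print the node count currently reported in the NodePool's status
kubectl get nodepool default -o jsonpath='{.status.nodes}{"\n"}'
```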

## status.resources
Objects under `status.resources` provide information about the status of resources such as `cpu`, `memory`, and `ephemeral-storage`.

@@ -462,6 +508,33 @@ spec:
```
In order for a pod to run on a node defined in this NodePool, it must tolerate `nvidia.com/gpu` in its pod spec.

### Static NodePool

A NodePool can be configured for static capacity by setting the `replicas` field. This maintains a fixed number of nodes regardless of pod demand:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
name: static-capacity
spec:
replicas: 10
template:
spec:
requirements:
- key: node.kubernetes.io/instance-type
operator: In
values: ["m5.large", "m5.xlarge"]
- key: topology.kubernetes.io/zone
operator: In
values: ["us-west-2a"]
limits:
nodes: 15 # Maximum nodes during scaling/drift
disruption:
budgets:
- nodes: 20% # Disruption budget for drift replacement
```
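
As a usage sketch against the NodePool above (the target count is arbitrary), the replica count can be raised with `kubectl scale`, while `limits.nodes` caps how far scaling and drift replacement can grow the pool:

```bash
# Raise the static NodePool from 10 to 12 nodes; limits.nodes (15) is the ceiling
kubectl scale nodepool static-capacity --replicas=12
```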

### Cilium Startup Taint

Per the Cilium [docs](https://docs.cilium.io/en/stable/installation/taints/#taint-effects), it's recommended to place a taint of `node.cilium.io/agent-not-ready=true:NoExecute` on nodes to allow Cilium to configure networking prior to other pods starting. This can be accomplished via the use of Karpenter `startupTaints`. These taints are placed on the node, but pods aren't required to tolerate these taints to be considered for provisioning.