Merge pull request #7548 from adrianmoisey/move_around_docs
Move all VPA docs into ./docs
k8s-ci-robot authored Dec 4, 2024
2 parents 4ec3336 + 1621f41 commit b5b760f
Showing 11 changed files with 569 additions and 538 deletions.
425 changes: 13 additions & 412 deletions vertical-pod-autoscaler/README.md

Large diffs are not rendered by default.

135 changes: 135 additions & 0 deletions vertical-pod-autoscaler/docs/components.md
@@ -0,0 +1,135 @@
# Components

## Contents

- [Components](#components)
  - [Introduction](#introduction)
  - [Recommender](#recommender)
    - [Running](#running-the-recommender)
    - [Implementation](#implementation-of-the-recommender)
  - [Updater](#updater)
    - [Current implementation](#current-implementation)
    - [Missing Parts](#missing-parts)
  - [Admission Controller](#admission-controller)
    - [Running](#running-the-admission-controller)
    - [Implementation](#implementation-of-the-admission-controller)

## Introduction

The VPA project consists of 3 components:

- [Recommender](#recommender) - monitors the current and past resource consumption and, based on it,
provides recommended values for the containers' CPU and memory requests.

- [Updater](#updater) - checks which of the managed pods have correct resources set and, if not,
kills them so that they can be recreated by their controllers with the updated requests.

- [Admission Controller](#admission-controller) - sets the correct resource requests on new pods (either just created
or recreated by their controller due to the Updater's activity).

More on the architecture can be found [here](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md).

## Recommender

The Recommender is the core binary of the Vertical Pod Autoscaler system.
It computes the recommended resource requests for pods based on
historical and current usage of the resources.
The current recommendations are put in the status of the VPA resource, where they
can be inspected.

### Running the recommender

- In order to have historical data pulled in by the recommender, install
Prometheus in your cluster and pass its address through a flag (see the sketch after this list).
- Create RBAC configuration from `../deploy/vpa-rbac.yaml`.
- Create a deployment with the recommender pod from
`../deploy/recommender-deployment.yaml`.
- The recommender will start running and pushing its recommendations to VPA
object statuses.
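
A minimal sketch of the recommender args for pulling history from Prometheus, using the `--storage` and `--prometheus-address` flags (the address is illustrative):

```yaml
containers:
  - name: recommender
    args:
      # Use Prometheus as the history provider (address is illustrative):
      - --storage=prometheus
      - --prometheus-address=http://prometheus.default.svc.cluster.local:9090
```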

### Implementation of the recommender

The recommender is based on a model of the cluster that it builds in its memory.
The model contains Kubernetes resources: *Pods*, *VerticalPodAutoscalers*, with
their configuration (e.g. labels) as well as other information, e.g. usage data for
each container.

After starting the binary, the recommender reads the history of running pods and
their usage from Prometheus into the model.
It then runs in a loop and at each step performs the following actions:

- update model with recent information on resources (using listers based on
watch),
- update model with fresh usage samples from Metrics API,
- compute new recommendation for each VPA,
- put any changed recommendations into the VPA resources.

## Updater

The Updater component of the Vertical Pod Autoscaler is described in the [Vertical Pod Autoscaler - design proposal](https://github.com/kubernetes/community/pull/338).

The Updater runs in the Kubernetes cluster and decides which pods should be restarted
based on the resource allocation recommendations calculated by the Recommender.
If a pod should be updated, the Updater will try to evict it.
It respects the Pod Disruption Budget by using the Eviction API to evict pods.
The Updater does not perform the actual resource update; it relies on the Vertical Pod Autoscaler admission plugin
to update pod resources when the pod is recreated after eviction.

### Current implementation

The Updater runs in a loop. In each iteration it performs the following:

- Fetching the Vertical Pod Autoscaler configuration using a lister implementation.
- Fetching live pod information with the current resource allocation.
- For each group of replicated pods, calculating whether a pod update is required and how many replicas can be evicted.
The Updater will always allow eviction of at least one pod in a replica set. The maximum ratio of evicted replicas is specified by a flag.
- Evicting pods whose recommended resources significantly differ from their actual resource allocation.
The threshold for evicting pods is specified by the recommended min/max values from the VPA resource.
The priority of evictions within a set of replicated pods is proportional to the sum of the percentage changes in resources
(e.g. a pod with a recommended 15% memory increase and 15% CPU decrease will be evicted
before a pod with a 20% memory increase and no change in CPU).

### Missing parts

- Recommendation API for fetching data from Vertical Pod Autoscaler Recommender.

## Admission Controller

This is a binary that registers itself as a Mutating Admission Webhook
and is therefore on the path of all pod creation.
For each pod creation, it will get a request from the apiserver and will
either decide there is no matching VPA configuration or find the corresponding
one and use its current recommendation to set resource requests in the pod.

### Running the admission-controller

1. You should make sure your API server supports Mutating Webhooks.
Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of
the values on the list and its `--runtime-config` flag should include
`admissionregistration.k8s.io/v1beta1=true`.
To change those flags, ssh to your API Server instance, edit
`/etc/kubernetes/manifests/kube-apiserver.manifest` and restart kubelet to pick
up the changes: `sudo systemctl restart kubelet.service`.
1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create
a secret in your cluster with the certs.
1. Create RBAC configuration for the admission controller pod by running
`kubectl create -f ../deploy/admission-controller-rbac.yaml`
1. Create the pod:
`kubectl create -f ../deploy/admission-controller-deployment.yaml`.
It will first register itself with the apiserver as a Mutating Admission Webhook
and then start changing resource requirements for pods on creation and update.
1. You can specify a path for it to register as a part of the installation process
by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`.
1. You can specify a minimum TLS version with `--min-tls-version`; accepted values are `tls1_2` (the default) and `tls1_3`.
1. If `--min-tls-version` is set to `tls1_2`, you can also specify a comma- or colon-separated list of ciphers for the server to use with `--tls-ciphers`.
1. You can specify a comma-separated list of webhook labels with `--webhook-labels`, for example `key1:value1,key2:value2`. A sketch combining these optional flags follows this list.
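
A sketch combining these optional flags as container args in the admission-controller deployment (all values are illustrative, not defaults):

```yaml
containers:
  - name: admission-controller
    args:
      # Register the webhook by URL instead of by service reference (illustrative host/port):
      - --register-by-url=true
      - --webhook-address=https://vpa-webhook.example.com
      - --webhook-port=8000
      # TLS settings; the cipher list only takes effect with tls1_2:
      - --min-tls-version=tls1_2
      - --tls-ciphers=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
      # Labels to set on the webhook object (key1:value1,key2:value2 format):
      - --webhook-labels=key1:value1,key2:value2
```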

### Implementation of the Admission Controller

All VPA configurations in the cluster are watched with a lister.
In the context of pod creation, there is an incoming HTTPS request from the
apiserver.
The logic to serve that request involves finding the appropriate VPA, retrieving
the current recommendation from it, and encoding the recommendation as a JSON patch to
the Pod resource.
110 changes: 110 additions & 0 deletions vertical-pod-autoscaler/docs/examples.md
@@ -0,0 +1,110 @@
# Examples

## Contents

- [Examples](#examples)
  - [Keeping limit proportional to request](#keeping-limit-proportional-to-request)
  - [Capping to Limit Range](#capping-to-limit-range)
  - [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range)
  - [Starting multiple recommenders](#starting-multiple-recommenders)
  - [Custom memory bump-up after OOMKill](#custom-memory-bump-up-after-oomkill)
  - [Using CPU management with static policy](#using-cpu-management-with-static-policy)
  - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource)
  - [Limiting which namespaces are used](#limiting-which-namespaces-are-used)
  - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy)

## Keeping limit proportional to request

The container template specifies a resource request of 500 millicores of CPU and 1 GB of RAM. The template also
specifies a resource limit of 2 GB of RAM. The VPA recommendation is 1000 millicores of CPU and 2 GB of RAM. When VPA
applies the recommendation, it also sets the memory limit to 4 GB, maintaining the 2:1 limit-to-request ratio from the template.
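
For concreteness, the container template from this example might look like the following sketch (container name and image are illustrative):

```yaml
containers:
  - name: app                # illustrative name
    image: example.com/app   # illustrative image
    resources:
      requests:
        cpu: 500m
        memory: 1G
      limits:
        memory: 2G
```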

## Capping to Limit Range

The container template specifies a resource request of 500 millicores of CPU and 1 GB of RAM. The template also
specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
The VPA recommendation is 1000 millicores of CPU and 2 GB of RAM. When VPA applies the recommendation, it
sets the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB
(to maintain the 2:1 limit-to-request ratio from the template).
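
A `LimitRange` expressing the 3 GB cap from this example might look like this sketch (the object name is illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range  # illustrative name
spec:
  limits:
    - type: Container
      max:
        memory: 3G  # maximum memory limit per container
```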

## Resource Policy Overriding Limit Range

The container template specifies a resource request of 500 millicores of CPU and 1 GB of RAM. The template also
specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
The VPA's container resource policy requires VPA to set the container's request to at least 750 millicores of CPU and
2 GB of RAM. The VPA recommendation is 1000 millicores of CPU and 2 GB of RAM. When applying the recommendation,
VPA sets the RAM request to 2 GB (following the resource policy) and the RAM limit to 4 GB (to maintain
the 2:1 limit-to-request ratio from the template).
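
The container resource policy from this example might be expressed as in this sketch (the container name is illustrative):

```yaml
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: app  # illustrative name
        minAllowed:
          cpu: 750m
          memory: 2G
```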

## Starting multiple recommenders

It is possible to start one or more extra recommenders in order to use different percentiles for different workload profiles.
For example, you could have 3 profiles: [frugal](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-low.yaml),
[standard](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment.yaml) and
[performance](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-high.yaml), which
use different target CPU percentiles (50th, 90th and 95th) to calculate their recommendations.

Please note the usage of the following arguments to override the default names and percentiles:

- `--recommender-name=performance`
- `--target-cpu-percentile=0.95`

You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec, as sketched below.
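
A minimal sketch of a VPA object selecting the `performance` recommender (the VPA name and target are illustrative):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa  # illustrative name
spec:
  recommenders:
    - name: performance  # must match the --recommender-name of a running recommender
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app  # illustrative target
```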

## Custom memory bump-up after OOMKill

After an OOMKill event is observed, VPA increases the memory recommendation based on the memory usage observed in the event, according to this formula: `recommendation = max(memory-usage-in-oomkill-event + oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`.
You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender:

- `oom-bump-up-ratio` specifies the memory bump-up ratio when an OOM occurs; the default is `1.2`, meaning memory is increased by 20% after an OOMKill event.
- `oom-min-bump-up-bytes` specifies the minimal increase of memory after observing an OOM; the default is `100 * 1024 * 1024` (= 100 MiB).

For example, with the defaults, a container that was using 1 GiB of memory when OOMKilled gets a recommendation of `max(1 GiB + 100 MiB, 1.2 * 1 GiB)` = 1.2 GiB.

Example usage in the recommender deployment:

```yaml
containers:
- name: recommender
args:
- --oom-bump-up-ratio=2.0
- --oom-min-bump-up-bytes=524288000
```

## Using CPU management with static policy

If you are using [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers,
you probably want the CPU recommendation to be an integer. A dedicated recommendation pre-processor can round up the CPU recommendation. Recommendation capping still applies after the round-up.
To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender.
The pre-processor only acts on containers with a specific configuration. This configuration consists of an annotation on your VPA object for each affected container.
The annotation format is the following:

```yaml
vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true
```
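
For illustration, a sketch of this annotation on a VPA object, assuming the targeted pod has a container named `app`:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa  # illustrative name
  annotations:
    # Round the CPU recommendation for container "app" up to a whole number of cores:
    vpa-post-processor.kubernetes.io/app_integerCPU: "true"
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app  # illustrative target
```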

## Controlling eviction behavior based on scaling direction and resource

To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.EvictionRequirements` in the VPA spec. An `EvictionRequirement` contains a resource and a `ChangeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container.

Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when both are scaled down:

```yaml
updatePolicy:
evictionRequirements:
- resources: ["cpu", "memory"]
changeRequirement: TargetHigherThanRequests
```

Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information.

## Limiting which namespaces are used

By default, the VPA will run against all namespaces. You can limit that behaviour by setting one of the following options:

1. `ignored-vpa-object-namespaces` - A comma-separated list of namespaces to ignore
1. `vpa-object-namespace` - A single namespace to monitor

These options are mutually exclusive; set at most one of them. A sketch of passing one of them follows.
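
A sketch of passing one of these options as a container arg (the namespace is illustrative; the same flag is assumed to be set on each VPA component):

```yaml
containers:
  - name: recommender
    args:
      # Only operate on VPA objects in this namespace (illustrative value):
      - --vpa-object-namespace=my-namespace
```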

## Setting the webhook failurePolicy

It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
Please use this option with caution, as it may break Pod creation if the VPA itself fails.
Consider using it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce the risk, as in the sketch below.
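
A sketch of the admission-controller args combining the flags named above:

```yaml
containers:
  - name: admission-controller
    args:
      # Fail pod admission if the webhook cannot be reached...
      - --webhook-failure-policy-fail=true
      # ...but ignore kube-system to reduce the blast radius:
      - --ignored-vpa-object-namespaces=kube-system
```
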
vertical-pod-autoscaler/docs/faq.md

@@ -2,7 +2,7 @@

## Contents

-- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-CPU-or-memory-settings)
+- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-cpu-or-memory-settings)
- [How can I apply VPA to my Custom Resource?](#how-can-i-apply-vpa-to-my-custom-resource)
- [How can I use Prometheus as a history provider for the VPA recommender?](#how-can-i-use-prometheus-as-a-history-provider-for-the-vpa-recommender)
- [I get recommendations for my single pod replicaSet, but they are not applied. Why?](#i-get-recommendations-for-my-single-pod-replicaset-but-they-are-not-applied)
@@ -135,7 +135,7 @@ spec:
- --v=4
- --storage=prometheus
- --prometheus-address=http://prometheus.default.svc.cluster.local:9090
-```
+```

In this example, Prometheus is running in the default namespace.

@@ -148,9 +148,9 @@ Here you should see the flags that you set for the VPA recommender and you should

This means that the VPA recommender is now using Prometheus as the history provider.

-### I get recommendations for my single pod replicaSet but they are not applied
+### I get recommendations for my single pod replicaset but they are not applied

-By default, the [`--min-replicas`](pkg/updater/main.go#L56) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](deploy/updater-deployment.yaml) file:
+By default, the [`--min-replicas`](https://github.com/kubernetes/autoscaler/tree/master/pkg/updater/main.go#L44) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](https://github.com/kubernetes/autoscaler/tree/master/deploy/updater-deployment.yaml) file:

```yaml
spec:
@@ -179,7 +179,7 @@ election with the `--leader-elect=true` parameter.
The following startup parameters are supported for VPA recommender:

Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
`recommendation-margin-fraction` | Float64 | Fraction of usage added as the safety margin to the recommended request | 0.15
`pod-recommendation-min-cpu-millicores` | Float64 | Minimum CPU recommendation for a pod | 25
`pod-recommendation-min-memory-mb` | Float64 | Minimum memory recommendation for a pod | 250
@@ -230,7 +230,7 @@ Name | Type | Description | Default
The following startup parameters are supported for VPA updater:

Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
`pod-update-threshold` | Float64 | Ignore updates that have priority lower than the value of this flag | 0.1
`in-recommendation-bounds-eviction-lifetime-threshold` | Duration | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range | time.Hour*12
`evict-after-oom-threshold` | Duration | Evict pod that has OOMed in less than evict-after-oom-threshold since start. | 10*time.Minute
18 changes: 18 additions & 0 deletions vertical-pod-autoscaler/docs/features.md
@@ -0,0 +1,18 @@
# Features

## Contents

- [Limits control](#limits-control)

## Limits control

When setting limits, VPA conforms to
[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103).
It maintains the limit-to-request ratio specified for all containers.

VPA will try to cap recommendations between the min and max of
[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If a limit range conflicts
with the VPA resource policy, VPA follows the VPA policy (and sets values outside the limit
range).

To disable getting VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`, as sketched below.
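
A sketch of opting a single container out of recommendations (the container name is illustrative):

```yaml
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: istio-proxy  # illustrative name
        mode: "Off"
```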
