Merge pull request #7548 from adrianmoisey/move_around_docs
Move all VPA docs into ./docs
k8s-ci-robot authored Dec 4, 2024
2 parents 4ec3336 + 1621f41 commit b5b760f
Showing 11 changed files with 569 additions and 538 deletions.
425 changes: 13 additions & 412 deletions vertical-pod-autoscaler/README.md

Large diffs are not rendered by default.

135 changes: 135 additions & 0 deletions vertical-pod-autoscaler/docs/components.md
@@ -0,0 +1,135 @@
# Components

## Contents

- [Components](#components)
  - [Introduction](#introduction)
  - [Recommender](#recommender)
    - [Running](#running-the-recommender)
    - [Implementation](#implementation-of-the-recommender)
  - [Updater](#updater)
    - [Current implementation](#current-implementation)
    - [Missing Parts](#missing-parts)
  - [Admission Controller](#admission-controller)
    - [Running](#running-the-admission-controller)
    - [Implementation](#implementation-of-the-admission-controller)

## Introduction

The VPA project consists of 3 components:

- [Recommender](#recommender) - monitors the current and past resource consumption and, based on it,
provides recommended values for the containers' CPU and memory requests.

- [Updater](#updater) - checks which of the managed pods have correct resources set and, if not,
kills them so that they can be recreated by their controllers with the updated requests.

- [Admission Controller](#admission-controller) - sets the correct resource requests on new pods (either just created
or recreated by their controller due to the Updater's activity).

More on the architecture can be found [here](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md).

## Recommender

The Recommender is the core binary of the Vertical Pod Autoscaler system.
It computes the recommended resource requests for pods based on
historical and current usage of the resources.
The current recommendations are put in the status of the VPA resource, where they
can be inspected.

### Running the recommender

- In order to have historical data pulled in by the recommender, install
Prometheus in your cluster and pass its address through a flag (see the sketch after this list).
- Create RBAC configuration from `../deploy/vpa-rbac.yaml`.
- Create a deployment with the recommender pod from
`../deploy/recommender-deployment.yaml`.
- The recommender will start running and pushing its recommendations to VPA
object statuses.
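
A minimal sketch of the recommender args for pulling history from Prometheus, using the `--storage` and `--prometheus-address` flags (the address is illustrative):

```yaml
containers:
  - name: recommender
    args:
      # Use Prometheus as the history provider (address is illustrative):
      - --storage=prometheus
      - --prometheus-address=http://prometheus.default.svc.cluster.local:9090
```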

### Implementation of the recommender

The recommender is based on a model of the cluster that it builds in its memory.
The model contains Kubernetes resources: *Pods*, *VerticalPodAutoscalers*, with
their configuration (e.g. labels) as well as other information, e.g. usage data for
each container.

After starting the binary, the recommender reads the history of running pods and
their usage from Prometheus into the model.
It then runs in a loop and at each step performs the following actions:

- update model with recent information on resources (using listers based on
watch),
- update model with fresh usage samples from Metrics API,
- compute new recommendation for each VPA,
- put any changed recommendations into the VPA resources.

## Updater

The Updater component of the Vertical Pod Autoscaler is described in the [Vertical Pod Autoscaler - design proposal](https://github.com/kubernetes/community/pull/338).

The Updater runs in the Kubernetes cluster and decides which pods should be restarted
based on the resource allocation recommendations calculated by the Recommender.
If a pod should be updated, the Updater will try to evict it.
It respects the Pod Disruption Budget by using the Eviction API to evict pods.
The Updater does not perform the actual resource update; it relies on the Vertical Pod Autoscaler admission plugin
to update pod resources when the pod is recreated after eviction.

### Current implementation

The Updater runs in a loop. In each iteration it performs the following:

- Fetching the Vertical Pod Autoscaler configuration using a lister implementation.
- Fetching live pod information with the current resource allocation.
- For each group of replicated pods, calculating whether a pod update is required and how many replicas can be evicted.
The Updater will always allow eviction of at least one pod in a replica set. The maximum ratio of evicted replicas is specified by a flag.
- Evicting pods whose recommended resources significantly differ from their actual resource allocation.
The threshold for evicting pods is specified by the recommended min/max values from the VPA resource.
The priority of evictions within a set of replicated pods is proportional to the sum of the percentage changes in resources
(e.g. a pod with a recommended 15% memory increase and 15% CPU decrease will be evicted
before a pod with a 20% memory increase and no change in CPU).

### Missing parts

- Recommendation API for fetching data from Vertical Pod Autoscaler Recommender.

## Admission Controller

This is a binary that registers itself as a Mutating Admission Webhook
and is therefore on the path of all pod creation.
For each pod creation, it will get a request from the apiserver and will
either decide there is no matching VPA configuration or find the corresponding
one and use its current recommendation to set resource requests in the pod.

### Running the admission-controller

1. You should make sure your API server supports Mutating Webhooks.
Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of
the values on the list and its `--runtime-config` flag should include
`admissionregistration.k8s.io/v1beta1=true`.
To change those flags, ssh to your API Server instance, edit
`/etc/kubernetes/manifests/kube-apiserver.manifest` and restart kubelet to pick
up the changes: `sudo systemctl restart kubelet.service`.
1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create
a secret in your cluster with the certs.
1. Create RBAC configuration for the admission controller pod by running
`kubectl create -f ../deploy/admission-controller-rbac.yaml`
1. Create the pod:
`kubectl create -f ../deploy/admission-controller-deployment.yaml`.
It will first register itself with the apiserver as a Mutating Admission Webhook
and then start changing resource requirements for pods on creation and update.
1. You can specify a path for it to register as a part of the installation process
by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`.
1. You can specify a minimum TLS version with `--min-tls-version`; accepted values are `tls1_2` (the default) and `tls1_3`.
1. If `--min-tls-version` is set to `tls1_2`, you can also specify a comma- or colon-separated list of ciphers for the server to use with `--tls-ciphers`.
1. You can specify a comma-separated list of webhook labels with `--webhook-labels`, for example `key1:value1,key2:value2`. A sketch combining these optional flags follows this list.
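
A sketch combining these optional flags as container args in the admission-controller deployment (all values are illustrative, not defaults):

```yaml
containers:
  - name: admission-controller
    args:
      # Register the webhook by URL instead of by service reference (illustrative host/port):
      - --register-by-url=true
      - --webhook-address=https://vpa-webhook.example.com
      - --webhook-port=8000
      # TLS settings; the cipher list only takes effect with tls1_2:
      - --min-tls-version=tls1_2
      - --tls-ciphers=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
      # Labels to set on the webhook object (key1:value1,key2:value2 format):
      - --webhook-labels=key1:value1,key2:value2
```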

### Implementation of the Admission Controller

All VPA configurations in the cluster are watched with a lister.
In the context of pod creation, there is an incoming HTTPS request from the
apiserver.
The logic to serve that request involves finding the appropriate VPA, retrieving
the current recommendation from it, and encoding the recommendation as a JSON patch to
the Pod resource.
110 changes: 110 additions & 0 deletions vertical-pod-autoscaler/docs/examples.md
@@ -0,0 +1,110 @@
# Examples

## Contents

- [Examples](#examples)
  - [Keeping limit proportional to request](#keeping-limit-proportional-to-request)
  - [Capping to Limit Range](#capping-to-limit-range)
  - [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range)
  - [Starting multiple recommenders](#starting-multiple-recommenders)
  - [Custom memory bump-up after OOMKill](#custom-memory-bump-up-after-oomkill)
  - [Using CPU management with static policy](#using-cpu-management-with-static-policy)
  - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource)
  - [Limiting which namespaces are used](#limiting-which-namespaces-are-used)
  - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy)

## Keeping limit proportional to request

The container template specifies a resource request of 500 millicores of CPU and 1 GB of RAM. The template also
specifies a resource limit of 2 GB of RAM. The VPA recommendation is 1000 millicores of CPU and 2 GB of RAM. When VPA
applies the recommendation, it also sets the memory limit to 4 GB, maintaining the 2:1 limit-to-request ratio from the template.
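
For concreteness, the container template from this example might look like the following sketch (container name and image are illustrative):

```yaml
containers:
  - name: app                # illustrative name
    image: example.com/app   # illustrative image
    resources:
      requests:
        cpu: 500m
        memory: 1G
      limits:
        memory: 2G
```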

## Capping to Limit Range

The container template specifies a resource request of 500 millicores of CPU and 1 GB of RAM. The template also
specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
The VPA recommendation is 1000 millicores of CPU and 2 GB of RAM. When VPA applies the recommendation, it
sets the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB
(to maintain the 2:1 limit-to-request ratio from the template).
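
A `LimitRange` expressing the 3 GB cap from this example might look like this sketch (the object name is illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range  # illustrative name
spec:
  limits:
    - type: Container
      max:
        memory: 3G  # maximum memory limit per container
```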

## Resource Policy Overriding Limit Range

The container template specifies a resource request of 500 millicores of CPU and 1 GB of RAM. The template also
specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
The VPA's container resource policy requires VPA to set the container's request to at least 750 millicores of CPU and
2 GB of RAM. The VPA recommendation is 1000 millicores of CPU and 2 GB of RAM. When applying the recommendation,
VPA sets the RAM request to 2 GB (following the resource policy) and the RAM limit to 4 GB (to maintain
the 2:1 limit-to-request ratio from the template).
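
The container resource policy from this example might be expressed as in this sketch (the container name is illustrative):

```yaml
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: app  # illustrative name
        minAllowed:
          cpu: 750m
          memory: 2G
```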

## Starting multiple recommenders

It is possible to start one or more extra recommenders in order to use different percentiles for different workload profiles.
For example, you could have 3 profiles: [frugal](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-low.yaml),
[standard](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment.yaml) and
[performance](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-high.yaml), which
use different target CPU percentiles (50th, 90th and 95th) to calculate their recommendations.

Please note the usage of the following arguments to override the default names and percentiles:

- `--recommender-name=performance`
- `--target-cpu-percentile=0.95`

You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec, as sketched below.
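
A minimal sketch of a VPA object selecting the `performance` recommender (the VPA name and target are illustrative):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa  # illustrative name
spec:
  recommenders:
    - name: performance  # must match the --recommender-name of a running recommender
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app  # illustrative target
```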

## Custom memory bump-up after OOMKill

After an OOMKill event is observed, VPA increases the memory recommendation based on the memory usage observed in the event, according to this formula: `recommendation = max(memory-usage-in-oomkill-event + oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`.
You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender:

- `oom-bump-up-ratio` specifies the memory bump-up ratio when an OOM occurs; the default is `1.2`, meaning memory is increased by 20% after an OOMKill event.
- `oom-min-bump-up-bytes` specifies the minimal increase of memory after observing an OOM; the default is `100 * 1024 * 1024` (= 100 MiB).

For example, with the defaults, a container that was using 1 GiB of memory when OOMKilled gets a recommendation of `max(1 GiB + 100 MiB, 1.2 * 1 GiB)` = 1.2 GiB.

Example usage in the recommender deployment:

```yaml
containers:
- name: recommender
args:
- --oom-bump-up-ratio=2.0
- --oom-min-bump-up-bytes=524288000
```

## Using CPU management with static policy

If you are using [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers,
you probably want the CPU recommendation to be an integer. A dedicated recommendation pre-processor can round up the CPU recommendation. Recommendation capping still applies after the round-up.
To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender.
The pre-processor only acts on containers with a specific configuration. This configuration consists of an annotation on your VPA object for each affected container.
The annotation format is the following:

```yaml
vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true
```
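
For illustration, a sketch of this annotation on a VPA object, assuming the targeted pod has a container named `app`:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa  # illustrative name
  annotations:
    # Round the CPU recommendation for container "app" up to a whole number of cores:
    vpa-post-processor.kubernetes.io/app_integerCPU: "true"
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app  # illustrative target
```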

## Controlling eviction behavior based on scaling direction and resource

To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.EvictionRequirements` in the VPA spec. An `EvictionRequirement` contains a resource and a `ChangeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container.

Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when both are scaled down:

```yaml
updatePolicy:
evictionRequirements:
- resources: ["cpu", "memory"]
changeRequirement: TargetHigherThanRequests
```

Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information.

## Limiting which namespaces are used

By default, the VPA will run against all namespaces. You can limit that behaviour by setting one of the following options:

1. `ignored-vpa-object-namespaces` - A comma-separated list of namespaces to ignore
1. `vpa-object-namespace` - A single namespace to monitor

These options are mutually exclusive; set at most one of them. A sketch of passing one of them follows.
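
A sketch of passing one of these options as a container arg (the namespace is illustrative; the same flag is assumed to be set on each VPA component):

```yaml
containers:
  - name: recommender
    args:
      # Only operate on VPA objects in this namespace (illustrative value):
      - --vpa-object-namespace=my-namespace
```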

## Setting the webhook failurePolicy

It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
Please use this option with caution, as it may break Pod creation if the VPA itself fails.
Consider using it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce the risk, as in the sketch below.
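
A sketch of the admission-controller args combining the flags named above:

```yaml
containers:
  - name: admission-controller
    args:
      # Fail pod admission if the webhook cannot be reached...
      - --webhook-failure-policy-fail=true
      # ...but ignore kube-system to reduce the blast radius:
      - --ignored-vpa-object-namespaces=kube-system
```
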
vertical-pod-autoscaler/docs/faq.md

@@ -2,7 +2,7 @@

## Contents

-- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-CPU-or-memory-settings)
+- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-cpu-or-memory-settings)
- [How can I apply VPA to my Custom Resource?](#how-can-i-apply-vpa-to-my-custom-resource)
- [How can I use Prometheus as a history provider for the VPA recommender?](#how-can-i-use-prometheus-as-a-history-provider-for-the-vpa-recommender)
- [I get recommendations for my single pod replicaSet, but they are not applied. Why?](#i-get-recommendations-for-my-single-pod-replicaset-but-they-are-not-applied)
@@ -135,7 +135,7 @@ spec:
- --v=4
- --storage=prometheus
- --prometheus-address=http://prometheus.default.svc.cluster.local:9090
-```
+```

In this example, Prometheus is running in the default namespace.

@@ -148,9 +148,9 @@ Here you should see the flags that you set for the VPA recommender and you should

This means that the VPA recommender is now using Prometheus as the history provider.

-### I get recommendations for my single pod replicaSet but they are not applied
+### I get recommendations for my single pod replicaset but they are not applied

-By default, the [`--min-replicas`](pkg/updater/main.go#L56) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](deploy/updater-deployment.yaml) file:
+By default, the [`--min-replicas`](https://github.com/kubernetes/autoscaler/tree/master/pkg/updater/main.go#L44) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](https://github.com/kubernetes/autoscaler/tree/master/deploy/updater-deployment.yaml) file:

```yaml
spec:
@@ -179,7 +179,7 @@ election with the `--leader-elect=true` parameter.
The following startup parameters are supported for VPA recommender:

Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
`recommendation-margin-fraction` | Float64 | Fraction of usage added as the safety margin to the recommended request | 0.15
`pod-recommendation-min-cpu-millicores` | Float64 | Minimum CPU recommendation for a pod | 25
`pod-recommendation-min-memory-mb` | Float64 | Minimum memory recommendation for a pod | 250
@@ -230,7 +230,7 @@ Name | Type | Description | Default
The following startup parameters are supported for VPA updater:

Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
`pod-update-threshold` | Float64 | Ignore updates that have priority lower than the value of this flag | 0.1
`in-recommendation-bounds-eviction-lifetime-threshold` | Duration | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range | time.Hour*12
`evict-after-oom-threshold` | Duration | Evict pod that has OOMed in less than evict-after-oom-threshold since start. | 10*time.Minute
18 changes: 18 additions & 0 deletions vertical-pod-autoscaler/docs/features.md
@@ -0,0 +1,18 @@
# Features

## Contents

- [Limits control](#limits-control)

## Limits control

When setting limits, VPA conforms to
[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103).
It maintains the limit-to-request ratio specified for all containers.

VPA will try to cap recommendations between the min and max of
[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If a limit range conflicts
with the VPA resource policy, VPA follows the VPA policy (and sets values outside the limit
range).

To disable getting VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`, as sketched below.
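
A sketch of opting a single container out of recommendations (the container name is illustrative):

```yaml
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: istio-proxy  # illustrative name
        mode: "Off"
```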
