From 1621f41076b7046961a2032b4486f60630fcbafc Mon Sep 17 00:00:00 2001
From: Adrian Moisey
Date: Sun, 1 Dec 2024 19:35:16 +0200
Subject: [PATCH] Move all VPA docs into ./docs

This is to try to make the docs decoupled from the code, hoping that
this will allow for easier restructuring of the docs, and also lead to
improved docs over time.
---
 vertical-pod-autoscaler/README.md             | 425 +-----------------
 vertical-pod-autoscaler/docs/components.md    | 135 ++++++
 vertical-pod-autoscaler/docs/examples.md      | 110 +++++
 .../{FAQ.md => docs/faq.md}                   |  12 +-
 vertical-pod-autoscaler/docs/features.md      |  18 +
 vertical-pod-autoscaler/docs/installation.md  | 161 +++++++
 .../docs/known-limitations.md                 |  29 ++
 vertical-pod-autoscaler/docs/quickstart.md    |  97 ++++
 .../pkg/admission-controller/README.md        |  46 --
 .../pkg/recommender/README.md                 |  40 --
 vertical-pod-autoscaler/pkg/updater/README.md |  34 --
 11 files changed, 569 insertions(+), 538 deletions(-)
 create mode 100644 vertical-pod-autoscaler/docs/components.md
 create mode 100644 vertical-pod-autoscaler/docs/examples.md
 rename vertical-pod-autoscaler/{FAQ.md => docs/faq.md} (97%)
 create mode 100644 vertical-pod-autoscaler/docs/features.md
 create mode 100644 vertical-pod-autoscaler/docs/installation.md
 create mode 100644 vertical-pod-autoscaler/docs/known-limitations.md
 create mode 100644 vertical-pod-autoscaler/docs/quickstart.md
 delete mode 100644 vertical-pod-autoscaler/pkg/admission-controller/README.md
 delete mode 100644 vertical-pod-autoscaler/pkg/recommender/README.md
 delete mode 100644 vertical-pod-autoscaler/pkg/updater/README.md

diff --git a/vertical-pod-autoscaler/README.md b/vertical-pod-autoscaler/README.md
index 6ae688a51474..d64097275a51 100644
--- a/vertical-pod-autoscaler/README.md
+++ b/vertical-pod-autoscaler/README.md
@@ -1,34 +1,15 @@
 # Vertical Pod Autoscaler
 
 ## Contents
+
 - [Contents](#contents)
 - [Intro](#intro)
-- [Installation](#installation)
-  - [Compatibility](#compatibility)
-  - [Notice on deprecation of v1beta2 version (>=0.13.0)](#notice-on-deprecation-of-v1beta2-version-0130)
-  - [Notice on removal of v1beta1 version (>=0.5.0)](#notice-on-removal-of-v1beta1-version-050)
-  - [Prerequisites](#prerequisites)
-  - [Install command](#install-command)
-  - [Quick start](#quick-start)
-    - [Test your installation](#test-your-installation)
-    - [Example VPA configuration](#example-vpa-configuration)
-    - [Troubleshooting](#troubleshooting)
-    - [Components of VPA](#components-of-vpa)
-    - [Tear down](#tear-down)
-- [Limits control](#limits-control)
-- [Examples](#examples)
-  - [Keeping limit proportional to request](#keeping-limit-proportional-to-request)
-  - [Capping to Limit Range](#capping-to-limit-range)
-  - [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range)
-  - [Starting multiple recommenders](#starting-multiple-recommenders)
-  - [Using CPU management with static policy](#using-cpu-management-with-static-policy)
-  - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource)
-  - [Limiting which namespaces are used](#limiting-which-namespaces-are-used)
-  - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy)
-- [Known limitations](#known-limitations)
+- [Getting started](#getting-started)
+- [Components and Architecture](#components-and-architecture)
+- [Features and Known limitations](#features-and-known-limitations)
 - [Related links](#related-links)
 
-# Intro
+## Intro
 
 Vertical Pod Autoscaler (VPA) frees users from
the necessity of setting up-to-date resource requests for the containers in their pods. When @@ -50,402 +31,22 @@ resource recommendations are applied. To enable vertical pod autoscaling on your cluster please follow the installation procedure described below. -# Installation - -The current default version is Vertical Pod Autoscaler 1.2.1 - -### Compatibility - -| VPA version | Kubernetes version | -|-----------------|--------------------| -| 1.2.1 | 1.27+ | -| 1.2.0 | 1.27+ | -| 1.1.2 | 1.25+ | -| 1.1.1 | 1.25+ | -| 1.0 | 1.25+ | -| 0.14 | 1.25+ | -| 0.13 | 1.25+ | -| 0.12 | 1.25+ | -| 0.11 | 1.22 - 1.24 | -| 0.10 | 1.22+ | -| 0.9 | 1.16+ | -| 0.8 | 1.13+ | -| 0.4 to 0.7 | 1.11+ | -| 0.3.X and lower | 1.7+ | - -### Notice on CRD update (>=1.0.0) - -**NOTE:** In version 1.0.0, we have updated the CRD definition and added RBAC for the -status resource. If you are upgrading from version (<=0.14.0), you must update the CRD -definition and RBAC. -```shell -kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml -kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-rbac.yaml -``` -Another method is to re-execute the ./hack/vpa-process-yamls.sh script. -```shell -git clone https://github.com/kubernetes/autoscaler.git -cd autoscaler/vertical-pod-autoscaler -git checkout origin/vpa-release-1.0 -REGISTRY=registry.k8s.io/autoscaling TAG=1.0.0 ./hack/vpa-process-yamls.sh apply -``` - -If you need to roll back to version (<=0.14.0), please check out the release for your -rollback version and execute ./hack/vpa-process-yamls.sh. For example, to rollback to 0.14.0: -```shell -git checkout origin/vpa-release-0.14 -REGISTRY=registry.k8s.io/autoscaling TAG=0.14.0 ./hack/vpa-process-yamls.sh apply -kubectl delete clusterrole system:vpa-status-actor -kubectl delete clusterrolebinding system:vpa-status-actor -``` - -### Notice on deprecation of v1beta2 version (>=0.13.0) -**NOTE:** In 0.13.0 we deprecate `autoscaling.k8s.io/v1beta2` API. We plan to -remove this API version. While for now you can continue to use `v1beta2` API we -recommend using `autoscaling.k8s.io/v1` instead. `v1` and `v1beta2` APIs are -almost identical (`v1` API has some fields which are not present in `v1beta2`) -so simply changing which API version you're calling should be enough in almost -all cases. - -### Notice on removal of v1beta1 version (>=0.5.0) - -**NOTE:** In 0.5.0 we disabled the old version of the API - `autoscaling.k8s.io/v1beta1`. -The VPA objects in this version will no longer receive recommendations and -existing recommendations will be removed. The objects will remain present though -and a ConfigUnsupported condition will be set on them. - -This doc is for installing latest VPA. For instructions on migration from older versions see [Migration Doc](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/MIGRATE.md). - -### Prerequisites - -- `kubectl` should be connected to the cluster you want to install VPA. -- The metrics server must be deployed in your cluster. Read more about [Metrics Server](https://github.com/kubernetes-sigs/metrics-server). -- If you are using a GKE Kubernetes cluster, you will need to grant your current Google - identity `cluster-admin` role. Otherwise, you won't be authorized to grant extra - privileges to the VPA system components. 
- - ```console - $ gcloud info | grep Account # get current google identity - Account: [myname@example.org] - - $ kubectl create clusterrolebinding myname-cluster-admin-binding --clusterrole=cluster-admin --user=myname@example.org - Clusterrolebinding "myname-cluster-admin-binding" created - ``` - -- If you already have another version of VPA installed in your cluster, you have to tear down - the existing installation first with: - - ```console - ./hack/vpa-down.sh - ``` - -### Install command - -To install VPA, please download the source code of VPA (for example with `git clone https://github.com/kubernetes/autoscaler.git`) -and run the following command inside the `vertical-pod-autoscaler` directory: - -```console -./hack/vpa-up.sh -``` - -Note: the script currently reads environment variables: `$REGISTRY` and `$TAG`. -Make sure you leave them unset unless you want to use a non-default version of VPA. - -Note: If you are seeing following error during this step: -``` -unknown option -addext -``` -please upgrade openssl to version 1.1.1 or higher (needs to support -addext option) or use ./hack/vpa-up.sh on the [0.8 release branch](https://github.com/kubernetes/autoscaler/tree/vpa-release-0.8). - -The script issues multiple `kubectl` commands to the -cluster that insert the configuration and start all needed pods (see -[architecture](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md#architecture-overview)) -in the `kube-system` namespace. It also generates -and uploads a secret (a CA cert) used by VPA Admission Controller when communicating -with the API server. - -To print YAML contents with all resources that would be understood by -`kubectl diff|apply|...` commands, you can use - -```console -./hack/vpa-process-yamls.sh print -``` - -The output of that command won't include secret information generated by -[pkg/admission-controller/gencerts.sh](pkg/admission-controller/gencerts.sh) script. - -### Quick start - -After [installation](#installation) the system is ready to recommend and set -resource requests for your pods. -In order to use it, you need to insert a *Vertical Pod Autoscaler* resource for -each controller that you want to have automatically computed resource requirements. -This will be most commonly a **Deployment**. -There are four modes in which *VPAs* operate: - -- `"Auto"`: VPA assigns resource requests on pod creation as well as updates - them on existing pods using the preferred update mechanism. Currently, this is - equivalent to `"Recreate"` (see below). Once restart free ("in-place") update - of pod requests is available, it may be used as the preferred update mechanism by - the `"Auto"` mode. -- `"Recreate"`: VPA assigns resource requests on pod creation as well as updates - them on existing pods by evicting them when the requested resources differ significantly - from the new recommendation (respecting the Pod Disruption Budget, if defined). - This mode should be used rarely, only if you need to ensure that the pods are restarted - whenever the resource request changes. Otherwise, prefer the `"Auto"` mode which may take - advantage of restart-free updates once they are available. -- `"Initial"`: VPA only assigns resource requests on pod creation and never changes them - later. -- `"Off"`: VPA does not automatically change the resource requirements of the pods. - The recommendations are calculated and can be inspected in the VPA object. 
- -### Test your installation - -A simple way to check if Vertical Pod Autoscaler is fully operational in your -cluster is to create a sample deployment and a corresponding VPA config: - -```console -kubectl create -f examples/hamster.yaml -``` - -The above command creates a deployment with two pods, each running a single container -that requests 100 millicores and tries to utilize slightly above 500 millicores. -The command also creates a VPA config pointing at the deployment. -VPA will observe the behaviour of the pods, and after about 5 minutes, they should get -updated with a higher CPU request -(note that VPA does not modify the template in the deployment, but the actual requests -of the pods are updated). To see VPA config and current recommended resource requests run: - -```console -kubectl describe vpa -``` - -*Note: if your cluster has little free capacity these pods may be unable to schedule. -You may need to add more nodes or adjust examples/hamster.yaml to use less CPU.* - -### Example VPA configuration - -```yaml -apiVersion: autoscaling.k8s.io/v1 -kind: VerticalPodAutoscaler -metadata: - name: my-app-vpa -spec: - targetRef: - apiVersion: "apps/v1" - kind: Deployment - name: my-app - updatePolicy: - updateMode: "Auto" -``` - -### Troubleshooting - -To diagnose problems with a VPA installation, perform the following steps: - -- Check if all system components are running: - -```console -kubectl --namespace=kube-system get pods|grep vpa -``` - -The above command should list 3 pods (recommender, updater and admission-controller) -all in state Running. - -- Check if the system components log any errors. - For each of the pods returned by the previous command do: - -```console -kubectl --namespace=kube-system logs [pod name] | grep -e '^E[0-9]\{4\}' -``` - -- Check that the VPA Custom Resource Definition was created: - -```console -kubectl get customresourcedefinition | grep verticalpodautoscalers -``` - -### Components of VPA - -The project consists of 3 components: - -- [Recommender](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/recommender/README.md) - monitors the current and past resource consumption and, based on it, - provides recommended values for the containers' cpu and memory requests. - -- [Updater](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/updater/README.md) - checks which of the managed pods have correct resources set and, if not, - kills them so that they can be recreated by their controllers with the updated requests. - -- [Admission Plugin](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/pkg/admission-controller/README.md) - sets the correct resource requests on new pods (either just created - or recreated by their controller due to Updater's activity). - -More on the architecture can be found [HERE](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md). - -### Tear down - -Note that if you stop running VPA in your cluster, the resource requests -for the pods already modified by VPA will not change, but any new pods -will get resources as defined in your controllers (i.e. deployment or -replicaset) and not according to previous recommendations made by VPA. 
- -To stop using Vertical Pod Autoscaling in your cluster: - -- If running on GKE, clean up role bindings created in [Prerequisites](#prerequisites): - -```console -kubectl delete clusterrolebinding myname-cluster-admin-binding -``` - -- Tear down VPA components: - -```console -./hack/vpa-down.sh -``` - -# Limits control - -When setting limits VPA will conform to -[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103). -It will maintain limit to request ratio specified for all containers. - -VPA will try to cap recommendations between min and max of -[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If limit range conflicts -with VPA resource policy, VPA will follow VPA policy (and set values outside the limit -range). - -To disable getting VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`. - -## Examples - -### Keeping limit proportional to request - -The container template specifies resource request for 500 milli CPU and 1 GB of RAM. The template also -specifies resource limit of 2 GB RAM. VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA -applies the recommendation, it will also set the memory limit to 4 GB. - -### Capping to Limit Range - -The container template specifies resource request for 500 milli CPU and 1 GB of RAM. The template also -specifies resource limit of 2 GB RAM. A limit range sets a maximum limit to 3 GB RAM per container. -VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA applies the recommendation, it will -set the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB ( -to maintain a 2:1 limit/request ratio from the template). - -### Resource Policy Overriding Limit Range - -The container template specifies resource request for 500 milli CPU and 1 GB of RAM. The template also -specifies a resource limit of 2 GB RAM. A limit range sets a maximum limit to 3 GB RAM per container. -VPAs Container Resource Policy requires VPA to set containers request to at least 750 milli CPU and -2 GB RAM. VPA recommendation is 1000 milli CPU and 2 GB of RAM. When applying the recommendation, -VPA will set RAM request to 2 GB (following the resource policy) and RAM limit to 4 GB (to maintain -the 2:1 limit/request ratio from the template). - -### Starting multiple recommenders - -It is possible to start one or more extra recommenders in order to use different percentile on different workload profiles. -For example you could have 3 profiles: [frugal](deploy/recommender-deployment-low.yaml), -[standard](deploy/recommender-deployment.yaml) and -[performance](deploy/recommender-deployment-high.yaml) which will -use different TargetCPUPercentile (50, 90 and 95) to calculate their recommendations. - -Please note the usage of the following arguments to override default names and percentiles: - -- --recommender-name=performance -- --target-cpu-percentile=0.95 - -You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec. - -### Custom memory bump-up after OOMKill - -After an OOMKill event was observed, VPA increases the memory recommendation based on the observed memory usage in the event according to this formula: `recommendation = memory-usage-in-oomkill-event + max(oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`. 
-You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender:
-`oom-bump-up-ratio` specifies the memory bump up ratio when OOM occurred, default is `1.2`. This means, memory will be increased by 20% after an OOMKill event.
-`oom-min-bump-up-bytes` specifies minimal increase of memory after observing OOM. Defaults to `100 * 1024 * 1024` (=100MiB)
-
-Usage in recommender deployment
-
-```yaml
-  containers:
-  - name: recommender
-    args:
-      - --oom-bump-up-ratio=2.0
-      - --oom-min-bump-up-bytes=524288000
-```
-
-### Using CPU management with static policy
-
-If you are using the [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers,
-you probably want the CPU recommendation to be an integer. A dedicated recommendation pre-processor can perform a round up on the CPU recommendation. Recommendation capping still applies after the round up.
-To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender.
-The pre-processor only acts on containers having a specific configuration. This configuration consists in an annotation on your VPA object for each impacted container.
-The annotation format is the following:
-
-```
-vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true
-```
-
-### Controlling eviction behavior based on scaling direction and resource
-
-To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.EvictionRequirements` in the VPA spec. An `EvictionRequirement` contains a resource and a `ChangeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container
-
-Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when they both are scaled down
-
-```yaml
-  updatePolicy:
-    evictionRequirements:
-      - resources: ["cpu", "memory"]
-        changeRequirement: TargetHigherThanRequests
-```
-
-Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information.
-
-### Limiting which namespaces are used
-
- By default the VPA will run against all namespaces. You can limit that behaviour by setting the following options:
+## Getting started
 
-1. `ignored-vpa-object-namespaces` - A comma separated list of namespaces to ignore
-1. `vpa-object-namespace` - A single namespace to monitor
+See the [Installation](./docs/installation.md) guide, followed by the [Quick start](./docs/quickstart.md) guide.
 
-These options cannot be used together and are mutually exclusive.
+Also refer to the [FAQ](./docs/faq.md) for answers to common questions.
 
-### Setting the webhook failurePolicy
+## Components and Architecture
 
-It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
-Please use this option with caution as it may be possible to break Pod creation if there is a failure with the VPA.
-Using it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce risk.
+The Vertical Pod Autoscaler consists of three parts: the recommender, the updater and the admission-controller.
+Read more about them on the [components](./docs/components.md) page.
 
-# Known limitations
+## Features and Known limitations
 
-- Whenever VPA updates the pod resources, the pod is recreated, which causes all
-  running containers to be recreated. The pod may be recreated on a different
-  node.
-- VPA cannot guarantee that pods it evicts or deletes to apply recommendations
-  (when configured in `Auto` and `Recreate` modes) will be successfully
-  recreated. This can be partly
-  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
-- VPA does not update resources of pods which are not run under a controller.
-- Vertical Pod Autoscaler **should not be used with the [Horizontal Pod
-  Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-resource-metrics)
-  (HPA) on the same resource metric (CPU or memory)** at this moment. However, you can use [VPA with
-  HPA on separate resource metrics](https://github.com/kubernetes/autoscaler/issues/6247) (e.g. VPA
-  on memory and HPA on CPU) as well as with [HPA on custom and external
-  metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-custom-metrics).
-- The VPA admission controller is an admission webhook. If you add other admission webhooks
-  to your cluster, it is important to analyze how they interact and whether they may conflict
-  with each other. The order of admission controllers is defined by a flag on API server.
-- VPA reacts to most out-of-memory events, but not in all situations.
-- VPA performance has not been tested in large clusters.
-- VPA recommendation might exceed available resources (e.g. Node size, available
-  size, available quota) and cause **pods to go pending**. This can be partly
-  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
-- Multiple VPA resources matching the same pod have undefined behavior.
-- Running the vpa-recommender with leader election enabled (`--leader-elect=true`) in a GKE cluster
-  causes contention with a lease called `vpa-recommender` held by the GKE system component of the
-  same name. To run your own VPA in GKE, make sure to specify a different lease name using
-  `--leader-elect-resource-name=vpa-recommender-lease` (or specify your own lease name).
+You can also read about the [features](./docs/features.md) and [known limitations](./docs/known-limitations.md) of the VPA.
-# Related links
+## Related links
 
-- [FAQ](FAQ.md)
 - [Design proposal](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md)
 - [API
diff --git a/vertical-pod-autoscaler/docs/components.md b/vertical-pod-autoscaler/docs/components.md
new file mode 100644
index 000000000000..74fbceb9ae01
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/components.md
@@ -0,0 +1,135 @@
+# Components
+
+## Contents
+
+- [Components](#components)
+  - [Introduction](#introduction)
+  - [Recommender](#recommender)
+    - [Running](#running-the-recommender)
+    - [Implementation](#implementation-of-the-recommender)
+  - [Updater](#updater)
+    - [Current implementation](#current-implementation)
+    - [Missing parts](#missing-parts)
+  - [Admission Controller](#admission-controller)
+    - [Running](#running-the-admission-controller)
+    - [Implementation](#implementation-of-the-admission-controller)
+
+## Introduction
+
+The VPA project consists of 3 components:
+
+- [Recommender](#recommender) - monitors the current and past resource consumption and, based on it,
+  provides recommended values for the containers' cpu and memory requests.
+
+- [Updater](#updater) - checks which of the managed pods have correct resources set and, if not,
+  kills them so that they can be recreated by their controllers with the updated requests.
+
+- [Admission Controller](#admission-controller) - sets the correct resource requests on new pods (either just created
+  or recreated by their controller due to the Updater's activity).
+
+More on the architecture can be found [here](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md).
+
+## Recommender
+
+The Recommender is the core binary of the Vertical Pod Autoscaler system.
+It computes the recommended resource requests for pods based on
+historical and current usage of the resources.
+The current recommendations are put in the status of the VPA resource, where they
+can be inspected.
+
+### Running the recommender
+
+- In order to have historical data pulled in by the recommender, install
+  Prometheus in your cluster and pass its address through a flag.
+- Create RBAC configuration from `../deploy/vpa-rbac.yaml`.
+- Create a deployment with the recommender pod from
+  `../deploy/recommender-deployment.yaml`.
+- The recommender will start running and pushing its recommendations to VPA
+  object statuses.
+
+### Implementation of the recommender
+
+The recommender is based on a model of the cluster that it builds in its memory.
+The model contains Kubernetes resources: *Pods*, *VerticalPodAutoscalers*, with
+their configuration (e.g. labels) as well as other information, e.g. usage data for
+each container.
+
+After starting the binary, the recommender reads the history of running pods and
+their usage from Prometheus into the model.
+It then runs in a loop and at each step performs the following actions:
+
+- update the model with recent information on resources (using listers based on
+  watch),
+- update the model with fresh usage samples from the Metrics API,
+- compute a new recommendation for each VPA,
+- put any changed recommendations into the VPA resources.
+
+## Updater
+
+The Updater is the component of the Vertical Pod Autoscaler described in the [Vertical Pod Autoscaler - design proposal](https://github.com/kubernetes/community/pull/338).
+
+The Updater runs in the Kubernetes cluster and decides which pods should be restarted
+based on the resource allocation recommendations calculated by the Recommender.
+If a pod should be updated, the Updater will try to evict the pod.
+It respects the pod disruption budget by using the Eviction API to evict pods.
+The Updater does not perform the actual resources update, but relies on the Vertical Pod Autoscaler admission plugin
+to update pod resources when the pod is recreated after eviction.
+
+### Current implementation
+
+The Updater runs in a loop. Each iteration performs the following:
+
+- Fetching Vertical Pod Autoscaler configuration using a lister implementation.
+- Fetching live pods information with their current resource allocation.
+- For each replicated pods group, calculating if a pod update is required and how many replicas can be evicted.
+The Updater will always allow eviction of at least one pod in a replica set. The maximum ratio of evicted replicas is specified by a flag.
+- Evicting pods if the recommended resources significantly vary from the actual resources allocation.
+The threshold for evicting pods is specified by the recommended min/max values from the VPA resource.
+The priority of evictions within a set of replicated pods is proportional to the sum of the percentage changes in resources
+(i.e. a pod with a recommended 15% memory increase and 15% CPU decrease will be evicted
+before a pod with a 20% memory increase and no change in CPU).
+
+### Missing parts
+
+- Recommendation API for fetching data from the Vertical Pod Autoscaler Recommender.
+
+## Admission Controller
+
+The Admission Controller is a binary that registers itself as a Mutating Admission Webhook
+and, because of that, is on the path of creating all pods.
+For each pod creation, it will get a request from the apiserver and it will
+either decide there's no matching VPA configuration or find the corresponding
+one and use the current recommendation to set resource requests in the pod.
+
+### Running the admission-controller
+
+1. You should make sure your API server supports Mutating Webhooks.
+Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of
+the values on the list and its `--runtime-config` flag should include
+`admissionregistration.k8s.io/v1beta1=true`.
+To change those flags, ssh to your API Server instance, edit
+`/etc/kubernetes/manifests/kube-apiserver.manifest` and restart kubelet to pick
+up the changes: ```sudo systemctl restart kubelet.service```
+1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create
+   a secret in your cluster with the certs.
+1. Create RBAC configuration for the admission controller pod by running
+   `kubectl create -f ../deploy/admission-controller-rbac.yaml`
+1. Create the pod:
+   `kubectl create -f ../deploy/admission-controller-deployment.yaml`.
+   It will register itself with the apiserver as a Webhook Admission
+   Controller and start changing resource requirements
+   for pods on their creation & updates.
+1. You can specify a path for it to register as a part of the installation process
+   by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`.
+1. You can specify a minimum TLS version with `--min-tls-version` with acceptable values being `tls1_2` (default), or `tls1_3`.
+1. You can also specify a comma or colon separated list of ciphers for the server to use with `--tls-ciphers` if `--min-tls-version` is set to `tls1_2`.
+1. You can specify a comma separated list to set webhook labels with `--webhook-labels`, example format: key1:value1,key2:value2.
+
+### Implementation of the Admission Controller
+
+All VPA configurations in the cluster are watched with a lister.
+In the context of pod creation, there is an incoming https request from
+the apiserver.
+The logic to serve that request involves finding the appropriate VPA, retrieving
+the current recommendation from it and encoding the recommendation as a JSON patch to
+the Pod resource.
diff --git a/vertical-pod-autoscaler/docs/examples.md b/vertical-pod-autoscaler/docs/examples.md
new file mode 100644
index 000000000000..ed5d5108601b
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/examples.md
@@ -0,0 +1,110 @@
+# Examples
+
+## Contents
+
+- [Examples](#examples)
+  - [Keeping limit proportional to request](#keeping-limit-proportional-to-request)
+  - [Capping to Limit Range](#capping-to-limit-range)
+  - [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range)
+  - [Starting multiple recommenders](#starting-multiple-recommenders)
+  - [Custom memory bump-up after OOMKill](#custom-memory-bump-up-after-oomkill)
+  - [Using CPU management with static policy](#using-cpu-management-with-static-policy)
+  - [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource)
+  - [Limiting which namespaces are used](#limiting-which-namespaces-are-used)
+  - [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy)
+
+## Keeping limit proportional to request
+
+The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also
+specifies a resource limit of 2 GB of RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA
+applies the recommendation, it will also set the memory limit to 4 GB.
+
+## Capping to Limit Range
+
+The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also
+specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
+The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA applies the recommendation, it will
+set the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB
+(to maintain a 2:1 limit/request ratio from the template).
+
+## Resource Policy Overriding Limit Range
+
+The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also
+specifies a resource limit of 2 GB of RAM. A limit range sets a maximum limit of 3 GB of RAM per container.
+VPA's Container Resource Policy requires VPA to set the container's request to at least 750 milli CPU and
+2 GB of RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When applying the recommendation,
+VPA will set the RAM request to 2 GB (following the resource policy) and the RAM limit to 4 GB (to maintain
+the 2:1 limit/request ratio from the template).
+
+## Starting multiple recommenders
+
+It is possible to start one or more extra recommenders in order to use different percentiles for different workload profiles.
+For example you could have 3 profiles: [frugal](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-low.yaml),
+[standard](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment.yaml) and
+[performance](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-high.yaml), which
+use different TargetCPUPercentile values (50, 90 and 95) to calculate their recommendations.
+
+Please note the usage of the following arguments to override the default names and percentiles:
+
+- --recommender-name=performance
+- --target-cpu-percentile=0.95
+
+You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec.
+
+## Custom memory bump-up after OOMKill
+
+After an OOMKill event is observed, VPA increases the memory recommendation based on the observed memory usage in the event according to this formula: `recommendation = memory-usage-in-oomkill-event + max(oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`.
+You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender:
+`oom-bump-up-ratio` specifies the memory bump-up ratio when an OOM occurs; the default is `1.2`. This means memory will be increased by 20% after an OOMKill event.
+`oom-min-bump-up-bytes` specifies the minimal increase of memory after observing an OOM. It defaults to `100 * 1024 * 1024` (=100MiB).
+
+Usage in the recommender deployment:
+
+```yaml
+  containers:
+  - name: recommender
+    args:
+      - --oom-bump-up-ratio=2.0
+      - --oom-min-bump-up-bytes=524288000
+```
+
+## Using CPU management with static policy
+
+If you are using the [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers,
+you probably want the CPU recommendation to be an integer. A dedicated recommendation pre-processor can perform a round-up on the CPU recommendation. Recommendation capping still applies after the round-up.
+To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender.
+The pre-processor only acts on containers having a specific configuration. This configuration consists of an annotation on your VPA object for each impacted container.
+The annotation format is the following:
+
+```yaml
+vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true
+```
+
+## Controlling eviction behavior based on scaling direction and resource
+
+To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.EvictionRequirements` in the VPA spec. An `EvictionRequirement` contains a resource and a `ChangeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container.
+
+Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when both are scaled down:
+
+```yaml
+  updatePolicy:
+    evictionRequirements:
+      - resources: ["cpu", "memory"]
+        changeRequirement: TargetHigherThanRequests
+```
+
+Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information.
+
+## Limiting which namespaces are used
+
+By default the VPA will run against all namespaces. You can limit that behaviour by setting one of the following options:
+
+1. `ignored-vpa-object-namespaces` - A comma separated list of namespaces to ignore
+1. `vpa-object-namespace` - A single namespace to monitor
+
+These options are mutually exclusive and cannot be used together; a minimal sketch of setting one of them is shown below.
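+
+As an illustration only, a hypothetical updater deployment fragment that restricts the VPA to a single namespace might look like this (the container layout is assumed to follow `deploy/updater-deployment.yaml`, and `my-namespace` is a placeholder):
+
+```yaml
+  containers:
+  - name: updater
+    args:
+      # Only act on VPA objects in this one namespace.
+      - --vpa-object-namespace=my-namespace
+```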
+
+## Setting the webhook failurePolicy
+
+It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller.
+Please use this option with caution, as it may break Pod creation if there is a failure with the VPA.
+Use it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce the risk.
diff --git a/vertical-pod-autoscaler/FAQ.md b/vertical-pod-autoscaler/docs/faq.md
similarity index 97%
rename from vertical-pod-autoscaler/FAQ.md
rename to vertical-pod-autoscaler/docs/faq.md
index 53a83ff489b8..80f1c8774af1 100644
--- a/vertical-pod-autoscaler/FAQ.md
+++ b/vertical-pod-autoscaler/docs/faq.md
@@ -2,7 +2,7 @@
 
 ## Contents
 
-- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-CPU-or-memory-settings)
+- [VPA restarts my pods but does not modify CPU or memory settings. Why?](#vpa-restarts-my-pods-but-does-not-modify-cpu-or-memory-settings)
 - [How can I apply VPA to my Custom Resource?](#how-can-i-apply-vpa-to-my-custom-resource)
 - [How can I use Prometheus as a history provider for the VPA recommender?](#how-can-i-use-prometheus-as-a-history-provider-for-the-vpa-recommender)
 - [I get recommendations for my single pod replicaSet, but they are not applied. Why?](#i-get-recommendations-for-my-single-pod-replicaset-but-they-are-not-applied)
@@ -135,7 +135,7 @@ spec:
         - --v=4
         - --storage=prometheus
         - --prometheus-address=http://prometheus.default.svc.cluster.local:9090
-  ```
+```
 
 In this example, Prometheus is running in the default namespace.
 
@@ -148,9 +148,9 @@ Here you should see the flags that you set for the VPA recommender and you shoul
 
 This means that the VPA recommender is now using Prometheus as the history provider.
 
-### I get recommendations for my single pod replicaSet but they are not applied
+### I get recommendations for my single pod replicaset but they are not applied
 
-By default, the [`--min-replicas`](pkg/updater/main.go#L56) flag on the updater is set to 2. To change this, you can supply the arg in the [deploys/updater-deployment.yaml](deploy/updater-deployment.yaml) file:
+By default, the [`--min-replicas`](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/pkg/updater/main.go#L44) flag on the updater is set to 2. To change this, you can supply the arg in the [deploy/updater-deployment.yaml](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/deploy/updater-deployment.yaml) file:
 
 ```yaml
 spec:
@@ -179,7 +179,7 @@ election with the `--leader-elect=true` parameter.
 The following startup parameters are supported for VPA recommender:
 
 Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
 `recommendation-margin-fraction` | Float64 | Fraction of usage added as the safety margin to the recommended request | 0.15
 `pod-recommendation-min-cpu-millicores` | Float64 | Minimum CPU recommendation for a pod | 25
 `pod-recommendation-min-memory-mb` | Float64 | Minimum memory recommendation for a pod | 250
@@ -230,7 +230,7 @@ The following startup parameters are supported for VPA updater:
 
 Name | Type | Description | Default
-|-|-|-|-|
+-|-|-|-
 `pod-update-threshold` | Float64 | Ignore updates that have priority lower than the value of this flag | 0.1
 `in-recommendation-bounds-eviction-lifetime-threshold` | Duration | Pods that live for at least that long can be evicted even if their request is within the [MinRecommended...MaxRecommended] range | time.Hour*12
 `evict-after-oom-threshold` | Duration | Evict pod that has OOMed in less than evict-after-oom-threshold since start. | 10*time.Minute
diff --git a/vertical-pod-autoscaler/docs/features.md b/vertical-pod-autoscaler/docs/features.md
new file mode 100644
index 000000000000..ff8ced24041b
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/features.md
@@ -0,0 +1,18 @@
+# Features
+
+## Contents
+
+- [Limits control](#limits-control)
+
+## Limits control
+
+When setting limits, VPA will conform to
+[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103).
+It will maintain the limit-to-request ratio specified for all containers.
+
+VPA will try to cap recommendations between the min and max of
+[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If a limit range conflicts
+with the VPA resource policy, VPA will follow the VPA policy (and set values outside the limit
+range).
+
+To disable getting VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`.
diff --git a/vertical-pod-autoscaler/docs/installation.md b/vertical-pod-autoscaler/docs/installation.md
new file mode 100644
index 000000000000..69b53d5d1fac
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/installation.md
@@ -0,0 +1,161 @@
+# Installation
+
+## Contents
+
+- [Installation](#installation)
+  - [Compatibility](#compatibility)
+  - [Notice on CRD update (>=1.0.0)](#notice-on-crd-update-100)
+  - [Notice on deprecation of v1beta2 version (>=0.13.0)](#notice-on-deprecation-of-v1beta2-version-0130)
+  - [Notice on removal of v1beta1 version (>=0.5.0)](#notice-on-removal-of-v1beta1-version-050)
+  - [Prerequisites](#prerequisites)
+  - [Install command](#install-command)
+  - [Tear down](#tear-down)
+
+The current default version is Vertical Pod Autoscaler 1.2.1.
+
+## Compatibility
+
+| VPA version     | Kubernetes version |
+|-----------------|--------------------|
+| 1.2.1           | 1.27+              |
+| 1.2.0           | 1.27+              |
+| 1.1.2           | 1.25+              |
+| 1.1.1           | 1.25+              |
+| 1.0             | 1.25+              |
+| 0.14            | 1.25+              |
+| 0.13            | 1.25+              |
+| 0.12            | 1.25+              |
+| 0.11            | 1.22 - 1.24        |
+| 0.10            | 1.22+              |
+| 0.9             | 1.16+              |
+| 0.8             | 1.13+              |
+| 0.4 to 0.7      | 1.11+              |
+| 0.3.X and lower | 1.7+               |
+
+## Notice on CRD update (>=1.0.0)
+
+**NOTE:** In version 1.0.0, we have updated the CRD definition and added RBAC for the
+status resource. If you are upgrading from version (<=0.14.0), you must update the CRD
+definition and RBAC.
+
+```shell
+kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-v1-crd-gen.yaml
+kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/vpa-release-1.0/vertical-pod-autoscaler/deploy/vpa-rbac.yaml
+```
+
+Another method is to re-execute the ./hack/vpa-process-yamls.sh script.
+
+```shell
+git clone https://github.com/kubernetes/autoscaler.git
+cd autoscaler/vertical-pod-autoscaler
+git checkout origin/vpa-release-1.0
+REGISTRY=registry.k8s.io/autoscaling TAG=1.0.0 ./hack/vpa-process-yamls.sh apply
+```
+
+If you need to roll back to version (<=0.14.0), please check out the release for your
+rollback version and execute ./hack/vpa-process-yamls.sh. For example, to roll back to 0.14.0:
+
+```shell
+git checkout origin/vpa-release-0.14
+REGISTRY=registry.k8s.io/autoscaling TAG=0.14.0 ./hack/vpa-process-yamls.sh apply
+kubectl delete clusterrole system:vpa-status-actor
+kubectl delete clusterrolebinding system:vpa-status-actor
+```
+
+## Notice on deprecation of v1beta2 version (>=0.13.0)
+
+**NOTE:** In 0.13.0 we deprecated the `autoscaling.k8s.io/v1beta2` API. We plan to
+remove this API version. While for now you can continue to use the `v1beta2` API, we
+recommend using `autoscaling.k8s.io/v1` instead. The `v1` and `v1beta2` APIs are
+almost identical (the `v1` API has some fields which are not present in `v1beta2`),
+so simply changing which API version you're calling should be enough in almost
+all cases.
+
+## Notice on removal of v1beta1 version (>=0.5.0)
+
+**NOTE:** In 0.5.0 we disabled the old version of the API - `autoscaling.k8s.io/v1beta1`.
+The VPA objects in this version will no longer receive recommendations and
+existing recommendations will be removed. The objects will remain present, though,
+and a ConfigUnsupported condition will be set on them.
+
+This doc is for installing the latest VPA. For instructions on migration from older versions see the [Migration Doc](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/MIGRATE.md).
+
+## Prerequisites
+
+- `kubectl` should be connected to the cluster you want to install VPA in.
+- The metrics server must be deployed in your cluster. Read more about [Metrics Server](https://github.com/kubernetes-sigs/metrics-server).
+- If you are using a GKE Kubernetes cluster, you will need to grant your current Google
+  identity the `cluster-admin` role. Otherwise, you won't be authorized to grant extra
+  privileges to the VPA system components.
+
+  ```console
+  $ gcloud info | grep Account # get current google identity
+  Account: [myname@example.org]
+
+  $ kubectl create clusterrolebinding myname-cluster-admin-binding --clusterrole=cluster-admin --user=myname@example.org
+  Clusterrolebinding "myname-cluster-admin-binding" created
+  ```
+
+- If you already have another version of VPA installed in your cluster, you have to tear down
+  the existing installation first with:
+
+  ```console
+  ./hack/vpa-down.sh
+  ```
+
+## Install command
+
+To install VPA, please download the source code of VPA (for example with `git clone https://github.com/kubernetes/autoscaler.git`)
+and run the following command inside the `vertical-pod-autoscaler` directory:
+
+```console
+./hack/vpa-up.sh
+```
+
+Note: the script currently reads environment variables: `$REGISTRY` and `$TAG`.
+Make sure you leave them unset unless you want to use a non-default version of VPA.
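+
+For example, a hypothetical invocation pinning a non-default registry and tag might look like this (the tag shown is illustrative; substitute any published VPA release):
+
+```console
+REGISTRY=registry.k8s.io/autoscaling TAG=1.2.1 ./hack/vpa-up.sh
+```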
+
+Note: If you are seeing the following error during this step:
+
+```console
+unknown option -addext
+```
+
+please upgrade openssl to version 1.1.1 or higher (needs to support the -addext option) or use ./hack/vpa-up.sh on the [0.8 release branch](https://github.com/kubernetes/autoscaler/tree/vpa-release-0.8).
+
+The script issues multiple `kubectl` commands to the
+cluster that insert the configuration and start all needed pods (see
+[architecture](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md#architecture-overview))
+in the `kube-system` namespace. It also generates
+and uploads a secret (a CA cert) used by the VPA Admission Controller when communicating
+with the API server.
+
+To print YAML contents with all resources that would be understood by
+`kubectl diff|apply|...` commands, you can use
+
+```console
+./hack/vpa-process-yamls.sh print
+```
+
+The output of that command won't include secret information generated by the
+[pkg/admission-controller/gencerts.sh](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/pkg/admission-controller/gencerts.sh) script.
+
+## Tear down
+
+Note that if you stop running VPA in your cluster, the resource requests
+for the pods already modified by VPA will not change, but any new pods
+will get resources as defined in your controllers (i.e. deployment or
+replicaset) and not according to previous recommendations made by VPA.
+
+To stop using Vertical Pod Autoscaling in your cluster:
+
+- If running on GKE, clean up role bindings created in [Prerequisites](#prerequisites):
+
+```console
+kubectl delete clusterrolebinding myname-cluster-admin-binding
+```
+
+- Tear down VPA components:
+
+```console
+./hack/vpa-down.sh
+```
diff --git a/vertical-pod-autoscaler/docs/known-limitations.md b/vertical-pod-autoscaler/docs/known-limitations.md
new file mode 100644
index 000000000000..a6e08c849016
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/known-limitations.md
@@ -0,0 +1,29 @@
+# Known limitations
+
+- Whenever VPA updates the pod resources, the pod is recreated, which causes all
+  running containers to be recreated. The pod may be recreated on a different
+  node.
+- VPA cannot guarantee that pods it evicts or deletes to apply recommendations
+  (when configured in `Auto` and `Recreate` modes) will be successfully
+  recreated. This can be partly
+  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
+- VPA does not update resources of pods which are not run under a controller.
+- Vertical Pod Autoscaler **should not be used with the [Horizontal Pod
+  Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#support-for-resource-metrics)
+  (HPA) on the same resource metric (CPU or memory)** at this moment. However, you can use [VPA with
+  HPA on separate resource metrics](https://github.com/kubernetes/autoscaler/issues/6247) (e.g. VPA
+  on memory and HPA on CPU) as well as with [HPA on custom and external
+  metrics](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#scaling-on-custom-metrics).
+- The VPA admission controller is an admission webhook. If you add other admission webhooks
+  to your cluster, it is important to analyze how they interact and whether they may conflict
+  with each other. The order of admission controllers is defined by a flag on the API server.
+- VPA reacts to most out-of-memory events, but not in all situations.
+- VPA performance has not been tested in large clusters.
+- VPA recommendation might exceed available resources (e.g. Node size, available
+  size, available quota) and cause **pods to go pending**. This can be partly
+  addressed by using VPA together with [Cluster Autoscaler](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#basics).
+- Multiple VPA resources matching the same pod have undefined behavior.
+- Running the vpa-recommender with leader election enabled (`--leader-elect=true`) in a GKE cluster
+  causes contention with a lease called `vpa-recommender` held by the GKE system component of the
+  same name. To run your own VPA in GKE, make sure to specify a different lease name using
+  `--leader-elect-resource-name=vpa-recommender-lease` (or specify your own lease name).
diff --git a/vertical-pod-autoscaler/docs/quickstart.md b/vertical-pod-autoscaler/docs/quickstart.md
new file mode 100644
index 000000000000..7ef784d30009
--- /dev/null
+++ b/vertical-pod-autoscaler/docs/quickstart.md
@@ -0,0 +1,97 @@
+# Quick start
+
+## Contents
+
+- [Quick start](#quick-start)
+  - [Test your installation](#test-your-installation)
+  - [Example VPA configuration](#example-vpa-configuration)
+  - [Troubleshooting](#troubleshooting)
+
+After [installation](./installation.md), the system is ready to recommend and set
+resource requests for your pods.
+To use it, you need to insert a *Vertical Pod Autoscaler* resource for
+each controller whose resource requirements you want to be automatically computed.
+This will most commonly be a **Deployment**.
+There are four modes in which *VPAs* operate:
+
+- `"Auto"`: VPA assigns resource requests on pod creation as well as updates
+  them on existing pods using the preferred update mechanism. Currently, this is
+  equivalent to `"Recreate"` (see below). Once restart-free ("in-place") update
+  of pod requests is available, it may be used as the preferred update mechanism by
+  the `"Auto"` mode.
+- `"Recreate"`: VPA assigns resource requests on pod creation as well as updates
+  them on existing pods by evicting them when the requested resources differ significantly
+  from the new recommendation (respecting the Pod Disruption Budget, if defined).
+  This mode should be used rarely, only if you need to ensure that the pods are restarted
+  whenever the resource request changes. Otherwise, prefer the `"Auto"` mode which may take
+  advantage of restart-free updates once they are available.
+- `"Initial"`: VPA only assigns resource requests on pod creation and never changes them
+  later.
+- `"Off"`: VPA does not automatically change the resource requirements of the pods.
+  The recommendations are calculated and can be inspected in the VPA object.
+
+## Test your installation
+
+A simple way to check if Vertical Pod Autoscaler is fully operational in your
+cluster is to create a sample deployment and a corresponding VPA config:
+
+```console
+kubectl create -f examples/hamster.yaml
+```
+
+The above command creates a deployment with two pods, each running a single container
+that requests 100 millicores and tries to utilize slightly above 500 millicores.
+The command also creates a VPA config pointing at the deployment.
+VPA will observe the behaviour of the pods, and after about 5 minutes, they should get
+updated with a higher CPU request
+(note that VPA does not modify the template in the deployment, but the actual requests
+of the pods are updated).
To see VPA config and current recommended resource requests run: + +```console +kubectl describe vpa +``` + +*Note: if your cluster has little free capacity these pods may be unable to schedule. +You may need to add more nodes or adjust examples/hamster.yaml to use less CPU.* + +## Example VPA configuration + +```yaml +apiVersion: autoscaling.k8s.io/v1 +kind: VerticalPodAutoscaler +metadata: + name: my-app-vpa +spec: + targetRef: + apiVersion: "apps/v1" + kind: Deployment + name: my-app + updatePolicy: + updateMode: "Auto" +``` + +## Troubleshooting + +To diagnose problems with a VPA installation, perform the following steps: + +- Check if all system components are running: + +```console +kubectl --namespace=kube-system get pods|grep vpa +``` + +The above command should list 3 pods (recommender, updater and admission-controller) +all in state Running. + +- Check if the system components log any errors. + For each of the pods returned by the previous command do: + +```console +kubectl --namespace=kube-system logs [pod name] | grep -e '^E[0-9]\{4\}' +``` + +- Check that the VPA Custom Resource Definition was created: + +```console +kubectl get customresourcedefinition | grep verticalpodautoscalers +``` diff --git a/vertical-pod-autoscaler/pkg/admission-controller/README.md b/vertical-pod-autoscaler/pkg/admission-controller/README.md deleted file mode 100644 index 1f11552cad66..000000000000 --- a/vertical-pod-autoscaler/pkg/admission-controller/README.md +++ /dev/null @@ -1,46 +0,0 @@ -# VPA Admission Controller - -- [Intro](#intro) -- [Running](#running) -- [Implementation](#implementation) - -## Intro - -This is a binary that registers itself as a Mutating Admission Webhook -and because of that is on the path of creating all pods. -For each pod creation, it will get a request from the apiserver and it will -either decide there's no matching VPA configuration or find the corresponding -one and use current recommendation to set resource requests in the pod. - -## Running - -1. You should make sure your API server supports Mutating Webhooks. -Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of -the values on the list and its `--runtime-config` flag should include -`admissionregistration.k8s.io/v1beta1=true`. -To change those flags, ssh to your API Server instance, edit -`/etc/kubernetes/manifests/kube-apiserver.manifest` and restart kubelet to pick -up the changes: ```sudo systemctl restart kubelet.service``` -1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create - a secret in your cluster with the certs. -1. Create RBAC configuration for the admission controller pod by running - `kubectl create -f ../deploy/admission-controller-rbac.yaml` -1. Create the pod: - `kubectl create -f ../deploy/admission-controller-deployment.yaml`. - The first thing this will do is it will register itself with the apiserver as - Webhook Admission Controller and start changing resource requirements - for pods on their creation & updates. -1. You can specify a path for it to register as a part of the installation process - by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`. -1. You can specify a minimum TLS version with `--min-tls-version` with acceptable values being `tls1_2` (default), or `tls1_3`. -1. You can also specify a comma or colon separated list of ciphers for the server to use with `--tls-ciphers` if `--min-tls-version` is set to `tls1_2`. -1. 
You can specify a comma separated list to set webhook labels with `--webhook-labels`, example format: key1:value1,key2:value2. - -## Implementation - -All VPA configurations in the cluster are watched with a lister. -In the context of pod creation, there is an incoming https request from -apiserver. -The logic to serve that request involves finding the appropriate VPA, retrieving -current recommendation from it and encodes the recommendation as a json patch to -the Pod resource. diff --git a/vertical-pod-autoscaler/pkg/recommender/README.md b/vertical-pod-autoscaler/pkg/recommender/README.md deleted file mode 100644 index 9b3c73b9945a..000000000000 --- a/vertical-pod-autoscaler/pkg/recommender/README.md +++ /dev/null @@ -1,40 +0,0 @@ -# VPA Recommender - -- [Intro](#intro) -- [Running](#running) -- [Implementation](#implementation) - -## Intro - -Recommender is the core binary of Vertical Pod Autoscaler system. -It computes the recommended resource requests for pods based on -historical and current usage of the resources. -The current recommendations are put in status of the VPA resource, where they -can be inspected. - -## Running - -- In order to have historical data pulled in by the recommender, install - Prometheus in your cluster and pass its address through a flag. -- Create RBAC configuration from `../deploy/vpa-rbac.yaml`. -- Create a deployment with the recommender pod from - `../deploy/recommender-deployment.yaml`. -- The recommender will start running and pushing its recommendations to VPA - object statuses. - -## Implementation - -The recommender is based on a model of the cluster that it builds in its memory. -The model contains Kubernetes resources: *Pods*, *VerticalPodAutoscalers*, with -their configuration (e.g. labels) as well as other information, e.g. usage data for -each container. - -After starting the binary, recommender reads the history of running pods and -their usage from Prometheus into the model. -It then runs in a loop and at each step performs the following actions: - -- update model with recent information on resources (using listers based on - watch), -- update model with fresh usage samples from Metrics API, -- compute new recommendation for each VPA, -- put any changed recommendations into the VPA resources. diff --git a/vertical-pod-autoscaler/pkg/updater/README.md b/vertical-pod-autoscaler/pkg/updater/README.md deleted file mode 100644 index 6d783ffd2b0e..000000000000 --- a/vertical-pod-autoscaler/pkg/updater/README.md +++ /dev/null @@ -1,34 +0,0 @@ -# Vertical Pod Autoscaler - Updater - -- [Introduction](#introduction) -- [Current implementation](current-implementation) -- [Missing parts](#missing-parts) - -# Introduction - -Updater component for Vertical Pod Autoscaler described in https://github.com/kubernetes/community/pull/338 - -Updater runs in Kubernetes cluster and decides which pods should be restarted -based on resources allocation recommendation calculated by Recommender. -If a pod should be updated, Updater will try to evict the pod. -It respects the pod disruption budget, by using Eviction API to evict pods. -Updater does not perform the actual resources update, but relies on Vertical Pod Autoscaler admission plugin -to update pod resources when the pod is recreated after eviction. - -# Current implementation - -Runs in a loop. On one iteration performs: - -- Fetching Vertical Pod Autoscaler configuration using a lister implementation. -- Fetching live pods information with their current resource allocation. 
-- For each replicated pods group calculating if pod update is required and how many replicas can be evicted. -Updater will always allow eviction of at least one pod in replica set. Maximum ratio of evicted replicas is specified by flag. -- Evicting pods if recommended resources significantly vary from the actual resources allocation. -Threshold for evicting pods is specified by recommended min/max values from VPA resource. -Priority of evictions within a set of replicated pods is proportional to sum of percentages of changes in resources -(i.e. pod with 15% memory increase 15% cpu decrease recommended will be evicted -before pod with 20% memory increase and no change in cpu). - -# Missing parts - -- Recommendation API for fetching data from Vertical Pod Autoscaler Recommender.