Merge pull request #7548 from adrianmoisey/move_around_docs
Move all VPA docs into ./docs
Showing 11 changed files with 569 additions and 538 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
# Components | ||
|
||
## Contents | ||
|
||
- [Components](#components) | ||
- [Introduction](#introduction) | ||
- [Recommender](#recommender) | ||
- [Running](#running-the-recommender) | ||
- [Implementation](#implementation-of-the-recommender) | ||
- [Updater](#updater) | ||
- [Current implementation](#current-implementation) | ||
- [Missing Parts](#missing-parts) | ||
- [Admission Controller](#admission-controller) | ||
- [Running](#running-the-admission-controller) | ||
- [Implementation](#implementation-of-the-admission-controller) | ||
|
||
## Introduction | ||
|
||
The VPA project consists of 3 components: | ||
|
||
- [Recommender](#recommender) - monitors the current and past resource consumption and, based on it, | ||
provides recommended values for the containers' cpu and memory requests. | ||
|
||
- [Updater](#updater) - checks which of the managed pods have correct resources set and, if not, | ||
kills them so that they can be recreated by their controllers with the updated requests. | ||
|
||
- [Admission Controller](#admission-controller) - sets the correct resource requests on new pods (either just created | ||
or recreated by their controller due to Updater's activity). | ||
|
||
More on the architecture can be found [HERE](https://github.com/kubernetes/design-proposals-archive/blob/main/autoscaling/vertical-pod-autoscaler.md). | ||
|
||
## Recommender | ||
|
||
The Recommender is the core binary of the Vertical Pod Autoscaler system. | ||
It computes the recommended resource requests for pods based on | ||
historical and current usage of the resources. | ||
The current recommendations are stored in the status of the VPA resource, where they | ||
can be inspected. | ||
|
||
### Running the recommender | ||
|
||
- In order to have historical data pulled in by the recommender, install | ||
Prometheus in your cluster and pass its address through a flag. | ||
- Create RBAC configuration from `../deploy/vpa-rbac.yaml`. | ||
- Create a deployment with the recommender pod from | ||
`../deploy/recommender-deployment.yaml`. | ||
- The recommender will start running and pushing its recommendations to VPA | ||
object statuses. | ||
|
||
### Implementation of the recommender | ||
|
||
The recommender is based on a model of the cluster that it builds in its memory. | ||
The model contains Kubernetes resources: *Pods*, *VerticalPodAutoscalers*, with | ||
their configuration (e.g. labels) as well as other information, e.g. usage data for | ||
each container. | ||
|
||
After the binary starts, the recommender reads the history of running pods and | ||
their usage from Prometheus into the model. | ||
It then runs in a loop and at each step performs the following actions: | ||
|
||
- update model with recent information on resources (using listers based on | ||
watch), | ||
- update model with fresh usage samples from Metrics API, | ||
- compute new recommendation for each VPA, | ||
- put any changed recommendations into the VPA resources. | ||
|
||
## Updater | ||
|
||
The Updater component of Vertical Pod Autoscaler is described in the [Vertical Pod Autoscaler - design proposal](https://github.com/kubernetes/community/pull/338). | ||
|
||
The Updater runs in the Kubernetes cluster and decides which pods should be restarted | ||
based on the resource allocation recommendations calculated by the Recommender. | ||
If a pod should be updated, the Updater will try to evict it. | ||
It evicts pods through the Eviction API, so pod disruption budgets are respected. | ||
The Updater does not perform the actual resource update, but relies on the Vertical Pod Autoscaler admission plugin | ||
to update pod resources when the pod is recreated after eviction. | ||
|
||
### Current implementation | ||
|
||
The Updater runs in a loop. Each iteration performs the following steps: | ||
|
||
- Fetching Vertical Pod Autoscaler configuration using a lister implementation. | ||
- Fetching live pod information along with the pods' current resource allocation. | ||
- For each group of replicated pods, calculating whether a pod update is required and how many replicas can be evicted. | ||
The Updater will always allow eviction of at least one pod in a replica set. The maximum ratio of evicted replicas is specified by a flag. | ||
- Evicting pods if the recommended resources vary significantly from the actual resource allocation. | ||
The threshold for evicting pods is specified by the recommended min/max values from the VPA resource. | ||
The priority of evictions within a set of replicated pods is proportional to the sum of the percentage changes in resources | ||
(e.g. a pod with a recommended 15% memory increase and 15% CPU decrease will be evicted | ||
before a pod with a 20% memory increase and no change in CPU). | ||
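
The eviction priority described above can be sketched as follows. This is a minimal illustration in Python, not the Updater's actual code: the `PodChange` record and the function names are hypothetical, and the priority is assumed to be the plain sum of absolute percentage changes per resource.

```python
from dataclasses import dataclass


@dataclass
class PodChange:
    """Hypothetical record of current vs. recommended requests for one pod."""
    name: str
    current: dict      # e.g. {"cpu": 1.0, "memory": 1000.0}
    recommended: dict  # same keys and units as `current`


def eviction_priority(pod: PodChange) -> float:
    """Sum of absolute relative changes across resources, in percent."""
    total = 0.0
    for resource, current in pod.current.items():
        target = pod.recommended[resource]
        total += abs(target - current) / current * 100.0
    return total


def eviction_order(pods):
    """Pods with the largest total change are evicted first."""
    return sorted(pods, key=eviction_priority, reverse=True)


pods = [
    # 20% memory increase, no CPU change
    PodChange("b", {"cpu": 1.0, "memory": 1000.0}, {"cpu": 1.0, "memory": 1200.0}),
    # 15% memory increase and 15% CPU decrease (about 30% in total)
    PodChange("a", {"cpu": 1.0, "memory": 1000.0}, {"cpu": 0.85, "memory": 1150.0}),
]
print([p.name for p in eviction_order(pods)])  # ['a', 'b']
```

Pod `a` is evicted first because increases and decreases both count towards the total change.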
|
||
### Missing parts | ||
|
||
- Recommendation API for fetching data from Vertical Pod Autoscaler Recommender. | ||
|
||
## Admission Controller | ||
|
||
This is a binary that registers itself as a Mutating Admission Webhook | ||
and is therefore on the path of every pod creation. | ||
For each pod creation, it gets a request from the apiserver and | ||
either decides there's no matching VPA configuration or finds the corresponding | ||
one and uses the current recommendation to set resource requests in the pod. | ||
|
||
### Running the Admission Controller | ||
|
||
1. You should make sure your API server supports Mutating Webhooks. | ||
Its `--admission-control` flag should have `MutatingAdmissionWebhook` as one of | ||
the values on the list and its `--runtime-config` flag should include | ||
`admissionregistration.k8s.io/v1beta1=true`. | ||
To change those flags, ssh to your API Server instance, edit | ||
`/etc/kubernetes/manifests/kube-apiserver.manifest` and restart kubelet to pick | ||
up the changes: `sudo systemctl restart kubelet.service` | ||
1. Generate certs by running `bash gencerts.sh`. This will use kubectl to create | ||
a secret in your cluster with the certs. | ||
1. Create RBAC configuration for the admission controller pod by running | ||
`kubectl create -f ../deploy/admission-controller-rbac.yaml` | ||
1. Create the pod: | ||
`kubectl create -f ../deploy/admission-controller-deployment.yaml`. | ||
The first thing it does is register itself with the apiserver as a | ||
Webhook Admission Controller, after which it starts changing resource requirements | ||
for pods on their creation & updates. | ||
1. You can specify a path for it to register as a part of the installation process | ||
by setting `--register-by-url=true` and passing `--webhook-address` and `--webhook-port`. | ||
1. You can specify a minimum TLS version with `--min-tls-version` with acceptable values being `tls1_2` (default), or `tls1_3`. | ||
1. You can also specify a comma or colon separated list of ciphers for the server to use with `--tls-ciphers` if `--min-tls-version` is set to `tls1_2`. | ||
1. You can specify a comma separated list to set webhook labels with `--webhook-labels`, example format: key1:value1,key2:value2. | ||
|
||
### Implementation of the Admission Controller | ||
|
||
All VPA configurations in the cluster are watched with a lister. | ||
In the context of pod creation, there is an incoming HTTPS request from the | ||
apiserver. | ||
The logic to serve that request involves finding the appropriate VPA, retrieving | ||
the current recommendation from it, and encoding the recommendation as a JSON patch to | ||
the Pod resource. |
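
The patch-encoding step can be sketched as below. This is an illustrative Python example, not the Admission Controller's implementation: the function name and the shape of the `recommendations` mapping are assumptions, and only the JSON Patch operation format (`op`, `path`, `value`) follows RFC 6902.

```python
import json


def build_resource_patch(pod, recommendations):
    """Build JSON Patch operations that set recommended requests on a pod.

    `recommendations` is a hypothetical mapping of container name to
    requests, e.g. {"app": {"cpu": "1", "memory": "2Gi"}}.
    """
    patch = []
    for index, container in enumerate(pod["spec"]["containers"]):
        rec = recommendations.get(container["name"])
        if rec is None:
            continue  # no recommendation for this container
        base = f"/spec/containers/{index}/resources"
        resources = container.get("resources")
        if resources is None:
            # Parent objects must exist before their members can be added.
            patch.append({"op": "add", "path": base, "value": {}})
            resources = {}
        if "requests" not in resources:
            patch.append({"op": "add", "path": base + "/requests", "value": {}})
        for resource, quantity in rec.items():
            # "add" on an existing object member replaces its value.
            patch.append({"op": "add",
                          "path": f"{base}/requests/{resource}",
                          "value": quantity})
    return patch


pod = {"spec": {"containers": [{"name": "app"}]}}
patch = build_resource_patch(pod, {"app": {"cpu": "1", "memory": "2Gi"}})
print(json.dumps(patch, indent=2))
```

In a real webhook the patch would be returned to the apiserver inside the admission response; that part is omitted here.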
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,110 @@ | ||
# Examples | ||
|
||
## Contents | ||
|
||
- [Examples](#examples) | ||
- [Keeping limit proportional to request](#keeping-limit-proportional-to-request) | ||
- [Capping to Limit Range](#capping-to-limit-range) | ||
- [Resource Policy Overriding Limit Range](#resource-policy-overriding-limit-range) | ||
- [Starting multiple recommenders](#starting-multiple-recommenders) | ||
- [Custom memory bump-up after OOMKill](#custom-memory-bump-up-after-oomkill) | ||
- [Using CPU management with static policy](#using-cpu-management-with-static-policy) | ||
- [Controlling eviction behavior based on scaling direction and resource](#controlling-eviction-behavior-based-on-scaling-direction-and-resource) | ||
- [Limiting which namespaces are used](#limiting-which-namespaces-are-used) | ||
- [Setting the webhook failurePolicy](#setting-the-webhook-failurepolicy) | ||
|
||
## Keeping limit proportional to request | ||
|
||
The container template specifies a resource request of 500 milli CPU and 1 GB of RAM. The template also | ||
specifies a resource limit of 2 GB RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA | ||
applies the recommendation, it will also set the memory limit to 4 GB, preserving the 2:1 limit/request ratio from the template. | ||
|
||
## Capping to Limit Range | ||
|
||
The container template specifies resource request for 500 milli CPU and 1 GB of RAM. The template also | ||
specifies resource limit of 2 GB RAM. A limit range sets a maximum limit to 3 GB RAM per container. | ||
VPA recommendation is 1000 milli CPU and 2 GB of RAM. When VPA applies the recommendation, it will | ||
set the memory limit to 3 GB (to keep it within the allowed limit range) and the memory request to 1.5 GB | ||
(to maintain a 2:1 limit/request ratio from the template). | ||
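
The arithmetic in this example can be sketched in a few lines. The function below is a hypothetical Python illustration, not VPA code: it scales the limit by the template's limit/request ratio and, if a limit range maximum is given, caps the limit and derives the request from it.

```python
def apply_recommendation(template_request, template_limit,
                         recommended_request, max_limit=None):
    """Return (request, limit) after scaling the limit and capping it.

    All values share one unit (GB here). `max_limit` stands in for the
    limit range maximum; None means no limit range applies.
    """
    ratio = template_limit / template_request
    request = recommended_request
    limit = request * ratio
    if max_limit is not None and limit > max_limit:
        # Cap the limit, then shrink the request to keep the ratio.
        limit = max_limit
        request = limit / ratio
    return request, limit


# No limit range: the limit follows the 2:1 ratio.
print(apply_recommendation(1.0, 2.0, 2.0))                 # (2.0, 4.0)
# Limit range maximum of 3 GB: limit is capped, request follows.
print(apply_recommendation(1.0, 2.0, 2.0, max_limit=3.0))  # (1.5, 3.0)
```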
|
||
## Resource Policy Overriding Limit Range | ||
|
||
The container template specifies resource request for 500 milli CPU and 1 GB of RAM. The template also | ||
specifies a resource limit of 2 GB RAM. A limit range sets a maximum limit to 3 GB RAM per container. | ||
VPA's Container Resource Policy requires VPA to set the container's request to at least 750 milli CPU and | ||
2 GB RAM. The VPA recommendation is 1000 milli CPU and 2 GB of RAM. When applying the recommendation, | ||
VPA will set the RAM request to 2 GB (following the resource policy) and the RAM limit to 4 GB (to maintain | ||
the 2:1 limit/request ratio from the template). | ||
|
||
## Starting multiple recommenders | ||
|
||
It is possible to start one or more extra recommenders in order to use different percentiles for different workload profiles. | ||
For example, you could have 3 profiles: [frugal](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-low.yaml), | ||
[standard](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment.yaml) and | ||
[performance](https://github.com/kubernetes/autoscaler/blob/master/vertical-pod-autoscaler/deploy/recommender-deployment-high.yaml), which | ||
use different TargetCPUPercentile values (50, 90 and 95) to calculate their recommendations. | ||
|
||
Please note the usage of the following arguments to override default names and percentiles: | ||
|
||
- `--recommender-name=performance` | ||
- `--target-cpu-percentile=0.95` | ||
|
||
You can then choose which recommender to use by setting `recommenders` inside the `VerticalPodAutoscaler` spec. | ||
|
||
## Custom memory bump-up after OOMKill | ||
|
||
After an OOMKill event is observed, VPA increases the memory recommendation based on the memory usage observed in the event, according to this formula: `recommendation = max(memory-usage-in-oomkill-event + oom-min-bump-up-bytes, memory-usage-in-oomkill-event * oom-bump-up-ratio)`. | ||
You can configure the minimum bump-up as well as the multiplier by specifying startup arguments for the recommender: | ||
- `oom-bump-up-ratio` specifies the memory bump-up ratio applied when an OOM occurred. The default is `1.2`, which means memory is increased by 20% after an OOMKill event unless the minimum bump-up is larger. | ||
- `oom-min-bump-up-bytes` specifies the minimal increase of memory after observing an OOM. It defaults to `100 * 1024 * 1024` (= 100 MiB). | ||
|
||
Usage in the recommender deployment: | ||
|
||
```yaml | ||
containers: | ||
- name: recommender | ||
args: | ||
- --oom-bump-up-ratio=2.0 | ||
- --oom-min-bump-up-bytes=524288000 | ||
``` | ||
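
The effect of the two flags can be checked with a small calculation. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes the reading in which a ratio of `1.2` means a 20% increase, so the recommendation is the larger of "usage plus minimum bump-up" and "usage times ratio".

```python
MIB = 1024 * 1024


def memory_after_oom(usage_bytes, bump_up_ratio=1.2,
                     min_bump_up_bytes=100 * MIB):
    """Memory recommendation after an OOMKill, under the stated assumption."""
    return max(usage_bytes + min_bump_up_bytes, usage_bytes * bump_up_ratio)


# Small container: the 100 MiB minimum bump-up dominates (200 -> 300 MiB).
small = memory_after_oom(200 * MIB)
# Large container: the 20% ratio dominates (1000 -> about 1200 MiB).
large = memory_after_oom(1000 * MIB)
# Values from the deployment snippet: ratio 2.0, minimum 524288000 bytes.
custom = memory_after_oom(200 * MIB, bump_up_ratio=2.0,
                          min_bump_up_bytes=524288000)

for value in (small, large, custom):
    print(round(value / MIB), "MiB")
```

With the custom values, a 200 MiB container is bumped to 700 MiB because the 500 MiB minimum exceeds the doubled usage.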
## Using CPU management with static policy | ||
If you are using the [CPU management with static policy](https://kubernetes.io/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) for some containers, | ||
you probably want the CPU recommendation to be an integer. A dedicated recommendation post-processor can round up the CPU recommendation. Recommendation capping still applies after the round-up. | ||
To activate this feature, pass the flag `--cpu-integer-post-processor-enabled` when you start the recommender. | ||
The post-processor only acts on containers that have a specific configuration. This configuration consists of an annotation on your VPA object for each impacted container. | ||
The annotation format is the following: | ||
|
||
```yaml | ||
vpa-post-processor.kubernetes.io/{containerName}_integerCPU=true | ||
``` | ||
|
||
## Controlling eviction behavior based on scaling direction and resource | ||
|
||
To limit disruptions caused by evictions, you can put additional constraints on the Updater's eviction behavior by specifying `.updatePolicy.evictionRequirements` in the VPA spec. An `EvictionRequirement` contains a list of resources and a `changeRequirement`, which is evaluated by comparing a new recommendation against the currently set resources for a container. | ||
|
||
Here is an example configuration which allows evictions only when CPU or memory get scaled up, but not when they both are scaled down: | ||
|
||
```yaml | ||
updatePolicy: | ||
evictionRequirements: | ||
- resources: ["cpu", "memory"] | ||
changeRequirement: TargetHigherThanRequests | ||
``` | ||
|
||
Note that this doesn't prevent scaling down entirely, as Pods may get recreated for different reasons, resulting in a new recommendation being applied. See [the original AEP](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/enhancements/4831-control-eviction-behavior) for more context and usage information. | ||
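
The evaluation of eviction requirements can be sketched as follows. This Python example is a simplified illustration with hypothetical function names; it assumes that a requirement listing several resources is met when at least one of them satisfies the `changeRequirement`, and that all requirements in the list must be met for an eviction to be allowed.

```python
def requirement_met(requirement, requests, target):
    """True if at least one listed resource changes in the required direction."""
    higher = requirement["changeRequirement"] == "TargetHigherThanRequests"
    for resource in requirement["resources"]:
        if higher and target[resource] > requests[resource]:
            return True
        if not higher and target[resource] < requests[resource]:
            return True
    return False


def eviction_allowed(requirements, requests, target):
    """All requirements must be met (an assumption of this sketch)."""
    return all(requirement_met(r, requests, target) for r in requirements)


requirements = [{"resources": ["cpu", "memory"],
                 "changeRequirement": "TargetHigherThanRequests"}]
requests = {"cpu": 500, "memory": 1000}  # current requests, arbitrary units

# Memory is scaled up, so eviction is allowed even though CPU goes down.
print(eviction_allowed(requirements, requests, {"cpu": 400, "memory": 1200}))  # True
# Both resources are scaled down, so eviction is not allowed.
print(eviction_allowed(requirements, requests, {"cpu": 400, "memory": 900}))   # False
```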
|
||
## Limiting which namespaces are used | ||
|
||
By default the VPA will run against all namespaces. You can limit that behaviour by setting the following options: | ||
|
||
1. `ignored-vpa-object-namespaces` - A comma separated list of namespaces to ignore | ||
1. `vpa-object-namespace` - A single namespace to monitor | ||
|
||
These options are mutually exclusive. | ||
|
||
## Setting the webhook failurePolicy | ||
|
||
It is possible to set the failurePolicy of the webhook to `Fail` by passing `--webhook-failure-policy-fail=true` to the VPA admission controller. | ||
Please use this option with caution, as a failure in the VPA may then break Pod creation. | ||
Use it in conjunction with `--ignored-vpa-object-namespaces=kube-system` or `--vpa-object-namespace` to reduce risk. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Features | ||
|
||
## Contents | ||
|
||
- [Limits control](#limits-control) | ||
|
||
## Limits control | ||
|
||
When setting limits, VPA will conform to | ||
[resource policies](https://github.com/kubernetes/autoscaler/blob/vertical-pod-autoscaler-1.2.1/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1/types.go#L95-L103). | ||
It will maintain the limit-to-request ratio specified for all containers. | ||
|
||
VPA will try to cap recommendations between min and max of | ||
[limit ranges](https://kubernetes.io/docs/concepts/policy/limit-range/). If limit range conflicts | ||
with VPA resource policy, VPA will follow VPA policy (and set values outside the limit | ||
range). | ||
|
||
To disable getting VPA recommendations for an individual container, set `mode` to `"Off"` in `containerPolicies`. |