Commit 58af0b0

Merge pull request #14 from Phantom-Intruder/karpenter-tuning
Karpenter tuning
2 parents 6599104 + 2bd938d commit 58af0b0

31 files changed: +4432 −9 lines changed

Autoscaler101/autoscaler-lab.md

Lines changed: 181 additions & 0 deletions

# Lab

You will need a Kubernetes cluster. A single-node [Minikube cluster](https://minikube.sigs.k8s.io/docs/start/) will do just fine. Once the cluster is set up, you will have to install the metrics server, since the autoscalers use it to read resource usage metrics. To do this, run:

```
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
```
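
If you are on Minikube, enabling its bundled metrics-server addon is an alternative to the manifest above. Either way, a quick `kubectl top` call confirms the metrics pipeline is working (it may take a minute or two after installation before metrics appear):

```
# Alternative on Minikube: enable the bundled addon
minikube addons enable metrics-server

# Verify that resource metrics are being collected
kubectl top nodes
```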

We will start with a base application on which the scaling will be performed. In this case, we will use a sample nginx deployment. Create a file `nginx-deployment.yaml` and paste the below contents into it:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx-container
        image: nginx:1.21.5
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
```

This will start nginx containers that request at least 100m CPU and 128Mi memory each, but are not allowed to use more than 200m CPU and 256Mi memory. It will also start a service that points to this deployment on port 80. Deploy this application onto your Kubernetes cluster:

```
kubectl apply -f nginx-deployment.yaml
```
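
Before moving on, you can sanity-check that all three replicas came up, using the `app` label set in the manifest:

```
kubectl get pods -l app=nginx
```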

Now, when the application reaches its CPU or memory limit, application performance will suffer, since the application is not allowed to go beyond that limit. So let's introduce the autoscaler. We will start with the vertical pod autoscaler. Create a new file called `nginx-vpa.yaml` and paste the below contents into it:

```
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
    - containerName: "*" # Apply policies to all containers in the pod
      minAllowed:
        cpu: 50m
        memory: 64Mi
      maxAllowed:
        cpu: 500m
        memory: 512Mi
```

The resource itself is fairly self-explanatory. The `spec` section contains the specifications for the VPA:

- `targetRef` specifies the workload that the VPA is targeting for autoscaling. In this example, it's targeting a Deployment named "nginx-deployment".
- `updatePolicy` configures the update mode. In "Auto" mode, the VPA automatically applies the recommended changes to the pod resources without manual intervention.
- `resourcePolicy` specifies the resource policies for individual containers within the pod. Within it, the `containerPolicies` section defines per-container policies; in this case, it uses a wildcard ("*") to apply the policies to all containers in the pod.
- `minAllowed` specifies the minimum allowed resources; the VPA won't recommend going below these values. Here, the minimum allowed CPU is 50 milliCPU (50m) and the minimum allowed memory is 64 mebibytes (64Mi).
- `maxAllowed` specifies the maximum allowed resources; the VPA won't recommend going above these values. Here, the maximum allowed CPU is 500 milliCPU (500m) and the maximum allowed memory is 512 mebibytes (512Mi).
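
Note that unlike the HPA, the VPA is not built into Kubernetes, so the `VerticalPodAutoscaler` resource will only be recognized if the VPA components are installed in your cluster. If they aren't, the usual way to install them is from the kubernetes/autoscaler repository (shown below; the script path is from that repository's standard layout):

```
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
```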

Now deploy this into the Kubernetes cluster:

```
kubectl apply -f nginx-vpa.yaml
```
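
Once the VPA has had some time to observe the workload, you can inspect the recommendations it has produced; the `Status` section of the output lists the recommended target and bound values per container:

```
kubectl describe vpa nginx-vpa
```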

Once the deployment is complete, we need to load-test the deployment to see the VPA in action. An important thing to note here is that if you set the VPA's memory/CPU bounds too low, the pods will be evicted and recreated with new resource values almost immediately upon creation, since the limits will be reached as soon as the pods come up. This is why it is important to be aware of your average and peak loads before you begin implementing the VPA.

To load-test the deployment, we will be using Apache Benchmark. Install it with `apt` (package `apache2-utils`) or `yum` (package `httpd-tools`). You can do the installation on the Kubernetes node you started, since the service's ClusterIP is only reachable from within the cluster. Next, note down the URL you want to load-test. To get this, use:

```
kubectl get svc
```

This will list all the services. Pick the nginx service from this list, copy its IP, and use Apache Benchmark as below:

```
ab -n 1000 -c 50 http://<nginx-service-ip>/
```

This command will send 1000 requests with a concurrency of 50 to the NGINX service. You can adjust the `-n` (total requests) and `-c` (concurrency) parameters based on your specific load-testing requirements. You can then analyze the results. Apache Benchmark will provide detailed output, including requests per second (RPS), connection times, and more. For example:

```
Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    1   2.8      0      10
Processing:   104  271 144.3    217    1184
Waiting:      104  270 144.2    217    1184
Total:        104  272 144.5    217    1185
```

Now it's time to check if autoscaling has started:

```
kubectl get po -n default
```
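
To keep an eye on actual consumption while the load test runs, `kubectl top` (backed by the metrics server installed earlier) is useful:

```
kubectl top pods
```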

Watch the pods, and you will see that once the resource limits are reached, the pod is evicted and a new pod with more resources is created in its place. Keep an eye on the resource usage and you will notice that the new pods have higher limits. Once the requests have been handled, the pod's resource consumption will drop immediately. However, a new pod with lower resource requirements will not immediately show up to replace the old pod. In fact, if you were to push a new version of the deployment into the cluster, it would still have headroom for a large number of requests. This allocation will eventually be reduced if resource consumption continues to stay low.
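
If you want to confirm what a recreated pod actually requested, you can read the resources straight off the pod spec (the pod name below is a placeholder for one of your nginx pods):

```
kubectl get pod <nginx-pod-name> -o jsonpath='{.spec.containers[0].resources}'
```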

Now that we have gotten a complete look at the vertical pod autoscaler, let's take a look at the HPA. Create a file `nginx-hpa.yaml` and paste the below contents into it.

```
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 80
```

The above HPA definition has a lot of similarities to the VPA definition. The differences lie in:

- `minReplicas` and `maxReplicas`, which define the minimum and maximum number of pod replicas that the HPA should maintain. In this case, it's set to have a minimum of 1 replica and a maximum of 5 replicas.
- The `metrics` section, which the VPA didn't have, although its `resourcePolicy` section is fairly similar. This configures the metric used for autoscaling; in this example, it's CPU utilization. `type: Resource` specifies that the metric is a resource metric (in this case, CPU), and the `resource` section gives the metric details: `name: cpu` indicates that the metric is CPU utilization. The `target` section specifies the target value for the metric, where `type: Utilization` indicates that the target is based on resource utilization and `averageUtilization` sets the target average CPU utilization to 80%.
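
As a side note, if you later want finer control over how aggressively the HPA scales, `autoscaling/v2` also supports an optional `behavior` section. It is not part of this lab's manifest, but a sketch of what could be added under `spec` in `nginx-hpa.yaml` looks like this:

```
  # Optional: wait 5 minutes of sustained low utilization before scaling down,
  # which helps avoid replica flapping under bursty load
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
```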

Before you deploy this file into your cluster, make sure to remove the VPA, since having two types of autoscalers running for the same pod can cause some obvious problems. So first run:

```
kubectl delete -f nginx-vpa.yaml
```

Then deploy the HPA:

```
kubectl apply -f nginx-hpa.yaml
```

You can see the status of the HPA as it starts up using `describe`:

```
kubectl describe hpa nginx-hpa
```

You might see some errors about the HPA being unable to retrieve metrics; however, these can be ignored, since this issue only occurs when the HPA starts up for the first time. Now, let's go back to Apache Benchmark and add load to the nginx service so that we can see the HPA in action. Start it up in the same manner as before:

```
ab -n 1000 -c 50 http://<nginx-service-ip>/
```
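
While the load is running, you can also watch the HPA itself; the `TARGETS` column shows current versus target CPU utilization, and `REPLICAS` shows the scaling decisions as they happen:

```
kubectl get hpa nginx-hpa --watch
```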

A thousand requests should start being sent to the service. Start watching the nginx pods to see if replicas are being created:

```
kubectl get po -n default --watch
```

You should be able to see the CPU utilization target getting exceeded, after which the number of pods will increase. This will keep happening until the number of pods reaches the maximum specified value (5) or the average CPU utilization drops back below the 80% target.

## Conclusion

That sums up the lab on autoscalers. Here, we discussed the two most commonly used in-built autoscalers, the HPA and the VPA, and took a hands-on look at how they work. This is just the tip of the iceberg when it comes to scaling, however, and the subject of custom scalers that can scale based on metrics other than memory and CPU is vast. If you are interested in more advanced scaling techniques, you could take a look at the [KEDA section](../Keda101/what-is-keda.md) to get an idea of the KEDA autoscaler.
