Horizontal Pod Autoscaling with Kubernetes using an external metric (cluster worker node count)
In the world of Kubernetes and microservices, you might have heard of two types of scaling:
1 - Vertical (adding more power to existing nodes, i.e. more CPU and memory)
2 - Horizontal (adding more instances of a resource, such as worker nodes, to handle the demand)
In this article, we will look at an example of horizontal scaling using the Kubernetes HorizontalPodAutoscaler (HPA), which autoscales a Kubernetes deployment based on some underlying metric. In this case the resource being autoscaled is a Kubernetes pod: its replica count will increase or decrease with demand.
Requirement: consider a deployment running an nginx container that we need to scale based on the number of worker nodes in the cluster. If there are 2 worker nodes, nginx should scale to 2 replicas; with 3 worker nodes, 3 replicas; and so on.
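For context, the nginx deployment being scaled might look like the following. This is a minimal sketch; the name `nginx` and namespace `kube-system` are assumed to match the HPA definition below, and the label names are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: kube-system
spec:
  replicas: 2            # the HPA will take over managing this field
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:stable
        ports:
        - containerPort: 80
```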
A typical HPA definition looks like this:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
  namespace: kube-system
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 3
  metrics:
  - type: External
    external:
      metric:
        name: node_scale_indicator
      target:
        type: Value
        value: 2
where:
- scaleTargetRef: the target deployment that needs to be autoscaled (nginx)
- minReplicas / maxReplicas: the lower and upper bounds on the replica count the HPA will manage
- metrics.type: the type of metric the HPA bases its autoscaling decision on. Here we use an External metric; in this example it comes from kube-state-metrics.
- metric.name: the name of the external metric
- target.value: if the metric's current value is higher or lower than this target, the HPA scales up or down respectively.
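Under the hood, for a Value-type target the HPA controller uses the standard scaling formula desiredReplicas = ceil(currentReplicas * currentMetricValue / targetValue), clamped to the min/max bounds from the spec. A minimal sketch of that calculation (the function name and argument names are illustrative, not part of any Kubernetes API):

```python
import math

def desired_replicas(current_replicas, metric_value, target_value,
                     min_replicas, max_replicas):
    # Core HPA formula: scale proportionally to how far the metric
    # is above or below the target.
    desired = math.ceil(current_replicas * metric_value / target_value)
    # Clamp to the minReplicas/maxReplicas bounds in the HPA spec.
    return max(min_replicas, min(desired, max_replicas))

# With target value 2 (as in the HPA above): when the node-count metric
# reads 3 and we currently run 2 replicas, the HPA scales up to 3.
print(desired_replicas(2, 3, 2, 2, 3))  # -> 3
```

This is why, with the target set to 2, the replica count tracks the worker node count within the 2-to-3 bounds: the ratio metric/target rises above 1 as nodes are added, and the controller scales the deployment up accordingly.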
How HPA works / Requirements for HPA to work
1. If we want to use the HPA based on resource (CPU and memory) metrics: