Kubernetes Node Autoscaling in AWS EKS

Anupam Mahapatra
Jan 25, 2021

An autoscaler is a component that automatically adjusts the size of a Kubernetes cluster so that all pods have a place to run and there are no unneeded nodes.
In this article, we assume that an EKS cluster is already running in AWS with applications deployed in it. By default, the cluster does not add nodes when the Kubernetes control plane cannot find room to schedule a pod. To enable this, we will deploy the Kubernetes Cluster Autoscaler, whose image is published on Google Container Registry (k8s.gcr.io).

1. Allow the node IAM role to autoscale the cluster.

This is done by adding the following policy to the IAM role attached to the managed node group:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup",
        "autoscaling:DescribeTags",
        "ec2:DescribeLaunchTemplateVersions"
      ],
      "Resource": ["*"]
    }
  ]
}
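
Before attaching the policy, it can help to confirm that none of the actions listed above was dropped while editing. The following is a minimal sketch, not part of the original setup: the function name `missing_actions` is hypothetical, and the check only compares literal action names, so it does not understand wildcards such as `autoscaling:*` or any `Resource` scoping.

```python
import json

# The seven actions granted by the policy in step 1.
REQUIRED_ACTIONS = {
    "autoscaling:DescribeAutoScalingGroups",
    "autoscaling:DescribeAutoScalingInstances",
    "autoscaling:DescribeLaunchConfigurations",
    "autoscaling:SetDesiredCapacity",
    "autoscaling:TerminateInstanceInAutoScalingGroup",
    "autoscaling:DescribeTags",
    "ec2:DescribeLaunchTemplateVersions",
}


def missing_actions(policy_json):
    """Return the required actions that no Allow statement lists literally.

    Hypothetical helper: wildcards and Resource restrictions are ignored.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM allows a single statement object
        statements = [statements]

    granted = set()
    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):  # IAM allows a single action string
            actions = [actions]
        granted.update(actions)
    return sorted(REQUIRED_ACTIONS - granted)


if __name__ == "__main__":
    # A deliberately incomplete policy, used only to exercise the helper.
    partial = json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["autoscaling:DescribeAutoScalingGroups"],
            "Resource": ["*"],
        }],
    })
    print(missing_actions(partial))
```

An empty list means every action from step 1 appears in at least one Allow statement.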

2. Deploy the Kubernetes Cluster Autoscaler application.

This application watches the Kubernetes API server for pods that cannot be scheduled and for the resources requested by the pods on each node. If a new pod is waiting to be scheduled and no node has enough free resources for it, the application requests an additional node by raising the desired capacity of the Auto Scaling group. The request succeeds because the autoscaler pod runs on a node whose IAM role has the permissions from step 1.
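
The scale-up condition described above can be illustrated with a toy model. This is a simplified sketch and not the autoscaler's actual algorithm: the real component reacts to pods the scheduler has already marked unschedulable and simulates scheduling against each node group, while the hypothetical `needs_new_node` below only compares one pod's CPU and memory requests with the free capacity of each node.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Resources:
    """CPU in millicores and memory in MiB (illustrative units)."""
    cpu_millis: int
    memory_mib: int


def fits(request: Resources, free: Resources) -> bool:
    """A pod fits on a node only if both CPU and memory are available."""
    return (request.cpu_millis <= free.cpu_millis
            and request.memory_mib <= free.memory_mib)


def needs_new_node(pod_request: Resources,
                   free_per_node: Iterable[Resources]) -> bool:
    """True when no existing node has room for the pod's requests."""
    return not any(fits(pod_request, free) for free in free_per_node)


if __name__ == "__main__":
    pod = Resources(cpu_millis=500, memory_mib=512)
    # One node is short on CPU, the other is short on memory.
    nodes = [Resources(200, 1024), Resources(1000, 256)]
    print(needs_new_node(pod, nodes))
```

Note that the decision is driven by resource requests, not by actual usage, so pods without requests do not trigger a scale-up in this model.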

---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
  name: cluster-autoscaler
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["events", "endpoints"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: [""]
    resources: ["pods/status"]
    verbs: ["update"]
  - apiGroups: [""]
    resources: ["endpoints"]
    resourceNames: ["cluster-autoscaler"]
    verbs: ["get", "update"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["watch", "list", "get", "update"]
  - apiGroups: [""]
    resources:
      - "pods"
      - "services"
      - "replicationcontrollers"
      - "persistentvolumeclaims"
      - "persistentvolumes"
    verbs: ["watch", "list", "get"]
  - apiGroups: ["extensions"]
    resources: ["replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["policy"]
    resources: ["poddisruptionbudgets"]
    verbs: ["watch", "list"]
  - apiGroups: ["apps"]
    resources: ["statefulsets", "replicasets", "daemonsets"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["watch", "list", "get"]
  - apiGroups: ["batch", "extensions"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create"]
  - apiGroups: ["coordination.k8s.io"]
    resourceNames: ["cluster-autoscaler"]
    resources: ["leases"]
    verbs: ["get", "update"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["create", "list", "watch"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["cluster-autoscaler-status", "cluster-autoscaler-priority-expander"]
    verbs: ["delete", "get", "update", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-autoscaler
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    k8s-addon: cluster-autoscaler.addons.k8s.io
    k8s-app: cluster-autoscaler
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: cluster-autoscaler
subjects:
  - kind: ServiceAccount
    name: cluster-autoscaler
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - image: k8s.gcr.io/autoscaling/cluster-autoscaler:v1.17.3
          name: cluster-autoscaler
          resources:
            limits:
              cpu: 100m
              memory: 300Mi
            requests:
              cpu: 100m
              memory: 300Mi
          command:
            - ./cluster-autoscaler
            - --v=4
            - --stderrthreshold=info
            - --cloud-provider=aws
            - --skip-nodes-with-local-storage=false
            - --expander=least-waste
            - --nodes=1:10:<AUTO-SCALING-GROUP-NAME-OF-THE-NODE-GROUP>
          volumeMounts:
            - name: ssl-certs
              mountPath: /etc/ssl/certs/ca-certificates.crt
              readOnly: true
          imagePullPolicy: "Always"
      volumes:
        - name: ssl-certs
          hostPath:
            path: "/etc/ssl/certs/ca-bundle.crt"

Note: In the template, edit the following argument:

--nodes=1:10:<AUTO-SCALING-GROUP-NAME-OF-THE-NODE-GROUP>

Replace the placeholder with the name of the Auto Scaling group behind the dedicated node group.
The 1:10 gives the minimum and maximum size of the Auto Scaling group, in that order. Make these the same as the values specified at the Auto Scaling group level. They are set in the node group definition in the CloudFormation template with the parameters:
NodeAutoScalingGroupMinSize
NodeAutoScalingGroupMaxSize
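
Since the value of --nodes has the form min:max:name, a mismatch with the Auto Scaling group is easy to introduce by hand. The sketch below is only an illustration under stated assumptions: `parse_nodes_flag` and `matches_asg` are hypothetical helpers, the group name `my-nodegroup-asg` is made up, and the sizes passed to `matches_asg` stand in for whatever you set in NodeAutoScalingGroupMinSize and NodeAutoScalingGroupMaxSize.

```python
from typing import NamedTuple


class NodeGroupSpec(NamedTuple):
    min_size: int
    max_size: int
    name: str


def parse_nodes_flag(value: str) -> NodeGroupSpec:
    """Parse the '<min>:<max>:<asg-name>' value passed to --nodes."""
    parts = value.split(":", 2)
    if len(parts) != 3 or not parts[2]:
        raise ValueError(f"expected <min>:<max>:<name>, got {value!r}")
    min_size, max_size = int(parts[0]), int(parts[1])
    if min_size < 0 or max_size < min_size:
        raise ValueError(f"invalid size range {min_size}:{max_size}")
    return NodeGroupSpec(min_size, max_size, parts[2])


def matches_asg(spec: NodeGroupSpec, asg_min: int, asg_max: int) -> bool:
    """Check the flag against the sizes configured on the Auto Scaling group."""
    return spec.min_size == asg_min and spec.max_size == asg_max


if __name__ == "__main__":
    # Hypothetical group name; substitute your own Auto Scaling group.
    spec = parse_nodes_flag("1:10:my-nodegroup-asg")
    print(spec)
    print(matches_asg(spec, asg_min=1, asg_max=10))
```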

3. Customization:

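The scaling application exposes the following parameters. The values shown are the ones reported by a running autoscaler, so most are defaults, while some (for example --nodes, --cloud-provider, --expander and --v) reflect the arguments passed to that deployment. Any of them can be overridden in the container command of the Deployment, or in the chart values if you install with Helm.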

--address=":8085"
--alsologtostderr="false"
--aws-use-static-instance-list="false"
--balance-similar-node-groups="false"
--cloud-config=""
--cloud-provider="aws"
--cloud-provider-gce-l7lb-src-cidrs="130.211.0.0/22,35.191.0.0/16"
--cloud-provider-gce-lb-src-cidrs="130.211.0.0/22,209.85.152.0/22,209.85.204.0/22,35.191.0.0/16"
--cluster-name=""
--cores-total="0:320000"
--estimator="binpacking"
--expander="least-waste"
--expendable-pods-priority-cutoff="-10"
--filter-out-schedulable-pods-uses-packing="true"
--gpu-total="[]"
--ignore-daemonsets-utilization="false"
--ignore-mirror-pods-utilization="false"
--ignore-taint="[]"
--kubeconfig=""
--kubernetes=""
--leader-elect="true"
--leader-elect-lease-duration="15s"
--leader-elect-renew-deadline="10s"
--leader-elect-resource-lock="leases"
--leader-elect-resource-name=""
--leader-elect-resource-namespace=""
--leader-elect-retry-period="2s"
--log-backtrace-at=":0"
--log-dir=""
--log-file=""
--log-file-max-size="1800"
--logtostderr="true"
--max-autoprovisioned-node-group-count="15"
--max-bulk-soft-taint-count="10"
--max-bulk-soft-taint-time="3s"
--max-empty-bulk-delete="10"
--max-failing-time="15m0s"
--max-graceful-termination-sec="600"
--max-inactivity="10m0s"
--max-node-provision-time="15m0s"
--max-nodes-total="0"
--max-total-unready-percentage="45"
--memory-total="0:6400000"
--min-replica-count="0"
--namespace="kube-system"
--new-pod-scale-up-delay="0s"
--node-autoprovisioning-enabled="false"
--node-deletion-delay-timeout="2m0s"
--node-group-auto-discovery="[]"
--nodes="[1:3:inf-ss-eks-cluster01-managednodegroup-autoscaler01]"
--ok-total-unready-count="3"
--regional="false"
--scale-down-candidates-pool-min-count="50"
--scale-down-candidates-pool-ratio="0.1"
--scale-down-delay-after-add="10m0s"
--scale-down-delay-after-delete="0s"
--scale-down-delay-after-failure="3m0s"
--scale-down-enabled="true"
--scale-down-gpu-utilization-threshold="0.5"
--scale-down-non-empty-candidates-count="30"
--scale-down-unneeded-time="10m0s"
--scale-down-unready-time="20m0s"
--scale-down-utilization-threshold="0.5"
--scale-up-from-zero="true"
--scan-interval="10s"
--skip-headers="false"
--skip-log-headers="false"
--skip-nodes-with-local-storage="false"
--skip-nodes-with-system-pods="true"
--stderrthreshold="0"
--unremovable-node-recheck-timeout="5m0s"
--v="4"
--vmodule=""
--write-status-configmap="true"
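
To make one of these settings concrete, the sketch below shows how --scale-down-utilization-threshold="0.5" could be read. It assumes, as a simplification, that a node's utilization is the larger of its CPU and memory request ratios (requested divided by allocatable); the function names are hypothetical. A node below the threshold is only a candidate: the autoscaler also has to be able to move its pods elsewhere, and the node must stay unneeded for --scale-down-unneeded-time before it is removed.

```python
def node_utilization(cpu_requested_m: int, cpu_allocatable_m: int,
                     mem_requested_mib: int, mem_allocatable_mib: int) -> float:
    """Utilization from pod requests, taking the larger of CPU and memory.

    Assumption for this sketch: requests, not live usage, drive the ratio.
    """
    cpu_ratio = cpu_requested_m / cpu_allocatable_m
    mem_ratio = mem_requested_mib / mem_allocatable_mib
    return max(cpu_ratio, mem_ratio)


def is_scale_down_candidate(utilization: float, threshold: float = 0.5) -> bool:
    """A node is a candidate when its utilization is below the threshold."""
    return utilization < threshold


if __name__ == "__main__":
    # 400m of 2000m CPU and 3072 MiB of 8192 MiB memory requested.
    quiet = node_utilization(400, 2000, 3072, 8192)
    # 1200m of 2000m CPU and 1024 MiB of 8192 MiB memory requested.
    busy = node_utilization(1200, 2000, 1024, 8192)
    print(quiet, is_scale_down_candidate(quiet))
    print(busy, is_scale_down_candidate(busy))
```

Raising the threshold makes the autoscaler consider busier nodes for removal, and lowering it keeps more spare capacity around.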
