[Kubernetes] Auto Scaling


깅지수 2022. 5. 27. 20:47

Auto Scaling

* Reference: Resource Management for Pods and Containers | Kubernetes


Resource Request & Limit

1. Request only

2. Limit only

3. Both request and limit

(request <= limit)

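* QoS (Quality of Service) Class:

BestEffort : no container sets any request or limit (lowest priority, evicted first)
Burstable : at least one request or limit is set, but the Pod does not qualify as Guaranteed (e.g. request < limit)
Guaranteed : every container sets both CPU and memory, with request = limit (highest priority, evicted last)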
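The three classes can be sketched as a small classifier. This is a simplified illustration and not the kubelet's actual code: the function name `qos_class` and the dict layout are assumptions made for this example, and quantities are compared as plain strings.

```python
def qos_class(containers):
    """Return the QoS class for a Pod, given a list of container dicts.

    Each container looks like {"requests": {...}, "limits": {...}}.
    Simplification: quantities are compared as strings, so "1" and
    "1000m" would be treated as different values here.
    """
    any_set = False
    guaranteed = True
    for c in containers:
        limits = dict(c.get("limits", {}))
        requests = dict(c.get("requests", {}))
        # A limit without a request implies request == limit.
        for name, value in limits.items():
            requests.setdefault(name, value)
        if requests or limits:
            any_set = True
        # Guaranteed needs both cpu and memory, each with request == limit.
        for name in ("cpu", "memory"):
            if name not in limits or requests.get(name) != limits[name]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{}]))
print(qos_class([{"requests": {"cpu": "200m"}, "limits": {"cpu": "200m"}}]))
```

Note that the second call prints Burstable: setting request = limit for CPU alone is not enough, because memory is missing.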

pod.spec.containers.resources

requests
- cpu
- memory
limits
- cpu
- memory

CPU request & limit : in millicores
e.g. 1500m = 1.5 CPUs, 1000m = 1 CPU
Decimal form also works, e.g. 1.5, 0.1
Memory request & limit : M, G, T (powers of 10) or Mi, Gi, Ti (powers of 2)
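To make the units concrete, here is a minimal sketch that converts these quantity strings to millicores and bytes. The helper names `cpu_to_millicores` and `memory_to_bytes` are made up for this example, and only the suffixes listed in the table below are handled (Kubernetes accepts more).

```python
from decimal import Decimal

# Suffix -> multiplier in bytes. Only a subset of what Kubernetes accepts.
MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def cpu_to_millicores(quantity: str) -> int:
    """'1500m' -> 1500, '1.5' -> 1500, '0.1' -> 100."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(Decimal(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """'200M' -> 200 * 10**6, '200Mi' -> 200 * 2**20."""
    for suffix, factor in MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(Decimal(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: plain bytes


print(cpu_to_millicores("1.5"))   # 1500
print(memory_to_bytes("200M"))    # 200000000
print(memory_to_bytes("200Mi"))   # 209715200
```

The gap between 200M and 200Mi is about 4.9%, so it is worth choosing one style and using it consistently.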

myweb-reqlim.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myweb-reqlim
spec:
  containers:
    - name: myweb
      image: ghcr.io/c1t1d0s7/go-myweb
      resources:
        requests:
          cpu: 200m
          memory: 200M
        limits:
          cpu: 200m
          memory: 200M

 kubectl describe pod myweb-reqlim

QoS Class:                   Guaranteed

- Check CPU/memory usage per node

kubectl top nodes

- Check CPU/memory usage per Pod

kubectl top pods
kubectl top pods -A

- Resource monitoring (infrastructure monitoring)
~~Heapster~~ (the old resource monitoring tool)
→ metrics-server : has no storage, so it only provides current CPU/memory usage
→ Prometheus : current and historical CPU/memory/network/disk monitoring

- Check the total requests/limits per node

kubectl describe nodes node1

A Pod that cannot be scheduled → its CPU request exceeds the CPU the node has available
myweb-big.yaml

apiVersion: v1
kind: Pod
metadata:
  name: myweb-big
spec:
  containers:
    - name: myweb
      image: ghcr.io/c1t1d0s7/go-myweb
      resources:
        limits:
          cpu: 3000m
          memory: 4000M

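→ If only a limit is set (no request): the request is automatically set to the same value as the limit

However, if only a request is set (no limit): no limit is set automatically (unless a LimitRange in the namespace supplies a default)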

HPA : Horizontal Pod Autoscaler

* Reference: Horizontal Pod Autoscaling | Kubernetes


AutoScaling

Pod
- HPA : Horizontal Pod Autoscaler
- VPA : Vertical Pod Autoscaler
Node
- Cluster Autoscaler

HPA : adjusts the replica count of a Deployment, ReplicaSet, or StatefulSet

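Stabilization window (the period during which the HPA holds off on scaling and watches earlier recommendations, so the replica count does not flap)

(default) scale-in: 300 seconds / scale-out: 0 seconds in current releases (older versions used a 180-second scale-out delay)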

desired replicas = ceil[current replicas * ( current metric value / desired metric value )]
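The formula can be worked through with a short sketch. The function name `desired_replicas` is hypothetical, and the clamping to min/max replicas and the 10% tolerance are simplified stand-ins for what the HPA controller does, not a copy of its logic.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """ceil(current_replicas * current_metric / target_metric), clamped."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# 2 replicas at 100% CPU with a 50% target -> scale out to 4
print(desired_replicas(2, 100, 50))
# 4 replicas at 10% CPU with a 50% target -> scale in to 1
print(desired_replicas(4, 10, 50))
```

With `maxReplicas: 10`, a spike to 200% on 5 replicas would ask for 20 but is capped at 10.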

myweb-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myweb-deploy
spec:
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: myweb
          image: ghcr.io/c1t1d0s7/go-myweb:alpine
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 200m
            limits:
              cpu: 200m

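* For the HPA to scale on CPU utilization, the Pods must set at least a CPU request (utilization is measured as a percentage of the request)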
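Why the request is required becomes clear once the utilization is written out: usage is divided by the request. A rough sketch, where `cpu_utilization_percent` is a name invented for this example and the inputs are per-Pod values in millicores:

```python
def cpu_utilization_percent(usage_millicores, request_millicores):
    """Total usage as a percentage of total requests across the Pods."""
    total_request = sum(request_millicores)
    if total_request == 0:
        # Without a request there is nothing to divide by.
        raise ValueError("CPU request must be set to compute utilization")
    return 100 * sum(usage_millicores) // total_request


# Two Pods, each requesting 200m, using 150m and 250m.
print(cpu_utilization_percent([150, 250], [200, 200]))  # 100
```

With the 200m request from myweb-deploy.yaml, a 50% target corresponds to an average of about 100m of CPU per Pod.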

myweb-hpa.yaml

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: myweb-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myweb-deploy

Generate load

kubectl exec <POD> -- sha256sum /dev/zero

myweb-hpa-v2beta2.yaml

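# autoscaling/v2beta2 is no longer served as of Kubernetes 1.26; on newer clusters use autoscaling/v2
apiVersion: autoscaling/v2beta2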
kind: HorizontalPodAutoscaler
metadata:
  name: myweb-hpa
spec:
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myweb-deploy