K8s Insights: Requests and Limits - CKA

As applications grow in complexity and scale, efficient resource management becomes critical to ensure stability, performance, and cost-effectiveness. In a Kubernetes environment, where multiple applications and services run concurrently on shared infrastructure, controlling how resources like CPU and memory are allocated to each application is paramount. This is where Kubernetes' "Requests" and "Limits" come into play.

Kubernetes allows developers and operators to define the minimum and maximum resources each container can use. By setting resource requests and limits, you can ensure that applications get the resources they need to function properly without overstepping their bounds and affecting other applications.

In this blog, we will dive deep into Requests and Limits, exploring their significance and providing hands-on practice to help you implement effective resource management in your Kubernetes environment.

Resource Request

A resource request is the minimum amount of CPU or memory that a container requires. The Kubernetes scheduler uses requests to decide which node to place a Pod on, ensuring that the node has sufficient resources to accommodate the Pod.

  • CPU Requests: Measured in CPU units. One CPU unit is equivalent to one vCPU/Core for cloud providers or one hyperthread on bare-metal Intel processors. For example, 500m represents half a CPU.

  • Memory Requests: Measured in bytes. For example, 128Mi represents 128 mebibytes of memory.
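
For example, the following stanza, placed under a container in a Pod spec, requests half a CPU and 128 MiB of memory (a minimal sketch; the values are illustrative):

resources:
  requests:
    cpu: "500m"
    memory: "128Mi"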

Resource Limits

A resource limit is the maximum amount of CPU or memory that a container is allowed to use. If a container tries to exceed its limit, it may be throttled or terminated, depending on the resource.

  • CPU Limits: The maximum CPU units a container can use. If a container tries to exceed its CPU limit, the kubelet (via the container runtime) throttles its CPU usage; the container is slowed down rather than killed.

  • Memory Limits: The maximum memory a container can use. If a container exceeds its memory limit, it is terminated by the kernel's OOM killer, and the container may be restarted based on the Pod's restart policy.
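
Requests and limits are declared side by side in the same resources block of a container spec. A minimal sketch with illustrative values:

resources:
  requests:
    cpu: "250m"
    memory: "64Mi"
  limits:
    cpu: "500m"
    memory: "128Mi"

With this configuration, the container is guaranteed a quarter of a CPU core and 64 MiB of memory, is throttled if it tries to use more than half a core, and is terminated if it uses more than 128 MiB.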

Why We Need a Metrics Server

Before we dive into hands-on practice, we need to set up a metrics server. The metrics server collects and aggregates resource usage data (CPU and memory) from the nodes and Pods in the cluster. This data is crucial for monitoring the health and performance of your applications, as well as for enabling features like autoscaling.

Setting Up the Metrics Server

Create a file named metric-server.yaml and paste the following content:

apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
    rbac.authorization.k8s.io/aggregate-to-admin: "true"
    rbac.authorization.k8s.io/aggregate-to-edit: "true"
    rbac.authorization.k8s.io/aggregate-to-view: "true"
  name: system:aggregated-metrics-reader
rules:
- apiGroups:
  - metrics.k8s.io
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
rules:
- apiGroups:
  - ""
  resources:
  - nodes/metrics
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server-auth-reader
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server:system:auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    k8s-app: metrics-server
  name: system:metrics-server
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    k8s-app: metrics-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: metrics-server
  name: metrics-server
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxUnavailable: 0
  template:
    metadata:
      labels:
        k8s-app: metrics-server
    spec:
      containers:
      - args:
        - --cert-dir=/tmp
        - --secure-port=10250
        - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
        - --kubelet-use-node-status-port
        - --kubelet-insecure-tls
        - --metric-resolution=15s
        image: registry.k8s.io/metrics-server/metrics-server:v0.7.1
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /livez
            port: https
            scheme: HTTPS
          periodSeconds: 10
        name: metrics-server
        ports:
        - containerPort: 10250
          name: https
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /readyz
            port: https
            scheme: HTTPS
          initialDelaySeconds: 20
          periodSeconds: 10
        resources:
          requests:
            cpu: 100m
            memory: 200Mi
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 1000
          seccompProfile:
            type: RuntimeDefault
        volumeMounts:
        - mountPath: /tmp
          name: tmp-dir
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      volumes:
      - emptyDir: {}
        name: tmp-dir
---
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  labels:
    k8s-app: metrics-server
  name: v1beta1.metrics.k8s.io
spec:
  group: metrics.k8s.io
  groupPriorityMinimum: 100
  insecureSkipTLSVerify: true
  service:
    name: metrics-server
    namespace: kube-system
  version: v1beta1
  versionPriority: 100

Note that the --kubelet-insecure-tls flag skips verification of the kubelets' serving certificates; it is convenient for lab and practice clusters but should not be used in production. Apply the file with:

kubectl apply -f metric-server.yaml

The metrics server will run in the kube-system namespace. You can monitor CPU and memory usage of your nodes using the kubectl top nodes command.
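
Once the metrics server is up (it can take a minute or so before metrics become available), the output will look roughly like this, with node names and numbers varying by cluster:

NAME           CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
controlplane   231m         11%    1413Mi          37%
node01         93m          4%     780Mi           20%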

Creating a Namespace for Hands-On Practice

Create a namespace mem-example for hands-on practice using the following command:

kubectl create namespace mem-example

Example 1: Specify a Memory Request and a Memory Limit

To specify a memory request for a container, include the resources.requests field in the container's resource manifest. To specify a memory limit, include resources.limits.

In this example, we create a Pod that has one container. The container has a memory request of 100 MiB and a memory limit of 200 MiB. Here's the configuration file (mem1.yaml) for the Pod:

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "100Mi"
      limits:
        memory: "200Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]

The args section in the configuration file provides arguments for the container when it starts. The --vm-bytes 150M argument tells the container to attempt to allocate 150 MiB of memory.

Create the Pod:

kubectl apply -f mem1.yaml

Verify that the Pod's container is running:

kubectl get pods -n mem-example

Run kubectl top to fetch the metrics for the pod:

kubectl top pod memory-demo -n mem-example

The output shows that the Pod is using about 150 MiB of memory, which is greater than the Pod's 100 MiB request but within its 200 MiB limit.
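
The output will look roughly like this, with the exact numbers varying from run to run:

NAME          CPU(cores)   MEMORY(bytes)
memory-demo   162m         150Mi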

Delete the Pod:

kubectl delete pod memory-demo -n mem-example

Example 2: Exceed a Container's Memory Limit

A Container can exceed its memory request if the Node has memory available. But a Container is not allowed to use more than its memory limit. If a Container allocates more memory than its limit, the Container becomes a candidate for termination. If the Container continues to consume memory beyond its limit, the Container is terminated. If a terminated Container can be restarted, the kubelet restarts it, as with any other type of runtime failure.

In this exercise, we create a Pod that attempts to allocate more memory than its limit. Here is the configuration file for a Pod that has one Container with a memory request of 50 MiB and a memory limit of 100 MiB:

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-2
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-2-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "50Mi"
      limits:
        memory: "100Mi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]

In the args section of the configuration file (mem2.yaml), the container attempts to allocate 250 MiB of memory, well above the 100 MiB limit.

Create the Pod:

kubectl apply -f mem2.yaml -n mem-example

View the Pod's status:

kubectl get pods -n mem-example

At this point, the Container might be running or killed. Repeat the preceding command until the Container is killed.

The output shows that the Container was killed because it is out of memory (OOM).
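
The output will look something like this, with the restart count and age varying:

NAME            READY   STATUS      RESTARTS   AGE
memory-demo-2   0/1     OOMKilled   1          24s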

We can also describe the Pod for more detailed information:

kubectl describe pod memory-demo-2 -n mem-example
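
In the describe output, the container's last state should show the OOM kill, roughly like this excerpt:

Last State:     Terminated
  Reason:       OOMKilled
  Exit Code:    137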

Delete the Pod:

kubectl delete pod memory-demo-2 -n mem-example

Example 3: Specify a Memory Request That Is Too Big for Your Nodes

In this example, we create a Pod whose memory request is so big that it exceeds the capacity of any Node in the cluster. Here is the configuration file (mem3.yaml) for a Pod with one container that requests, and is limited to, 1000 GiB of memory:

apiVersion: v1
kind: Pod
metadata:
  name: memory-demo-3
  namespace: mem-example
spec:
  containers:
  - name: memory-demo-3-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "1000Gi"
      limits:
        memory: "1000Gi"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]

Create the Pod:

kubectl apply -f mem3.yaml

View the Pod's status:

kubectl get pod -n mem-example

The output shows that the Pod status is Pending. That is, the Pod is not scheduled to run on any Node, and it will remain in the Pending state indefinitely.

Describe the Pod:

kubectl describe pod memory-demo-3 -n mem-example

In the Events section of the output, you will see that the Pod cannot be scheduled because the Nodes have insufficient memory:
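
The Events section will contain a FailedScheduling warning along these lines (the node count and age will differ):

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  42s   default-scheduler  0/1 nodes are available: 1 Insufficient memory.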

Delete the Pod:

kubectl delete pod memory-demo-3 -n mem-example

If We Do Not Specify a Memory Limit

If we do not specify a memory limit for a Container, one of the following situations applies:

  • The Container has no upper bound on the amount of memory it uses. The Container could use all of the memory available on the Node where it is running which in turn could invoke the OOM Killer. Further, in case of an OOM Kill, a container with no resource limits will have a greater chance of being killed.

  • The Container is running in a namespace that has a default memory limit, and the Container is automatically assigned the default limit. Cluster administrators can use a LimitRange to specify a default value for the memory limit.
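
As a sketch, a LimitRange like the following (applied to the mem-example namespace; the 512Mi and 256Mi values are illustrative defaults) gives every container in the namespace a default memory limit and request:

apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
  namespace: mem-example
spec:
  limits:
  - default:
      memory: "512Mi"
    defaultRequest:
      memory: "256Mi"
    type: Container

With this in place, any container created in the namespace without an explicit memory limit is automatically assigned a 512 MiB limit and a 256 MiB request.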

Clean up

Delete your namespace. This deletes all the Pods that you created for this task:

kubectl delete namespace mem-example

Conclusion

Requests and Limits are powerful features in Kubernetes that help ensure your applications get the resources they need while maintaining cluster stability and performance. By configuring these values correctly, you can optimize resource utilization, prevent resource contention, and improve the reliability of your applications. In this blog, we covered the fundamentals of Requests and Limits, set up a metrics server for resource monitoring, and walked through practical examples to solidify your understanding.