Pod memory and CPU resources

Introduction

The CPU and memory requests and limits for WebLogic Server Pods usually need to be tuned; the optimal values depend on your workload, applications, and Kubernetes environment. Configure requests and limits based on the expected traffic during peak usage. For example:

  • Tune CPU and memory high enough to handle expected peak workloads for applications that require large amounts of in-memory processing or very CPU intensive calculations.
  • Tune memory high enough that JMS can efficiently cache message backlogs in memory for WebLogic JMS messaging applications that generate large backlogs of unprocessed persistent or non-persistent messages.
  • CPU requirements are sometimes significantly higher when a WebLogic Server is starting. This means that a low CPU allocation that might be suitable for light runtime workloads risks causing unacceptably slow startup times.

Requirements vary considerably between use cases. You may need to experiment and make adjustments based on monitoring resource usage in your environment.

The operator creates a container in its own Pod for each domain’s WebLogic Server instances and for the short-lived introspector job that is automatically launched before WebLogic Server Pods are launched. You can tune container memory and CPU usage by configuring Kubernetes resource requests and limits, and you can tune WebLogic JVM heap usage using the USER_MEM_ARGS environment variable in your Domain YAML file.

By default, the introspector job pod uses the same CPU and memory settings as the domain’s WebLogic Administration Server pod. Similarly, the operator-created init containers in the introspector job pod for Auxiliary Images based domains use the same CPU and memory settings as the domain’s WebLogic Administration Server pod. Beginning with operator version 4.0.5, you can override the settings of the introspector job pod using the domain.spec.introspector.serverPod element.

A resource request sets the minimum amount of a resource that a container requires. A resource limit is the maximum amount of a resource a container is given and prevents a container from using more than its share of a resource. Additionally, resource requests and limits determine a Pod’s quality of service.

This FAQ discusses tuning these parameters so WebLogic Server instances run efficiently.

Setting resource requests and limits in a Domain or Cluster resource

You can set Kubernetes memory and CPU requests and limits in a Domain or Cluster YAML file using its domain.spec.serverPod.resources stanza, and you can override the setting for individual WebLogic Server instances using the serverPod.resources element in domain.spec.adminServer or domain.spec.managedServers. You can override the setting for member servers of a cluster using the cluster.spec.serverPod element. Note that the introspector job pod uses the same settings as the WebLogic Administration Server pod. Beginning with operator version 4.0.5, you can override the settings of the introspector job pod using the domain.spec.introspector.serverPod element.
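
For example, a minimal sketch of an introspector job pod override (the resource values are illustrative):

  spec:
    introspector:
      serverPod:
        resources:
          requests:
            cpu: "500m"
            memory: "768Mi"
          limits:
            cpu: "1"
            memory: "1Gi"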

Values set in the .serverPod stanza for a more specific type of pod override the same values set for a more general type of pod, and inherit any other values set in the more general stanza. The domain.spec.adminServer.serverPod, domain.spec.managedServers.serverPod, and cluster.spec.serverPod stanzas all inherit from and override the domain.spec.serverPod stanza. When a domain.spec.managedServers.serverPod stanza refers to a pod that is part of a cluster, it inherits from and overrides its cluster’s cluster.spec.serverPod settings (if any), which in turn inherit from and override the domain’s domain.spec.serverPod settings.

  spec:
    serverPod:
      resources:
        requests:
          cpu: "250m"
          memory: "768Mi"
        limits:
          cpu: "2"
          memory: "2Gi"

Limits and requests for CPU resources are measured in CPU units. One CPU, in Kubernetes, is equivalent to 1 vCPU/Core for cloud providers and 1 hyperthread on bare-metal Intel processors. An m suffix in a CPU attribute indicates ‘milli-CPU’, so 250m is 25% of a CPU.

Memory can be expressed in various units, where one Mi is one IEC mebibyte (1024^2 bytes) and one Gi is one IEC gibibyte (1024^3 bytes).
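
For example, building on the domain-level settings above, the following sketch (with illustrative values) gives the Administration Server its own resource settings; Managed Server Pods continue to use the domain-level domain.spec.serverPod values:

  spec:
    serverPod:
      resources:
        requests:
          cpu: "250m"
          memory: "768Mi"
        limits:
          cpu: "2"
          memory: "2Gi"
    adminServer:
      serverPod:
        resources:
          requests:
            cpu: "500m"
            memory: "1Gi"
          limits:
            cpu: "2"
            memory: "3Gi"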

See also Managing Resources for Containers, Assign Memory Resources to Containers and Pods and Assign CPU Resources to Containers and Pods in the Kubernetes documentation.

Determining Pod Quality of Service

A Pod’s Quality of Service (QoS) is based on whether it’s configured with resource requests and limits:

  • Best Effort QoS (lowest priority): If you don’t configure any requests or limits for a Pod’s containers, then the Pod is given a best-effort QoS. In cases where a Node runs out of non-shareable resources, the default out-of-resource eviction policy evicts running Pods with the best-effort QoS first.

  • Burstable QoS (medium priority): If you configure both resource requests and limits for a Pod, and set the requests to be less than their respective limits, then the Pod will be given a burstable QoS. Similarly, if you only configure resource requests (without limits) for a Pod, then the Pod QoS is also burstable. If a Node runs out of non-shareable resources, the Node’s kubelet will evict burstable Pods only when there are no more running best-effort Pods.

  • Guaranteed QoS (highest priority): If you set a Pod’s requests and limits to equal values, then the Pod will have a guaranteed QoS. These settings indicate that your Pod will consume a fixed amount of memory and CPU. With this configuration, if a Node runs out of non-shareable resources, then the Node’s kubelet will evict best-effort and burstable QoS Pods before terminating guaranteed QoS Pods.

For most use cases, Oracle recommends configuring WebLogic Pods with memory and CPU requests and limits, and furthermore, setting requests equal to their respective limits to ensure a guaranteed QoS.
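
For example, a guaranteed QoS configuration might look like the following sketch (the values are illustrative):

  spec:
    serverPod:
      resources:
        requests:
          cpu: "1"
          memory: "2Gi"
        limits:
          cpu: "1"       # equal to the CPU request
          memory: "2Gi"  # equal to the memory request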

In later versions of Kubernetes, it is possible to fine-tune scheduling and eviction policies using Pod Priority Preemption in combination with the serverPod.priorityClassName Domain field. Note that Kubernetes already ships with two PriorityClasses, system-cluster-critical and system-node-critical, which are used to ensure that critical components are always scheduled first.
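
For example, a minimal sketch of using a PriorityClass (the class name and value are hypothetical; your cluster administrator may define different classes):

  apiVersion: scheduling.k8s.io/v1
  kind: PriorityClass
  metadata:
    name: wls-server-priority    # hypothetical class name
  value: 100000
  globalDefault: false
  description: "Priority class for WebLogic Server Pods."

Then reference the class from the Domain resource:

  spec:
    serverPod:
      priorityClassName: wls-server-priority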

Java heap size and memory resource considerations

Oracle recommends configuring Java heap sizes for WebLogic JVMs instead of relying on the defaults. For detailed information about memory settings when running WLST from the pod where the WLS server is running, see Use kubectl exec.

Importance of setting heap size and memory resources

It’s extremely important to set correct heap sizes, memory requests, and memory limits for WebLogic JVMs and Pods.

A WebLogic JVM heap must be sufficiently sized to run its applications and services, but should not be sized so large that it wastes memory resources.

A Pod memory limit must be sufficiently sized to accommodate the configured heap and native memory requirements, but should not be sized so large that it wastes memory resources. If a JVM’s memory usage (the sum of its heap and native memory) exceeds its Pod’s limit, then the JVM process will be abruptly killed due to an out-of-memory error, and the WebLogic container will consequently restart automatically due to a liveness probe failure.

Oracle recommends setting minimum and maximum heap (or heap percentages) and at least a container memory request.

If resource requests and limits are set too high, then your Pods may not be scheduled due to a lack of Node resources. Oversized settings also unnecessarily consume CPU and memory that other Pods could use, and may prevent other Pods from running.

Default heap sizes

With recent Java versions (Java 8 update 191 and later, or Java 11), if you don’t configure a heap size (no -Xms or -Xmx), the default heap size is dynamically determined:

  • If you configure the memory limit for a container, then the JVM default maximum heap size will be 25% (1/4th) of container memory limit and the default minimum heap size will be 1.56% (1/64th) of the limit value.

    In this case, the default JVM heap settings are often too conservative because the WebLogic JVM is the only major process running in the container. (See the worked example after this list.)

  • If no memory limit is configured, then the JVM default maximum heap size will be 25% (1/4th) of its Node’s machine RAM and the default minimum heap size will be 1.56% (1/64th) of the RAM.

    In this case, the default JVM heap settings can have undesirable behavior, including using unnecessary amounts of memory to the point where it might affect other Pods that run on the same Node.
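
To illustrate the first case, the following sketch shows the approximate defaults for a 2Gi container memory limit (the limit value is illustrative):

  spec:
    serverPod:
      resources:
        limits:
          memory: "2Gi"
  # With no explicit heap settings, the JVM defaults would be approximately:
  #   maximum heap = 1/4 of 2Gi  = 512Mi
  #   minimum heap = 1/64 of 2Gi = 32Mi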

Configuring heap size

If you specify Pod memory limits, Oracle recommends configuring WebLogic Server heap sizes as a percentage. The JVM will interpret the percentage as a fraction of the limit. This is done using the JVM -XX:InitialRAMPercentage and -XX:MaxRAMPercentage options in the USER_MEM_ARGS Domain environment variable. For example:

  spec:
    serverPod:
      env:
      - name: USER_MEM_ARGS
        value: "-XX:InitialRAMPercentage=25.0 -XX:MaxRAMPercentage=50.0 -Djava.security.egd=file:/dev/./urandom"

Additionally, a Node Manager process runs in the same container as the WebLogic Server and has its own heap and native memory requirements. Its heap is tuned by using -Xms and -Xmx in the NODEMGR_MEM_ARGS environment variable. Oracle recommends setting the Node Manager heap to fixed sizes, instead of percentages; the default tuning is usually sufficient.

Notice that the NODEMGR_MEM_ARGS, USER_MEM_ARGS, and WLST_EXTRA_PROPERTIES environment variables all include -Djava.security.egd=file:/dev/./urandom by default. This helps to speed up the Node Manager and WebLogic Server startup on systems with low entropy, plus similarly helps to speed up introspection job usage of the WLST encrypt command. We have included this property in the previous example for specifying a custom USER_MEM_ARGS value to preserve this speedup. See the environment variable defaults documentation for more information.

In some cases, you might only want to configure memory resource requests but not configure memory resource limits. In such scenarios, you can use the traditional fixed heap size settings (-Xms and -Xmx) in your WebLogic Server USER_MEM_ARGS instead of the percentage settings (-XX:InitialRAMPercentage and -XX:MaxRAMPercentage).
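
In that case, a minimal sketch (the heap sizes are illustrative) might look like:

  spec:
    serverPod:
      env:
      - name: USER_MEM_ARGS
        value: "-Xms512m -Xmx1024m -Djava.security.egd=file:/dev/./urandom"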

CPU resource considerations

It’s important to set both a CPU request and a limit for WebLogic Server Pods. This ensures that all WebLogic Server Pods have enough CPU resources, and, as discussed earlier, if the request and limit are set to the same value, then they get a guaranteed QoS. A guaranteed QoS ensures that the Pods are handled with a higher priority during scheduling and as such, are the least likely to be evicted.

If a CPU request and limit are not configured for a WebLogic Server Pod:

  • The Pod can end up using all the CPU resources available on its Node and starve other containers from using shareable CPU cycles.

  • The WebLogic Server JVM may choose an unsuitable garbage collection (GC) strategy.

  • A WebLogic Server self-tuning work-manager may incorrectly optimize the number of threads it allocates for the default thread pool.

It’s also important to keep in mind that if you request a CPU core count larger than the core count of your largest Node, then the Pod will never be scheduled. Suppose you have a Pod that needs 4 cores but your Kubernetes cluster is made up of 2-core VMs; in this case, your Pod will never be scheduled and will show a Pending status. For example:

$ kubectl get pod sample-domain1-managed-server1 -n sample-domain1-ns
NAME                              READY   STATUS    RESTARTS   AGE
sample-domain1-managed-server1    0/1     Pending   0          65s
$ kubectl describe pod sample-domain1-managed-server1 -n sample-domain1-ns
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  16s (x3 over 26s)  default-scheduler  0/2 nodes are available: 2 Insufficient cpu.

Operator sample heap and resource configuration

The operator samples configure non-default minimum and maximum heap sizes for WebLogic Server JVMs of at least 256MB and 512MB respectively. You can edit a sample’s template, or the serverPod.env USER_MEM_ARGS value in its Domain or Cluster YAML file, to use different settings. See Configuring heap size.

Similarly, the operator samples configure CPU and memory resource requests to at least 250m and 768Mi respectively.

There’s no memory or CPU limit configured by default in the samples, so the default QoS for sample WebLogic Server Pods is burstable.

If you wish to set resource requests or limits differently on a sample Domain or Cluster YAML file or template, see Setting resource requests and limits in a Domain or Cluster resource. Or, for samples that generate their Domain resource using an “inputs” YAML file, see the serverPodMemoryRequest, serverPodMemoryLimit, serverPodCpuRequest, and serverPodCpuLimit parameters in the sample’s create-domain.sh inputs file.
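
For example, a sketch of the relevant parameters in a sample’s create-domain.sh inputs file (the values are illustrative):

  serverPodMemoryRequest: "768Mi"
  serverPodMemoryLimit: "2Gi"
  serverPodCpuRequest: "250m"
  serverPodCpuLimit: "2"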

Burstable pods and JDK active processor count calculation

If you have burstable Pods that configure CPU resource requests but no CPU limits, then the JDK can incorrectly calculate the active processor count. The JDK interprets the value of the --cpu-shares parameter (which maps to spec.containers[].resources.requests.cpu) as a limit on how many CPUs the current process can use, as explained in JDK-8288367. This might cause the JVM to use fewer CPUs than are available, leading to underutilization of CPU resources when running in a Kubernetes environment. Updating the JDK to a newer version (JDK 8u371 or JDK 11.0.17, or later) fixes this.

To override the number of CPUs that the JVM automatically detects and uses when creating threads for various subsystems, use the -XX:ActiveProcessorCount Java option.
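
For example, a minimal sketch of overriding the detected processor count through USER_MEM_ARGS (the count of 2 and the heap percentages are illustrative):

  spec:
    serverPod:
      env:
      - name: USER_MEM_ARGS
        value: "-XX:ActiveProcessorCount=2 -XX:InitialRAMPercentage=25.0 -XX:MaxRAMPercentage=50.0 -Djava.security.egd=file:/dev/./urandom"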

Configuring CPU affinity

A Kubernetes hosted WebLogic Server may exhibit high lock contention in comparison to an on-premises deployment. This lock contention may be due to a lack of CPU cache affinity or scheduling latency when workloads move between different CPU cores.

In an on-premises deployment, CPU cache affinity, and therefore reduced lock contention, can be achieved by binding the WLS Java process to particular CPU cores (for example, by using the taskset command).

In a Kubernetes deployment, similar cache affinity can be achieved by doing the following:

  • Ensuring a Pod’s CPU resource request and limit are set and equal (to ensure a guaranteed QoS).
  • Configuring the kubelet CPU manager policy to be static (the default is none), as shown in the sketch after this list. See Control CPU Management Policies on the Node. Note that some Kubernetes environments may not allow changing the CPU management policy.
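
For reference, a minimal sketch of the relevant kubelet configuration fragment (assuming your environment lets you manage the kubelet configuration; other settings that the static policy requires, such as reserved CPUs, are omitted):

  apiVersion: kubelet.config.k8s.io/v1beta1
  kind: KubeletConfiguration
  cpuManagerPolicy: static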

Measuring JVM heap, Pod CPU, and Pod memory

You can monitor JVM heap, Pod CPU, and Pod memory using Prometheus and Grafana. Also, see Tools for Monitoring Resources in the Kubernetes documentation.

References

  1. Managing Resources for Containers in the Kubernetes documentation.
  2. Assign Memory Resources to Containers and Pods in the Kubernetes documentation.
  3. Assign CPU Resources to Containers and Pods in the Kubernetes documentation.
  4. Pod Priority Preemption in the Kubernetes documentation.
  5. GCP Kubernetes best practices: Resource requests and limits.
  6. Tools for Monitoring Resources in the Kubernetes documentation.
  7. Blog – Docker support in Java 8. (Discusses Java container support in general.)
  8. Blog – Kubernetes Patterns: Capacity Planning.