You may get the following message while creating the WebLogic domain: "the job status is not Completed!"
status on iteration 20 of 20
pod domain1-create-weblogic-sample-domain-job-nj7wl status is Init:0/1
The create domain job is not showing status completed after waiting 300 seconds.
Check the log output for errors.
Error from server (BadRequest): container "create-weblogic-sample-domain-job" in pod "domain1-create-weblogic-sample-domain-job-nj7wl" is waiting to start: PodInitializing
[ERROR] Exiting due to failure - the job status is not Completed!
You can get further error details by running kubectl describe pod, as shown here:
$ kubectl describe pod <your-pod-name>
This is an output example:
$ kubectl describe pod domain1-create-weblogic-sample-domain-job-nj7wl
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 4m2s default-scheduler Successfully assigned default/domain1-create-weblogic-sample-domain-job-qqv6k to aks-nodepool1-58449474-vmss000001
Warning FailedMount 119s kubelet, aks-nodepool1-58449474-vmss000001 Unable to mount volumes for pod "domain1-create-weblogic-sample-domain-job-qqv6k_default(15706980-73cb-11ea-b804-b2c91b494b00)": timeout expired waiting for volumes to attach or mount for pod "default"/"domain1-create-weblogic-sample-domain-job-qqv6k". list of unmounted volumes=[weblogic-sample-domain-storage-volume]. list of unattached volumes=[create-weblogic-sample-domain-job-cm-volume weblogic-sample-domain-storage-volume weblogic-credentials-volume default-token-zr7bq]
Warning FailedMount 114s (x9 over 4m2s) kubelet, aks-nodepool1-58449474-vmss000001 MountVolume.SetUp failed for volume "wls-azurefile" : Couldn't get secret default/azure-secrea
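When the Events table is long, it can help to keep only the warning rows. The following is a minimal sketch of one way to do that. The function name warning_events is hypothetical, and the sketch assumes only that each event row starts with its Type column, as in the example above.

```shell
# Keep only the rows of a `kubectl describe pod` Events table whose
# Type column is "Warning". Reads the describe output on stdin.
# warning_events is a hypothetical helper name, not part of kubectl.
warning_events() {
  awk '$1 == "Warning"'
}

# Usage (needs cluster access, so it is left commented out):
# kubectl describe pod <your-pod-name> | warning_events
```

Filtering on the first column keeps the full message text of each warning, so mount and scheduling failures stay readable.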
Fail to access Administration Console
Here are some common reasons for this failure, along with some tips to help you investigate.
Create WebLogic domain job fails
Check the deployment log and find the failure details with kubectl describe pod <your-pod-name>. For more information, see Getting pod error details.
Process of starting the servers is still running
Run kubectl get svc. If domainUID-admin-server, domainUID-managed-server1, and domainUID-managed-server2 are not listed,
wait longer for the Administration Server to start.
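The service check described above can be scripted. This is a sketch under stated assumptions: the helper name missing_services is hypothetical, the three service suffixes are the ones named in this section, and the input is the output of kubectl get svc --no-headers, where the first column is the service name.

```shell
# List the expected WebLogic services that are not present yet.
# Reads `kubectl get svc --no-headers` output on stdin; $1 is the domain UID.
# missing_services is a hypothetical helper name.
missing_services() {
  uid="$1"
  present=$(awk '{ print $1 }')   # first column is the service name
  for suffix in admin-server managed-server1 managed-server2; do
    # -x matches the whole line, -F treats the name as a literal string
    printf '%s\n' "$present" | grep -qxF -- "${uid}-${suffix}" ||
      echo "${uid}-${suffix}"
  done
}

# Usage (needs cluster access, so it is left commented out):
# kubectl get svc --no-headers | missing_services domain1
```

Empty output means all three services exist. Any names printed are the ones still missing.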
The following output is an example of when the Administration Server has started.
If services are up but the WLS Administration Console is still not available, use kubectl describe domain to check domain status.
$ kubectl describe domain domain1
Make sure the status of cluster-1 is ServersReady and Available. The status of admin-server, managed-server1, and managed-server2 should be RUNNING. Otherwise, the cluster is likely still in the process of becoming fully ready.
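As a rough sketch, the server states can also be checked without reading the full describe output. The helper name servers_not_running is hypothetical. The JSONPath field names in the usage comment (.status.servers, .serverName, .state) are assumptions about the Domain resource status layout, so verify them against your operator version before relying on them.

```shell
# Print servers whose state is not RUNNING.
# Reads "<serverName> <state>" pairs on stdin, one per line.
# servers_not_running is a hypothetical helper name.
servers_not_running() {
  awk 'NF >= 2 && $2 != "RUNNING" { print $1 ": " $2 }'
}

# Usage (assumed field names, needs cluster access, left commented out):
# kubectl get domain domain1 \
#   -o jsonpath='{range .status.servers[*]}{.serverName}{" "}{.state}{"\n"}{end}' \
#   | servers_not_running
```

Empty output means every listed server reports RUNNING.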
For some suggestions for debugging problems with Model in Image after your Domain YAML file is deployed, see Debugging.
WSL2 bad timestamp
If you are running with WSL2, you may run into the bad timestamp issue, which blocks Azure CLI. You may see the following error:
$ kubectl get pod
Unable to connect to the server: x509: certificate has expired or is not yet valid: current time 2020-11-25T15:58:10+08:00 is before 2020-11-27T04:25:04Z
You can run the following commands to update the WSL2 system time:
# Fix the outdated system time
$ sudo hwclock -s
# Check the system time
$ date
Fri Nov 27 13:07:14 CST 2020
Timeout for the operator installation
You may run into a timeout while installing the operator and get the following error:
If your version of WIT is older than 1.9.8, running ./imagetool/bin/imagetool.sh fails when Docker BuildKit is enabled.
Here is the error message:
failed to solve with frontend dockerfile.v0: failed to create LLB definition: failed to parse stage name "WDT_BUILD": invalid reference format: repository name must be lowercase
To resolve the error, either upgrade WIT to version 1.9.8 or later, or disable Docker BuildKit with the following commands and run the imagetool command again.
$ export DOCKER_BUILDKIT=0
$ export COMPOSE_DOCKER_CLI_BUILD=0
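To decide whether the workaround applies, you can compare your WIT version against 1.9.8. This is a minimal sketch, assuming GNU sort is available for its -V version-sort option. The helper name version_lt and the wit_version variable are hypothetical, and you supply the version string yourself.

```shell
# Return success when version $1 is strictly older than version $2.
# Relies on `sort -V` (GNU coreutils version sort).
# version_lt is a hypothetical helper name.
version_lt() {
  [ "$1" != "$2" ] &&
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

wit_version="1.9.7"   # placeholder: replace with your installed WIT version
if version_lt "$wit_version" "1.9.8"; then
  # Older WIT: disable BuildKit before running the imagetool command again.
  export DOCKER_BUILDKIT=0
  export COMPOSE_DOCKER_CLI_BUILD=0
fi
```

Version sort is used instead of a plain string comparison so that a version such as 1.10.0 is correctly treated as newer than 1.9.8.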
WebLogic Kubernetes Operator installation failure
Two known cases block the operator installation:
The system pods in the AKS cluster are pending.
The operator image is unavailable.
Follow these steps to dig into the error.
The AKS cluster system pods are pending
If system pods in the AKS cluster are pending, it will block the operator installation.
The following example shows this error, with the warning message "no nodes available to schedule pods".
$ kubectl describe pod weblogic-operator-f86b879fd-v2xrz -n sample-weblogic-operator-ns
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 71s (x25 over 36m) default-scheduler no nodes available to schedule pods
If you run into this error, remove the AKS cluster and create a new one.
Then run kubectl get pod -A to make sure all the system pods are running.
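Scanning the full pod list by eye is error-prone on a busy cluster. The sketch below prints only the pods that are not yet healthy. The helper name not_running_pods is hypothetical, and it assumes the default column order of kubectl get pod -A --no-headers (NAMESPACE, NAME, READY, STATUS, and so on).

```shell
# Print pods whose STATUS is neither Running nor Completed.
# Reads `kubectl get pod -A --no-headers` output on stdin, where
# column 1 is the namespace, column 2 the pod name, column 4 the status.
# not_running_pods is a hypothetical helper name.
not_running_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}

# Usage (needs cluster access, so it is left commented out):
# kubectl get pod -A --no-headers | not_running_pods
```

Empty output means every pod is either Running or Completed. Pods listed as Pending are the ones that can block the operator installation.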
The operator image is unavailable
If the pod status shows ErrImagePull, use docker pull to check the operator image. If the pull fails, switch to an operator version later than 3.1.1.
First, find the objectId of the service principal used when the AKS cluster was created. You will need the output from az ad sp create-for-rbac, which you were directed to save to a file. Within the output, you need the value of the name property. It will start with http. Get the objectId with this command.
$ az ad sp show --id http://<your-name-from-the-saved-output> | grep objectId
BadRequestError: Operation failed with status: 'Bad Request'. Details: Virtual Machine size: 'Standard_DS2_v2' is not supported for subscription subscription-id in location 'eastus'. The available VM sizes are 'basic_a0,basic_a1,basic_a2,basic_a3,basic_a4,standard_a2'. Please refer to aka.ms/aks-vm-sizes for the details.
ResourceNotFoundError: The Resource 'Microsoft.ContainerService/managedClusters/wlsaks1613726008' under resource group 'wlsresourcegroup1613726008' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix
As shown in the example, you can use standard_a2. Check the CPU and memory that this size provides and make sure it meets your requirements.
exec /weblogic-operator/scripts/introspectDomain.sh: exec format error