This issue appeared after migrating Jenkins from version 2.319.1 to 2.361.1.
It does not occur on Jenkins 2.319.1; it only occurs on Jenkins 2.361.1.
The error occurs intermittently: the Kubernetes JNLP container is terminated with exit code 255.
With Kubernetes plugin version 1.31 on Jenkins 2.319.1, pod creation is retried, while with plugin version 3704.va_08f0206b_95e on Jenkins 2.361.1 it is not retried.
Is there any configuration parameter in the plugin that controls this behavior?
Also, is there a parameter in the plugin to configure how long Jenkins waits for the pod to come up?
Jenkins 2.319.1 with Kubernetes plugin version 1.31 has no timeout options configured in its JVM arguments.
Jenkins 2.361.1 with Kubernetes plugin version 3704.va_08f0206b_95e has these arguments configured: -Dkubernetes.websocket.timeout=60000 -Dorg.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout=60000
It would be handier next time if you could paste your anonymized logs in a text format.
It seems that there are some changes between the Kubernetes plugin versions and Jenkins versions that might be causing the issue. It’s possible that the Kubernetes plugin version 3704.va_08f0206b_95e has different default behavior or configuration compared to version 1.31 that is causing the pod creation to fail.
Regarding your question about the pod, there is a configuration parameter in the Kubernetes plugin called podRetention, set on the pod template (or at the cloud level). It controls whether a pod is retained after a build has finished (for example never, on failure, or always), which is mainly useful for inspecting pods after a failure.
In addition, you can try increasing the timeout value using the -Dkubernetes.websocket.timeout argument in the Jenkins JVM options. This argument increases the timeout value for WebSocket connections between Jenkins and the Kubernetes cluster.
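For reference, here is a minimal sketch of how such JVM options are typically passed to a Jenkins controller running in Kubernetes. The Deployment name, labels and image tag are placeholders, not taken from your setup:

```yaml
# Sketch only: a stripped-down Jenkins controller Deployment.
# The flags are passed through JAVA_OPTS, which the official
# jenkins/jenkins image appends to the JVM command line.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: jenkins                  # placeholder name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: jenkins
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
        - name: jenkins
          image: jenkins/jenkins:2.361.1-lts   # placeholder tag, use your own image
          env:
            - name: JAVA_OPTS
              value: >-
                -Dkubernetes.websocket.timeout=60000
                -Dorg.csanchez.jenkins.plugins.kubernetes.pipeline.ContainerExecDecorator.websocketConnectionTimeout=60000
```

If the controller runs outside Kubernetes, the same flags go wherever your service definition sets the Jenkins JVM options.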
Thank you very much for the answer, and welcome to the community! Next time I'll paste the text instead of the image.
Would you know where that podRetention parameter is configured?
Thanks again.
With Jenkins version 2.319.1 and Kubernetes plugin version 1.31 we can see in the log:
Created Pod: kubernetes namespace/pod-name-wb39h
[Normal][namespace/pod-name-wb39h][Scheduled] Successfully assigned namespace/pod-name-wb39h to aks-npworker-XXXXXXXXXXX
[Normal][namespace/pod-name-wb39h][Pulled] Container image "image1" already present on machine
[Normal][namespace/pod-name-wb39h][Created] Created container builder
[Normal][namespace/pod-name-wb39h][Started] Started container builder
[Normal][namespace/pod-name-wb39h][Pulled] Container image "image2" already present on machine
[Normal][namespace/pod-name-wb39h][Created] Created container azure-cli
[Normal][namespace/pod-name-wb39h][Started] Started container azure-cli
[Normal][namespace/pod-name-wb39h][Pulled] Container image "image3" already present on machine
[Normal][namespace/pod-name-wb39h][Created] Created container maven
[Normal][namespace/pod-name-wb39h][Started] Started container maven
[Normal][namespace/pod-name-wb39h][Pulled] Container image "image4" already present on machine
[Normal][namespace/pod-name-wb39h][Created] Created container security
[Normal][namespace/pod-name-wb39h][Started] Started container security
With that version, pod creation is retried when necessary. With the other version this log does not appear and there is no retry, so pod creation and the pipeline fail. In other words, with the old version, if pod creation fails it is retried until the pod is created.
Hi all, after a support ticket with Microsoft, it turns out our problem is in pod-to-pod communication caused by the use of Pod Identity. Our cluster scales up and down heavily, which causes delays because of Pod Identity. That functionality is already deprecated and Microsoft recommends migrating to Workload Identity.
In previous versions of Jenkins and the plugin the error was not visible because the plugin internally retried until the pod was fully up. With the version indicated above those retries are not performed, which makes the problem evident. I hope this information is useful to you.
Hi @juandyego1983, so is this issue resolved on your side? Did you make any changes to your pod template as well, or was it entirely on the infrastructure side?
I have an AWS environment and am still looking for a workaround.
Hi, on Azure Kubernetes Service we had to modify the pod YAML to use Workload Identity instead of Pod Identity. To use Azure Workload Identity you have to add the workload identity labels, as in the sketch below.
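A minimal sketch of the standard Azure Workload Identity wiring; the client ID, names, namespace and image are placeholders, not the exact values from our setup:

```yaml
# Service account used by the Jenkins agent pods, annotated with the
# client ID of the Azure managed identity / Entra ID application.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: jenkins-agent            # placeholder name
  namespace: jenkins             # placeholder namespace
  annotations:
    azure.workload.identity/client-id: "<MANAGED_IDENTITY_CLIENT_ID>"
---
# Fragment of the agent pod: the label tells the AKS workload identity
# webhook to inject the federated token and environment variables.
apiVersion: v1
kind: Pod
metadata:
  labels:
    azure.workload.identity/use: "true"
spec:
  serviceAccountName: jenkins-agent
  containers:
    - name: jnlp
      image: jenkins/inbound-agent   # placeholder image
```

The azure.workload.identity/use label is what triggers the AKS webhook; the rest of the pod spec stays as it was.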