Jenkins jobs intermittently fail to launch in Kubernetes

Jenkins setup:

We need your assistance to resolve an issue where Jenkins jobs intermittently fail to launch in Kubernetes. The following error was observed and we suspect the problem may be related to the connection between Jenkins and Kubernetes.

Error while submitting jobs:

  • failed to start websocket connection: io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.

Error in the system logs:

  • WARNING org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch

We run Jenkins under AWS EKS and it works rather well, albeit with a lot of tweaks on how to manage CPU and RAM resources.

I might not be able to help you much, but I would recommend that you share more details about your setup, such as:

  • Kubernetes version
  • native OS of the k8s nodes
  • Jenkins version
  • plugins and their versions
  • Jenkins agent version
  • the YAML of your pods
  • and your overall Kubernetes configuration in Jenkins

One issue we faced many years ago was that our EC2 nodes were running Amazon Linux, which had a known, ancient kernel bug that froze the network for several minutes before recovering. We switched these EC2 nodes from Amazon Linux to Ubuntu and all the intermittent issues went away. Red Hat-based Linux distributions such as Amazon Linux are often plagued by bugs that are no longer present in fresher distributions, so I avoid them as much as possible.

I strongly recommend that you familiarize yourself with cgroups v1 and v2, and make sure that your pipelines run a background script that reports per-container kernel, cache and RSS memory usage, to help you understand why pipelines might get killed for apparently no reason.
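
To give an idea of what such a reporting script could look like, here is a minimal sketch in Python. It is not the script we run. It assumes the cgroup filesystem is mounted at /sys/fs/cgroup inside the container and that the container sees its own cgroup there, which depends on your runtime. The function names are made up for the example. On cgroup v2 the "kernel" counter only exists on recent kernels, so the sketch falls back to slab plus kernel_stack when it is missing.

```python
import time
from pathlib import Path

CGROUP_ROOT = Path("/sys/fs/cgroup")  # assumption: usual mount point in the container


def parse_memory_stat(text):
    """Parse the 'key value' lines of a cgroup memory.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def summarize(stats, is_v2):
    """Map version-specific counters onto rss / cache / kernel byte counts."""
    if is_v2:
        # v2 names: "anon" is roughly rss, "file" is the page cache.
        kernel = stats.get("kernel", stats.get("slab", 0) + stats.get("kernel_stack", 0))
        return {"rss": stats.get("anon", 0), "cache": stats.get("file", 0), "kernel": kernel}
    # v1 memory.stat has rss and cache; kernel memory is filled in by read_usage().
    return {"rss": stats.get("rss", 0), "cache": stats.get("cache", 0), "kernel": 0}


def read_usage(root=CGROUP_ROOT):
    """Read the current memory usage from whichever cgroup version is mounted."""
    is_v2 = (root / "cgroup.controllers").exists()
    base = root if is_v2 else root / "memory"
    usage = summarize(parse_memory_stat((base / "memory.stat").read_text()), is_v2)
    if not is_v2:
        kmem = base / "memory.kmem.usage_in_bytes"
        if kmem.exists():
            usage["kernel"] = int(kmem.read_text().strip())
    return usage


def monitor(interval_s=30):
    """Print one line per interval; meant to be started in the background."""
    while True:
        u = read_usage()
        print(f"mem rss={u['rss']} cache={u['cache']} kernel={u['kernel']}", flush=True)
        time.sleep(interval_s)
```

You would start monitor() in the background at the beginning of the pipeline so the last lines in the build log show what memory looked like just before the container was killed.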

You will want to be careful if you are using git with a monorepo, as git commands can easily get your agents OOM-killed for no apparent reason. The git processes eagerly use as much RAM as they can. This can be mitigated by setting up a strict configuration for git, either hardcoded in the container or through a fake git installer that is called before the first git call. We ended up with a fake installer that creates a fake git shell script, which massages the git commands to work around design bugs in the git plugin for Jenkins.
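
As a rough illustration of the wrapper idea (ours is a shell script, and this is not it), here is a sketch in Python that puts memory-related settings in front of every git command using git's -c option. The config keys are real git settings, but the values, the REAL_GIT path and the function names are assumptions for the example; you would have to tune them for your repository.

```python
import os
import sys

REAL_GIT = "/usr/bin/git"  # assumption: where the real git binary lives

# Illustrative limits only, not the values we use.
MEMORY_LIMITS = {
    "core.packedGitWindowSize": "32m",
    "core.packedGitLimit": "256m",
    "pack.windowMemory": "128m",
    "pack.deltaCacheSize": "64m",
    "pack.threads": "1",
}


def build_git_command(args, real_git=REAL_GIT, limits=MEMORY_LIMITS):
    """Return the real git command line with '-c key=value' overrides prepended."""
    cmd = [real_git]
    for key, value in limits.items():
        cmd += ["-c", f"{key}={value}"]
    return cmd + list(args)


def main(argv):
    """Replace this process with the real git, keeping the caller's arguments."""
    cmd = build_git_command(argv[1:])
    os.execv(cmd[0], cmd)

# To use this as a wrapper, call main(sys.argv) at the end of the file.
```

The file would be installed as an executable named git that comes before the real one on the agent's PATH, so the Jenkins git plugin calls it without knowing. Passing the settings with -c keeps them independent of whatever gitconfig the plugin or the job writes.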