Kubernetes plugin cloud's default agent fails to save jgit config file

Jenkins setup:

  • Jenkins in Kubernetes (k3s) installed using helm chart
  • Kubernetes version: Kubernetes v1.29.6+k3s1 using k3s
  • Jenkins: 2.452.2
  • OS: Linux - 5.14.0-427.13.1.el9_4.x86_64
  • Java: 17.0.11 - Eclipse Adoptium (OpenJDK 64-Bit Server VM)
  • Kubernetes plugin version: 4253.v7700d91739e5
  • Jenkins default inbound-agent version: jenkins/inbound-agent:3248.v65ecb_254c298-2

Problem:
Jenkins fails to save the jgit config file with the default agent container. Below is the log from the jnlp container:

SEVERE: Cannot save config file 'FileBasedConfig[/home/jenkins/?/.config/jgit/config]'
java.io.IOException: Creating directories for /home/jenkins/?/.config/jgit failed
	at upgrade-remoting-to-3244.vf7f977e04755-or-higher 0x6be340e4//org.eclipse.jgit.util.FileUtils.mkdirs(FileUtils.java:421)
	at upgrade-remoting-to-3244.vf7f977e04755-or-higher 0x6be340e4//org.eclipse.jgit.internal.storage.file.LockFile.lock(LockFile.java:144)
	at upgrade-remoting-to-3244.vf7f977e04755-or-higher 0x6be340e4//org.eclipse.jgit.storage.file.FileBasedConfig.save(FileBasedConfig.java:201)
	at upgrade-remoting-to-3244.vf7f977e04755-or-higher 0x6be340e4//org.eclipse.jgit.util.FS$FileStoreAttributes.saveToConfig(FS.java:776)
	at upgrade-remoting-to-3244.vf7f977e04755-or-higher 0x6be340e4//org.eclipse.jgit.util.FS$FileStoreAttributes.lambda$5(FS.java:458)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)

The build reports a different error:

Caused by: java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
	at okhttp3.internal.ws.RealWebSocket.checkUpgradeSuccess$okhttp(RealWebSocket.kt:224)
	at okhttp3.internal.ws.RealWebSocket$connect$1.onResponse(RealWebSocket.kt:170)
	... 4 more
Failed to start websocket connection: io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.

This is probably due to the agent pod exiting.

Additional note:
This is the pod template that produces the errors:

apiVersion: "v1"
kind: "Pod"
metadata:
  annotations:
    kubernetes.jenkins.io/last-refresh: "1720149310303"
    buildUrl: <redacted>
    runUrl: <redacted>
  labels:
    jenkins: "slave"
    jenkins/label-digest: "6e0eab9e99499f49f08616d4cfcdb07ce91f4c84"
    jenkins/label: <redacted>
    kubernetes.jenkins.io/controller: "http___jenkins_jenkins_8080x"
  name: <redacted>
  namespace: "jenkins"
spec:
  containers:
  - command:
    - "cat"
    image: "bitnami/git:2.45.2"
    name: "git"
    resources:
      limits:
        memory: "2500Mi"
        cpu: "2000m"
      requests:
        memory: "125Mi"
        cpu: "100m"
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - "ALL"
      runAsNonRoot: true
      seccompProfile:
        type: "RuntimeDefault"
    tty: true
    volumeMounts:
    - mountPath: "/home/jenkins/agent"
      name: "workspace-volume"
      readOnly: false
  - env:
    - name: "JENKINS_SECRET"
      value: "********"
    - name: "JENKINS_TUNNEL"
      value: "jenkins-agent.jenkins:50000"
    - name: "JENKINS_AGENT_NAME"
      value: <redacted>
    - name: "REMOTING_OPTS"
      value: "-noReconnectAfter 1d"
    - name: "JENKINS_NAME"
      value: <redacted>
    - name: "JENKINS_AGENT_WORKDIR"
      value: "/home/jenkins/agent"
    - name: "JENKINS_URL"
      value: "http://jenkins.jenkins:8080/"
    image: "jenkins/inbound-agent:3248.v65ecb_254c298-2"
    name: "jnlp"
    resources:
      requests:
        memory: "256Mi"
        cpu: "100m"
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - "ALL"
      runAsNonRoot: true
      seccompProfile:
        type: "RuntimeDefault"
    volumeMounts:
    - mountPath: "/home/jenkins/agent"
      name: "workspace-volume"
      readOnly: false
  nodeSelector:
    kubernetes.io/os: "linux"
  restartPolicy: "Never"
  securityContext:
    fsGroup: 100
    runAsGroup: 100
    runAsUser: 100
  volumes:
  - emptyDir:
      medium: ""
    name: "workspace-volume"

I’ve solved the jgit problem: the cause was the security context values fsGroup, runAsGroup, and runAsUser. The jnlp container runs as uid 1000, so changing them to 1000 works. The remaining problem is in the build log: io.fabric8.kubernetes.client.http.WebSocketHandshakeException, with Caused by: java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'. The jnlp log does not show anything.
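
For the jgit fix, the pod-level security context from the template above would change roughly as follows. This is a minimal sketch that assumes all three IDs are set to 1000, matching the uid of the jenkins user in the jenkins/inbound-agent image; the original post only says that changing the value to 1000 works.

```yaml
# Pod-level securityContext fragment (goes under "spec:" in the pod template).
# Assumption: all three IDs are changed from 100 to 1000.
securityContext:
  fsGroup: 1000     # group ownership applied to mounted volumes such as workspace-volume
  runAsGroup: 1000  # primary gid of the container processes
  runAsUser: 1000   # uid of the jenkins user in the inbound-agent image
```

With a uid that has no matching user in the image, Java cannot resolve a home directory, which is consistent with the "?" in the path /home/jenkins/?/.config/jgit in the log.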

The error Caused by: java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden' was solved by adding these RBAC rules to the role:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: jenkins
  name: jenkins-agent
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
  resources: ["pods/exec"]
  verbs: ["create","delete","get","list","patch","update","watch"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get","list","watch"]
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get"]
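
A Role only takes effect once it is bound to the account that Jenkins uses to talk to the Kubernetes API. The sketch below shows one way such a binding might look. The service account name jenkins and the binding name jenkins-agent are assumptions, not taken from the original post; adjust them to whatever the helm chart created.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: jenkins
  name: jenkins-agent        # hypothetical binding name
subjects:
- kind: ServiceAccount
  name: jenkins              # assumption: service account used by the Jenkins controller
  namespace: jenkins
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: jenkins-agent        # the Role defined above
```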