InterruptedException with Docker Workflow Plugin and large images

Dear community,

I'm having trouble using large Docker images (e.g., the texlive Docker image, 4.74 GB) with the Docker Workflow plugin.

This is my job:

 pipeline {
     //environment {
     //    CLIENT_TIMEOUT = 1200
     //}
     agent {
         docker {
           //image 'ghcr.io/kjarosh/latex:2025.1-minimal'
           image 'haproxy.lan:5000/texlive:latest'
           label 'podman'
           args '-u root --entrypoint=""'
         }
     }

     stages {
         stage('runInDocker') {
             steps {
                sh 'latexmk -v && exit 1'
             }
         }
     }
 }

The error I’m currently investigating:

Started by user in0rdr

[Pipeline] Start of Pipeline
[Pipeline] node
Running on jenkins-podman-39814ab83b18
 in /home/jenkins/workspace/test
[Pipeline] {
[Pipeline] isUnix
[Pipeline] withEnv
[Pipeline] {
[Pipeline] sh
+ docker inspect -f . haproxy.lan:5000/texlive:latest
.
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] withDockerContainer
jenkins-podman-39814ab83b18 does not seem to be running inside a container
$ docker run -t -d -u 1312:1312 -u root --entrypoint= -w /home/jenkins/workspace/test -v /home/jenkins/workspace/test:/home/jenkins/workspace/test:rw,z -v /home/jenkins/workspace/test@tmp:/home/jenkins/workspace/test@tmp:rw,z -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** -e ******** haproxy.lan:5000/texlive:latest cat
[Pipeline] // withDockerContainer
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Also:   org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: 00224b7f-785a-4704-9253-7cfab56e6ef4
java.lang.InterruptedException
	at java.base/java.lang.Object.wait0(Native Method)
	at java.base/java.lang.Object.wait(Unknown Source)
	at hudson.remoting.Request.call(Request.java:179)
	at hudson.remoting.Channel.call(Channel.java:1107)
	at hudson.remoting.RemoteInvocationHandler.invoke(RemoteInvocationHandler.java:306)
	at jdk.proxy2/jdk.proxy2.$Proxy142.join(Unknown Source)
	at hudson.Launcher$RemoteLauncher$ProcImpl.join(Launcher.java:1213)
	at hudson.Proc.joinWithTimeout(Proc.java:172)
	at PluginClassLoader for docker-workflow//org.jenkinsci.plugins.docker.workflow.client.DockerClient.launch(DockerClient.java:314)
	at PluginClassLoader for docker-workflow//org.jenkinsci.plugins.docker.workflow.client.DockerClient.run(DockerClient.java:144)
	at PluginClassLoader for docker-workflow//org.jenkinsci.plugins.docker.workflow.WithContainerStep$Execution.start(WithContainerStep.java:200)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:322)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:195)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:124)
	at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.call(PogoMetaClassSite.java:47)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.sandbox.DefaultInvoker.methodCall(DefaultInvoker.java:20)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.LoggingInvoker.methodCall(LoggingInvoker.java:118)
	at org.jenkinsci.plugins.docker.workflow.Docker$Image.inside(Docker.groovy:140)
	at org.jenkinsci.plugins.docker.workflow.Docker.node(Docker.groovy:66)
	at org.jenkinsci.plugins.docker.workflow.Docker$Image.inside(Docker.groovy:125)
	at org.jenkinsci.plugins.docker.workflow.declarative.DockerPipelineScript.runImage(DockerPipelineScript.groovy:53)
	at org.jenkinsci.plugins.docker.workflow.declarative.AbstractDockerPipelineScript.configureRegistry(AbstractDockerPipelineScript.groovy:58)
	at org.jenkinsci.plugins.docker.workflow.declarative.AbstractDockerPipelineScript.run(AbstractDockerPipelineScript.groovy:46)
	at org.jenkinsci.plugins.pipeline.modeldefinition.agent.CheckoutScript.checkoutAndRun(CheckoutScript.groovy:66)
	at org.jenkinsci.plugins.pipeline.modeldefinition.agent.CheckoutScript.doCheckout2(CheckoutScript.groovy:46)
	at org.jenkinsci.plugins.pipeline.modeldefinition.agent.impl.LabelScript.run(LabelScript.groovy:49)
	at ___cps.transform___(Native Method)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationGroup.methodCall(ContinuationGroup.java:90)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.dispatchOrArg(FunctionCallBlock.java:114)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.FunctionCallBlock$ContinuationImpl.fixArg(FunctionCallBlock.java:83)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ClosureBlock.eval(ClosureBlock.java:46)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.Next.step(Next.java:83)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:147)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:17)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:49)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:181)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:437)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:345)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:298)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$wrap$4(CpsVmExecutorService.java:140)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
	at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:53)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:50)
	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$categoryThreadFactory$0(CpsVmExecutorService.java:50)
	at java.base/java.lang.Thread.run(Unknown Source)
Finished: FAILURE

Do you encounter similar issues with larger images?

I can run the same pipeline without issues using smaller images, such as the commented-out `ghcr.io/kjarosh/latex:2025.1-minimal` image above.

As you can see, I already experimented with the CLIENT_TIMEOUT setting of the docker-workflow plugin, but that did not improve the situation.
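For reference, as far as I can tell from the docker-workflow plugin source, the client timeout is read from a Java system property (default 180 s), not from a pipeline environment variable, so setting `CLIENT_TIMEOUT` in the `environment` block (as in the commented-out lines above) would have no effect. A sketch of what I tried instead, assuming Jenkins is started with `JAVA_OPTS`:

```shell
# The docker-workflow plugin reads its client timeout from a Java system
# property, so it has to be passed to the Jenkins JVM, e.g. via JAVA_OPTS:
JAVA_OPTS="$JAVA_OPTS -Dorg.jenkinsci.plugins.docker.workflow.client.DockerClient.CLIENT_TIMEOUT=1200"
```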

On the Jenkins controller, I see the following message after exactly 5 minutes (300 s):

2025-09-06T09:16:28.692919635+01:00 stderr F 2025-09-06 08:16:28.687+0000 [id=59]	INFO	hudson.slaves.NodeProvisioner#update: jenkins-podman-387dedb32dcf provisioning successfully completed. We have now 2 computer(s)
2025-09-06T09:21:35.401054439+01:00 stderr F 2025-09-06 08:21:35.397+0000 [id=412]	WARNING	h.Launcher$RemoteLauncher$ProcImpl#join: Process hudson.Launcher$RemoteLauncher$ProcImpl@52a02fb3 has not really finished after the join() method completion

My Nomad cloud setup (the worker node, configured with infrastructure code plugin):

  clouds:
  - nomad:
      name: "nomad"
      nomadUrl: "https://{{env "attr.unique.network.ip-address"}}:4646"
      tlsEnabled: true
      serverCertificate: "/etc/ssl/certs/nomad-agent-ca.p12"
      # the truststore only contains public certificates, password is irrelevant here
      serverPassword: "123456"
      clientPassword:
      prune: true
      templates:
      - idleTerminationInMinutes: 10
        jobTemplate: |-
          {
            "Job": {
              "Region": "global",
              "ID": "%WORKER_NAME%",
              "Namespace": "default",
              "Type": "batch",
              "Datacenters": [
                "dc1"
              ],
              "TaskGroups": [
                {
                  "Name": "jenkins-podman-worker-taskgroup",
                  "Count": 1,
                  "RestartPolicy": {
                    "Attempts": 0,
                    "Interval": 10000000000,
                    "Mode": "fail",
                    "Delay": 1000000000
                  },
                  "Tasks": [
                    {
                      "Name": "jenkins-podman-worker",
                      "Driver": "podman",
                      "User": "1312",
                      "Config": {
                        "volumes": [
                          "/run/user/1312/podman/podman.sock:/home/jenkins/agent/podman.sock",
                          "/etc/containers/registries.conf:/etc/containers/registries.conf",
                          "/home/jenkins/workspace:/home/jenkins/workspace"
                        ],
                        "force_pull": true,
                        "image": "127.0.0.1:5000/jenkins-inbound-agent:3327.v868139a_d00e0-v9"
                      },
                      "Env": {
                        "REMOTING_OPTS": "-url http://{{ env "NOMAD_ADDR_jenkins" }} -name %WORKER_NAME% -secret %WORKER_SECRET% -tunnel {{ env "NOMAD_ADDR_jnlp" }}",
                        "DOCKER_HOST": "unix:///home/jenkins/agent/podman.sock"
                      },
                      "Resources": {
                        "CPU": 500,
                        "MemoryMB": 512,
                        "MemoryMaxMB": 1024
                      }
                    }
                  ],
                  "EphemeralDisk": {
                    "SizeMB": 300
                  }
                }
              ]
            }
          }
        labels: "nomad podman" # use the 'podman' label in the Jenkins pipeline spec
        numExecutors: 1
        prefix: "jenkins-podman"
        reusable: true
      workerTimeout: 1

I'd be grateful for any tips you might have.

Thank you already!

I was able to fix it by configuring the overlay storage driver for Podman.

The default storage driver for UID 0 is configured in containers-storage.conf(5); in rootless mode, the default is vfs for non-root users when fuse-overlayfs is not available.

The problem was that the default storage driver vfs (used on my system for rootless usage) is far too slow to start/run certain images; see the Podman performance documentation.

Because I had recently uninstalled fuse-overlayfs, the default vfs storage behavior kicked in, even though I run a recent kernel (6.12.41) that already supports native overlayfs:

The Overlay file system (OverlayFS) is not supported with kernels prior to 5.12.9 in rootless mode.

So it was not really a Jenkins problem. All I had to do was tell Podman to use native overlayfs, even in rootless mode (since I run a kernel newer than 5.12.9, I can use the native implementation and do not need fuse-overlayfs at all).

I was not aware that rootless Podman uses the "vfs" storage driver by default (i.e., it does not check the kernel version to determine the storage driver for rootless usage).

# /etc/containers/storage.conf
[storage]
driver = "overlay"
runroot = "/var/run/containers/storage"
graphroot = "/var/lib/containers/storage"

To migrate from vfs to overlay, I followed the upstream docs (`podman system reset`) in rootless mode, as the jenkins user.
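In case it helps others, the migration roughly looked like the sketch below. Note that `podman system reset` is destructive: it wipes all images, containers, and volumes under the old graph root for that user.

```shell
# Run as the rootless user (jenkins). WARNING: this deletes all existing
# images, containers, and volumes stored under the old vfs graph root.
podman system reset

# With the overlay driver configured in /etc/containers/storage.conf,
# verify which storage driver rootless Podman is now actually using:
podman info --format '{{.Store.GraphDriverName}}'   # should print: overlay
```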

I verified that rootless Podman (as the jenkins user) indeed runs with the native overlay storage driver:

$ podman info | grep graph
  graphDriverName: overlay

I consider this fixed. Thanks anyway, and I hope this helps if anyone bumps into similar issues.