SH pipeline steps wait for a minute before finishing completely

I have a fresh Jenkins installation that I’m fine-tuning (so only a handful of plug-ins at this point). It runs on ECS via EC2, both master and workers. The master persists via EFS, while the workers mount an EBS volume. I only have some light configuration atop the base Docker image jenkins/jenkins:2.401.1-lts-jdk11, and the worker agent is a plain jenkins/inbound-agent:jdk11.

I’m using the ECS plug-in to spin up workers. The problem is that my pipeline steps, when run on workers, seem to incur an overhead of 1 minute per step. This doesn’t happen when the pipeline job runs on the master node. I made a very simple test job with which I can reliably observe this behavior:

    node("ec2-workers") {
        stage("plain curl") {
            sh "curl wttr.in/Tokyo"
        }

        stage("ls") {
            sh "ls"
        }
    }

In this job, each sh call takes a minute. The speed/durability override does not have any effect. While the job is running, I get the following thread dump:

at DSL.sh(completed process (code 0) in /home/jenkins/workspace/test/pipeline tester@tmp/durable-bed04ebe on JNLP4-connect connection from ip-xx-xxx-xx-xxx.eu-west-1.compute.internal/xx.xxx.xx.xxx:44422; recurrence period: 300000ms; check task scheduled; cancelled? false done? false) 

On the other hand, a freestyle project that only has a shell script build step doing the same performs ok regardless of where I run it.

That recurrence period seems awfully high, though I’m not sure exactly what that value implies or where it could be configured. Does this behavior sound familiar? What setting should I check?

Hi @skytreader and welcome to this community. :wave:

Doesn’t the recurrence period refer to the interval at which the Jenkins agent (worker) checks for tasks to execute?
In your case, it seems that the recurrence period is set to 300,000 milliseconds (5 minutes). This means that the agent checks for new tasks every 5 minutes, which can introduce a delay in starting the pipeline steps.

Isn’t there a configuration option related to the recurrence period in your ECS configuration?

So, I made a test where I ran a sleep command on the master node. Recall that the problem I’m having does not occur when the pipeline is run on the master node. This sleep command gave me enough time to get a similar thread dump on the master node. (Naturally, after the sleep command, the subsequent sh steps executed normally, as expected.)

The 300000ms value is still there, so my conclusion is that the recurrence period is normal and not the problem. What seems suspect to me now is the part of the thread dump saying “done? false”. This is despite the fact that, in the worker’s @tmp workspace, jenkins-result.txt already contains 0 (which is also reflected at the beginning of the thread dump I provided, as “completed process (code 0)”).
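
To put numbers on that gap between “process completed” and “step done”, something like the following could be used in place of the test job. It is only a minimal diagnostic sketch: timedSh is a hypothetical helper name, and it assumes the sandbox permits System.currentTimeMillis() and that date is available in the agent image. It compares two durations, so clock differences between master and worker should not matter.

```groovy
// Hypothetical helper: wraps a shell command so that the process's own
// run time (measured on the agent) can be compared with the duration of
// the whole sh step (measured on the master).
def timedSh(String command) {
    long stepStart = System.currentTimeMillis()
    // \$ keeps Groovy from interpolating, so the shell evaluates $(date +%s).
    sh """
        echo "process start: \$(date +%s)"
        ${command}
        echo "process end:   \$(date +%s)"
    """
    long elapsedMs = System.currentTimeMillis() - stepStart
    echo "sh step duration seen by the master: ${elapsedMs} ms"
}

node("ec2-workers") {
    stage("plain curl") {
        timedSh("curl wttr.in/Tokyo")
    }

    stage("ls") {
        timedSh("ls")
    }
}
```

If the two "process" timestamps are a second or so apart while the step duration is around 60000 ms, the minute is being spent after the script has exited, which would be consistent with the “done? false” state in the thread dump.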
