Is it possible to add a timeout on agent selection?

Hi,

Is it possible to limit the time a stage waits for an executor (the agent + node block), and skip the stage if that limit is reached?

We have an automatically scheduled job that runs on both Mac and Windows agents using parallel stages.

We only have a few Mac agents for cost reasons, so sometimes our jobs get stuck waiting for the next available node with the label "mac".

The workaround we found is to put a timeout on the mac stage itself, but the problem is that it also limits the execution time of the stage.

It would be great to have different time limits for node selection and node steps.
Here is a pipeline example:

pipeline {
	agent any
	options {
		skipDefaultCheckout()
	}
	stages {
		stage("Parallel stages") {
			failFast false // Do not abort all parallel stages if one of them fails
			parallel {
				stage('Windows') {
					agent {
						node {
							label "win"
							customWorkspace "${BRANCH_NAME}Build"
						}
					}
					stages {
						stage('Stage 1 (win)') {
							steps {
								script {
									python("stage1.py")
								}
							}
						}
						stage('Stage 2 (win)') {
							steps {
								script {
									python("stage2.py")
								}
							}
						}
					}
					post {
						always {
							script {
								println("win post block")
							}
						}
					}
				}
				// -----------------------------------------------------------------------------------------------------------------------------
				stage('MacOS') {
					agent {
						node {
							label "mac"
							customWorkspace "${BRANCH_NAME}Build"
						}
					}
					// Our workaround: a stage-level timeout so we do not wait forever when no mac agent is available.
					// The drawback is that it also caps the maximum execution time of the mac stages.
					options {
						timeout(time: 8, unit: "HOURS")
					}
					stages {
						stage('Stage 1 (mac)') {
							steps {
								script {
									// stage content
								}
							}
						}
						stage('Stage 2 (mac)') {
							steps {
								script {
									// stage content
								}
							}
						}
						stage('Stage 3 (mac)') {
							steps {
								script {
									// stage content
								}
							}
						}
					}
					post {
						always {
							script {
								println("mac post block")
							}
						}
					}
				}
			}
		}
	}
	post {
		always {
			script {
				println("global post block")
			}
		}
	}
}

Thank you!

Here is my Jenkins config:

Jenkins setup:

Jenkins: 2.440.1
OS: Linux - 6.1.85+
Java: 17.0.10 - Eclipse Adoptium (OpenJDK 64-Bit Server VM)
---
analysis-model-api:12.1.0
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
apache-httpcomponents-client-5-api:5.3.1-1.0
authentication-tokens:1.53.v1c90fd9191a_b_
authorize-project:1.7.1
blueocean:1.27.11
blueocean-autofavorite:1.2.5
blueocean-bitbucket-pipeline:1.27.11
blueocean-commons:1.27.11
blueocean-config:1.27.11
blueocean-core-js:1.27.11
blueocean-dashboard:1.27.11
blueocean-display-url:2.4.2
blueocean-events:1.27.11
blueocean-git-pipeline:1.27.11
blueocean-github-pipeline:1.27.11
blueocean-i18n:1.27.11
blueocean-jwt:1.27.11
blueocean-personalization:1.27.11
blueocean-pipeline-api-impl:1.27.11
blueocean-pipeline-editor:1.27.11
blueocean-pipeline-scm-api:1.27.11
blueocean-rest:1.27.11
blueocean-rest-impl:1.27.11
blueocean-web:1.27.11
bootstrap5-api:5.3.2-4
bouncycastle-api:2.30.1.77-225.v26ea_c9455fd9
branch-api:2.1152.v6f101e97dd77
build-blocker-plugin:1.7.9
build-failure-analyzer:2.5.0
build-monitor-plugin:1.14-860.vd06ef2568b_3f
build-timeout:1.32
build-timestamp:1.0.3
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.0.2
cloud-stats:336.v788e4055508b_
cloudbees-bitbucket-branch-source:877.vb_b_d5243f6794
cloudbees-disk-usage-simple:203.v3f46a_7462b_1a_
cloudbees-folder:6.901.vb_4c7a_da_75da_3
command-launcher:107.v773860566e2e
commons-compress-api:1.26.1-2
commons-httpclient3-api:3.1-3
commons-lang3-api:3.13.0-62.v7d18e55f51e2
commons-text-api:1.11.0-95.v22a_d30ee5d36
conditional-buildstep:1.4.3
configuration-as-code:1775.v810dc950b_514
credentials:1319.v7eb_51b_3a_c97b_
credentials-binding:657.v2b_19db_7d6e6d
data-tables-api:1.13.8-4
depgraph-view:1.0.5
discard-old-build:1.07
disk-usage:1.2
display-url-api:2.200.vb_9327d658781
docker-build-publish:1.4.0
docker-commons:439.va_3cb_0a_6a_fb_29
docker-java-api:3.3.4-86.v39b_a_5ede342c
docker-plugin:1.6
docker-workflow:572.v950f58993843
durable-task:550.v0930093c4b_a_6
echarts-api:5.4.3-4
email-ext:2.104
emailext-template:1.5
extended-read-permission:53.v6499940139e5
favorite:2.208.v91d65b_7792a_c
font-awesome-api:6.5.1-3
forensics-api:2.4.0
git:5.2.1
git-client:4.6.0
git-server:114.v068a_c7cc2574
github:1.38.0
github-api:1.318-461.v7a_c09c9fa_d63
github-branch-source:1772.va_69eda_d018d4
google-container-registry-auth:0.3
google-login:109.v022b_cf87b_e5b_
google-oauth-plugin:1.330.vf5e86021cb_ec
gson-api:2.10.1-15.v0d99f670e0a_7
handy-uri-templates-2-api:2.1.8-30.v7e777411b_148
htmlpublisher:1.32
instance-identity:185.v303dc7c645f9
ionicons-api:56.v1b_1c8c49374e
jackson2-api:2.16.1-373.ve709c6871598
jakarta-activation-api:2.0.1-3
jakarta-mail-api:2.0.1-3
javadoc:243.vb_b_503b_b_45537
javax-activation-api:1.2.0-6
javax-mail-api:1.6.2-9
jaxb:2.3.9-1
jdk-tool:73.vddf737284550
jenkins-design-language:1.27.11
jjwt-api:0.11.5-77.v646c772fddb_0
joda-time-api:2.12.7-29.v5a_b_e3a_82269a_
jquery3-api:3.7.1-2
jsch:0.2.16-86.v42e010d9484b_
json-api:20240205-27.va_007549e895c
json-path-api:2.9.0-33.v2527142f2e1d
junit:1259.v65ffcef24a_88
kubernetes:4186.v1d804571d5d4
kubernetes-client-api:6.10.0-240.v57880ce8b_0b_2
kubernetes-credentials:0.11
lockable-resources:1243.v346d600eea_24
mailer:463.vedf8358e006b_
markdown-formatter:167.v8a_428ca_49f89
matrix-auth:3.2.1
matrix-project:822.824.v14451b_c0fd42
maven-plugin:3.23
metrics:4.2.21-449.v6960d7c54c69
mina-sshd-api-common:2.12.0-90.v9f7fb_9fa_3d3b_
mina-sshd-api-core:2.12.0-90.v9f7fb_9fa_3d3b_
monitoring:1.98.0
oauth-credentials:0.646.v02b_66dc03d2e
okhttp-api:4.11.0-172.vda_da_1feeb_c6e
p4:1.15.1
parameterized-scheduler:262.v00f3d90585cc
parameterized-trigger:787.v665fcf2a_830b_
periodic-reincarnation:1.13
pipeline-build-step:540.vb_e8849e1a_b_d8
pipeline-github-lib:42.v0739460cda_c4
pipeline-graph-analysis:216.vfd8b_ece330ca_
pipeline-groovy-lib:704.vc58b_8890a_384
pipeline-input-step:491.vb_07d21da_1a_fb_
pipeline-milestone-step:111.v449306f708b_7
pipeline-model-api:2.2175.v76a_fff0a_2618
pipeline-model-definition:2.2175.v76a_fff0a_2618
pipeline-model-extensions:2.2175.v76a_fff0a_2618
pipeline-rest-api:2.34
pipeline-stage-step:305.ve96d0205c1c6
pipeline-stage-tags-metadata:2.2175.v76a_fff0a_2618
pipeline-stage-view:2.34
pipeline-utility-steps:2.17.0
plain-credentials:143.v1b_df8b_d3b_e48
plugin-util-api:4.1.0
pollscm:1.5
prism-api:1.29.0-13
prometheus:2.5.1
prqa-plugin:3.3.5
pubsub-light:1.18
resource-disposer:0.23
role-strategy:689.v731678c3e0eb_
run-condition:1.7
saml:4.464.vea_cb_75d7f5e0
scm-api:683.vb_16722fb_b_80b_
script-security:1326.vdb_c154de8669
slack:684.v833089650554
snakeyaml-api:2.2-111.vc6598e30cc65
sse-gateway:1.26
ssh-credentials:308.ve4497b_ccd8f4
ssh-slaves:2.948.vb_8050d697fec
sshd:3.322.v159e91f6a_550
structs:337.v1b_04ea_4df7c8
throttle-concurrents:2.14
timestamper:1.26
token-macro:400.v35420b_922dcb_
trilead-api:2.133.vfb_8a_7b_9c5dd1
validating-string-parameter:183.v3748e79b_9737
variant:60.v7290fc0eb_b_cd
warnings-ng:11.1.0
workflow-aggregator:596.v8c21c963d92d
workflow-api:1291.v51fd2a_625da_7
workflow-basic-steps:1042.ve7b_140c4a_e0c
workflow-cps:3880.vb_ef4b_5cfd270
workflow-durable-task-step:1331.vc8c2fed35334
workflow-job:1400.v7fd111b_ec82f
workflow-multibranch:773.vc4fe1378f1d5
workflow-scm-step:415.v434365564324
workflow-step-api:657.v03b_e8115821b_
workflow-support:865.v43e78cc44e0d
ws-cleanup:0.45

To address your requirement of setting different timeouts for node selection and execution time in Jenkins, you can use a combination of waitUntil and timeout within a parallel block. This approach allows you to specify a timeout for the node allocation separately from the execution time of the stage.

You can also do this:

timeout(activity: true, time: 2, unit: 'HOURS') {
    // activity: true aborts only after 2 hours without log output, rather than after 2 hours in total
    node('Slave_Node') {
      // Will run on the agent labelled Slave_Node
    }
}

I have used Okami: Seek How Know Now to answer this

But when I try to put a timeout around the node block, Jenkins reports invalid syntax.

It seems timeout is only allowed inside a steps block.

stage('MacOS') {
  agent {
    timeout(activity: true, time: 2, unit: 'HOURS') {
      node {
        label "mac"
        customWorkspace "${BRANCH_NAME}Build"
      }
    }	
  }

00:00:00.102 WorkflowScript: 44: Invalid agent type "timeout" specified. Must be one of [any, docker, dockerContainer, dockerfile, kubernetes, label, none] @ line 44, column 13.
00:00:00.102 timeout(activity: true, time: 2, unit: 'HOURS') {
00:00:00.102 ^

This does not work either:

stage('MacOS') {
  timeout(activity: true, time: 2, unit: 'HOURS') {
    agent {
      node {
        label "mac"
        customWorkspace "${BRANCH_NAME}Build"
      }
    }	
  }

00:00:00.123 WorkflowScript: 42: Unknown stage section "timeout". Starting with version 0.5, steps in a stage must be in a 'steps' block. @ line 42, column 5.
00:00:00.123 stage('MacOS') {
00:00:00.123 ^
00:00:00.123

You will need to put everything inside the steps block, then.
I tested this script and it works. If I use a label that doesn't exist, the job terminates after about 30 seconds with status SUCCESS.

pipeline {
    agent none
    stages {
        stage("test") {
            steps {
                script {
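                    def macBuildStarted=false
                    def waitingForTooLong=false
                    def counter=1
                    def p = [:]
                    // Abort the remaining branch as soon as one branch fails
                    p["failFast"] = true
                    // Watchdog branch: polls until the mac build has started or the wait limit is hit
                    p["wait"] = { 
                        waitUntil(initialRecurrencePeriod: 15000) {
                            echo "$counter"
                            if (macBuildStarted) {
                                echo "Build started"
                                return true
                            }
                            // Give up on the third poll (roughly 30 seconds with these settings)
                            if (counter > 2) {
                                waitingForTooLong=true
                                error("Timed out waiting for a mac agent")
                            }
                            counter++
                            return false
                        }
                    }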
                    p["build"] = {
                        node("mac") {
                            macBuildStarted=true
                            ws("${BRANCH_NAME}Build") {
                              // run the build 
                            }
                        }
                    }
                    try {
                        parallel p
                    } catch (err) {
                        if (waitingForTooLong) {
                            echo "node took too long to start"
                        } else {
                            throw err
                        }
                    }
                }
            }
        }
    }
}

If the build fails inside the ws step, the job will also fail, because the error is rethrown when waitingForTooLong is false.
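
Since the original question asked for separate limits for node selection and node steps, here is a minimal sketch of how the "build" branch of the script above could carry its own execution limit. It assumes the same p map and macBuildStarted flag as in that script; the 8 hour value is only taken from the workaround in the question and is not a recommendation. I have not tested this variant.

```groovy
// Replacement for the p["build"] branch in the script above (untested sketch)
p["build"] = {
    node("mac") {
        // Tell the "wait" branch that an agent was allocated
        macBuildStarted = true
        // Execution-only limit: this timeout starts counting after the agent
        // has been allocated, so time spent in the queue is not included
        timeout(time: 8, unit: 'HOURS') {
            ws("${BRANCH_NAME}Build") {
                // run the build
            }
        }
    }
}
```

With this layout the waitUntil counter bounds the time spent waiting for an agent, and the inner timeout bounds the build itself. A timeout raised inside the node block is not swallowed by the catch block, because waitingForTooLong stays false in that case.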
