Definition of idle

Dear all,

Jenkins setup:
Jenkins: 2.492.1
OS: Linux - 5.10.205-195.807.amzn2.x86_64
Java: 17.0.12 - Amazon.com Inc. (OpenJDK 64-Bit Server VM)

We have more than 50 different plugins installed; please see the full list below.

tl;dr

We are looking for a concise definition of the idle property on the /computer/api/json? API. How exactly is this property to be interpreted?

Long Version
We built some automation around Jenkins that relies on the /computer/api/json? Jenkins API to fetch the executors and idle properties of Jenkins agents. We generally expected idle=true if the respective Jenkins agent is running no builds at all, and idle=false if at least one executor is taken. We occasionally see failures and, after investigation, are inclined to believe that our interpretation of idle is wrong. In particular, when we have two pipelines, pipelineA and pipelineB, where the former starts the latter via the Pipeline Build Step, the former does not seem to occupy an agent/executor at all while pipelineB is underway. We suspect that what we observe is rooted in an erroneous understanding of the aforementioned property.
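
For illustration, the interpretation described above (idle=true means no executor is taken) can be written down as a small check. This is only a sketch under assumptions: the helper names `fetch_computers` and `agents_believed_idle`, the base URL argument, and the use of basic auth with an API token are hypothetical and not taken from the actual automation.

```python
import base64
import json
import urllib.request


def fetch_computers(base_url, user, api_token):
    """Fetch the /computer/api/json payload (hypothetical helper; needs a live Jenkins)."""
    request = urllib.request.Request(f"{base_url.rstrip('/')}/computer/api/json")
    credentials = base64.b64encode(f"{user}:{api_token}".encode()).decode()
    request.add_header("Authorization", f"Basic {credentials}")
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def agents_believed_idle(payload):
    """Return display names of agents whose top-level `idle` flag is true."""
    return [c["displayName"] for c in payload.get("computer", []) if c.get("idle")]


# Offline example with a hand-written payload shaped like the API response.
sample = {
    "computer": [
        {"displayName": "agentA", "idle": True, "numExecutors": 2},
        {"displayName": "agentB", "idle": False, "numExecutors": 2},
    ]
}
print(agents_believed_idle(sample))
```

Only `agents_believed_idle` encodes the assumption in question; `fetch_computers` is there to show where the payload would come from.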

Please excuse me if I have inadvertently omitted pertinent information. Just let me know and I will swiftly provide it!

Thank you in advance for any help!
Kind regards,

Office-365-Connector:4.20.0
ace-editor:1.1
active-directory:2.33
allure-jenkins-plugin:2.31.1
amazon-ecr:1.114.vfd22430621f5
analysis-core:1.96
ansicolor:1.0.4
ant:497.v94e7d9fffa_b_9
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
apache-httpcomponents-client-5-api:5.2.1-1.1
asm-api:9.7-33.v4d23ef79fcc8
atlassian-jira-software-cloud:2.0.9
authentication-tokens:1.53.v1c90fd9191a_b_
aws-bucket-credentials:1.0.0
aws-credentials:231.v08a_59f17d742
aws-java-sdk:1.12.529-406.vdeff15e5817d
aws-java-sdk-cloudformation:1.12.529-406.vdeff15e5817d
aws-java-sdk-codebuild:1.12.529-406.vdeff15e5817d
aws-java-sdk-ec2:1.12.529-406.vdeff15e5817d
aws-java-sdk-ecr:1.12.529-406.vdeff15e5817d
aws-java-sdk-ecs:1.12.529-406.vdeff15e5817d
aws-java-sdk-efs:1.12.529-406.vdeff15e5817d
aws-java-sdk-elasticbeanstalk:1.12.529-406.vdeff15e5817d
aws-java-sdk-iam:1.12.529-406.vdeff15e5817d
aws-java-sdk-kinesis:1.12.529-406.vdeff15e5817d
aws-java-sdk-logs:1.12.529-406.vdeff15e5817d
aws-java-sdk-minimal:1.12.772-474.v7f79a_2046a_fb_
aws-java-sdk-secretsmanager:1.12.529-406.vdeff15e5817d
aws-java-sdk-sns:1.12.529-406.vdeff15e5817d
aws-java-sdk-sqs:1.12.529-406.vdeff15e5817d
aws-java-sdk-ssm:1.12.529-406.vdeff15e5817d
aws-parameter-store:1.2.2
bitbucket:223.vd12f2bca5430
bitbucket-build-status-notifier:1.4.2
bootstrap4-api:4.6.0-6
bootstrap5-api:5.3.2-1
bouncycastle-api:2.30.1.78.1-248.ve27176eb_46cb_
branch-api:2.1128.v717130d4f816
build-failure-analyzer:2.4.2
build-monitor-plugin:1.14-745.ve2023a_305f40
build-timeout:1.31
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.0.2
cloud-stats:320.v96b_65297a_4b_b_
cloudbees-folder:6.848.ve3b_fd7839a_81
command-launcher:107.v773860566e2e
commons-compress-api:1.26.1-2
commons-httpclient3-api:3.1-3
commons-lang3-api:3.13.0-62.v7d18e55f51e2
commons-text-api:1.10.0-78.v3e7b_ea_d5a_fe1
conditional-buildstep:1.4.3
config-file-provider:959.vcff671a_4518b_
copyartifact:722.v0662a_9b_e22a_c
credentials:1384.vf0a_2ed06f9c6
credentials-binding:681.vf91669a_32e45
cucumber-reports:5.7.6
cucumber-testresult-plugin:0.10.1
cygpath:1.5
dark-theme:439.vdef09f81f85e
data-tables-api:1.13.6-4
dependency-check-jenkins-plugin:5.4.3
dependency-track:4.3.1
discard-old-build:1.07
display-url-api:2.200.vb_9327d658781
docker-build-publish:1.4.0
docker-build-step:2.9
docker-commons:439.va_3cb_0a_6a_fb_29
docker-java-api:3.3.1-79.v20b_53427e041
docker-plugin:1.5
docker-workflow:572.v950f58993843
durable-task:555.v6802fe0f0b_82
ec2:1628.v6d7b_fc58b_a_1d
echarts-api:5.4.0-6
eddsa-api:0.3.0-4.v84c6f0f4969e
email-ext:2.102
embeddable-build-status:412.v09da_db_1dee68
envinject:2.908.v66a_774b_31d93
envinject-api:1.199.v3ce31253ed13
external-monitor-job:215.v2e88e894db_f8
font-awesome-api:6.4.2-1
git:5.2.0
git-client:4.5.0
git-parameter:0.9.19
git-server:126.v0d945d8d2b_39
gradle:2.8.2
groovy-events-listener-plugin:2.210.v8a_4107f66127
gson-api:2.11.0-41.v019fcf6125dc
h2-api:11.1.4.199-12.v9f4244395f7a_
handlebars:3.0.8
htmlpublisher:1.32
instance-identity:201.vd2a_b_5a_468a_a_6
ionicons-api:74.v93d5eb_813d5f
jackson2-api:2.17.0-379.v02de8ec9f64c
jacoco:3.3.4
jakarta-activation-api:2.0.1-3
jakarta-mail-api:2.0.1-3
javadoc:243.vb_b_503b_b_45537
javax-activation-api:1.2.0-6
javax-mail-api:1.6.2-9
jaxb:2.3.9-1
jdk-tool:73.vddf737284550
jersey2-api:2.40-1
jira:3.11
jira-integration:5.2.0-23.v990dc373a_0b_f
jira-trigger:1.0.3
jnr-posix-api:3.1.18-1
job-dsl:1.85
jobConfigHistory:1229.v3039470161a_d
jobcacher:573.v33fa_12644a_91
joda-time-api:2.13.0-93.v9934da_29b_a_e9
jquery:1.12.4-1
jquery-detached:1.2.1
jquery3-api:3.7.1-1
jsch:0.2.8-65.v052c39de79b_2
json-api:20240303-41.v94e11e6de726
json-path-api:2.9.0-58.v62e3e85b_a_655
junit:1240.vf9529b_881428
ldap:701.vf8619de9160a_
lockable-resources:1185.v0c528656ce04
m2release:0.16.4
mailer:463.vedf8358e006b_
mapdb-api:1.0.9-28.vf251ce40855d
matrix-auth:3.2.1
matrix-project:808.v5a_b_5f56d6966
maven-plugin:3.23
mercurial:1260.vdfb_723cdcc81
metrics:4.2.21-458.vcf496cb_839e4
mina-sshd-api-common:2.14.0-133.vcc091215a_358
mina-sshd-api-core:2.14.0-133.vcc091215a_358
momentjs:1.1.1
multiple-scms:0.8
neoload-jenkins-plugin:2.2.11
node-iterator-api:49.v58a_8b_35f8363
nodejs:1.6.1
okhttp-api:4.11.0-157.v6852a_a_fa_ec11
pam-auth:1.10
parameterized-trigger:2.46
performance:945.v3c982cb_1a_9a_9
pipeline-aws:1.43
pipeline-build-step:505.v5f0844d8d126
pipeline-graph-analysis:202.va_d268e64deb_3
pipeline-groovy-lib:689.veec561a_dee13
pipeline-input-step:477.v339683a_8d55e
pipeline-maven:1342.vfc697b_789147
pipeline-maven-api:1342.vfc697b_789147
pipeline-milestone-step:111.v449306f708b_7
pipeline-model-api:2.2218.v56d0cda_37c72
pipeline-model-declarative-agent:1.1.1
pipeline-model-definition:2.2144.v077a_d1928a_40
pipeline-model-extensions:2.2218.v56d0cda_37c72
pipeline-npm:95.v5213efa_9585f
pipeline-rest-api:2.33
pipeline-stage-step:312.v8cd10304c27a_
pipeline-stage-tags-metadata:2.2144.v077a_d1928a_40
pipeline-stage-view:2.33
pitmutation:1.0-18
plain-credentials:183.va_de8f1dd5a_2b_
plugin-util-api:3.4.0
popper-api:1.16.1-3
popper2-api:2.11.6-2
port-allocator:1.10
postbuildscript:3.2.0-550.v88192b_d3e922
prometheus:2.5.1
publish-over:0.22
publish-over-ssh:1.25
quality-gates:2.7-SNAPSHOT (private-bb8ac554-Jochem)
rebuild:320.v5a_0933a_e7d61
resource-disposer:0.23
run-condition:1.7
s3:466.vf5b_3db_8e3eb_2
saml:4.464.vea_cb_75d7f5e0
scm-api:696.v778d637b_a_762
script-security:1369.v9b_98a_4e95b_2d
scriptler:374.vd80c089c9164
slack:684.v833089650554
snakeyaml-api:2.2-111.vc6598e30cc65
sonar:2.17.1
sonar-quality-gates:1.3.1
ssh-agent:333.v878b_53c89511
ssh-credentials:343.v884f71d78167
ssh-slaves:2.916.vd17b_43357ce4
sshd:3.330.vc866a_8389b_58
stashNotifier:1.439.v202358346a_7d
structs:338.v848422169819
subversion:2.17.3
swarm:3.47
terraform:1.0.10
theme-manager:262.vc57ee4a_eda_5d
timestamper:1.26
token-macro:400.v35420b_922dcb_
trilead-api:2.147.vb_73cc728a_32e
uno-choice:2.7.2
variant:60.v7290fc0eb_b_cd
windows-slaves:1.8.1
workflow-aggregator:596.v8c21c963d92d
workflow-api:1336.vee415d95c521
workflow-basic-steps:1042.ve7b_140c4a_e0c
workflow-cps:4000.v5198556e9cea_
workflow-cps-global-lib:609.vd95673f149b_b
workflow-durable-task-step:1353.v1891a_b_01da_18
workflow-job:1436.vfa_244484591f
workflow-multibranch:756.v891d88f2cd46
workflow-scm-step:427.v4ca_6512e7df1
workflow-step-api:678.v3ee58b_469476
workflow-support:936.v9fa_77211ca_e1
ws-cleanup:0.45

There are two types of executors for agents: regular executors and flyweight executors (OneOffExecutor). Regular executors are those that you have configured for your agent.
Flyweight executors are created on demand. A pipeline job will always create a flyweight executor on the controller's built-in agent; I think they are normally not shown in the UI. Flyweight executors are taken into account by the idle attribute, so an agent with a flyweight executor in use is not shown as idle even when none of its regular executors is in use.
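
The rule described above can be modelled in a few lines. This is a sketch of the explanation as given here, not code taken from Jenkins; the function name `computer_is_idle` and the dictionary shape are assumptions that loosely mirror the `executors` and `oneOffExecutors` fields of the JSON API.

```python
def computer_is_idle(executors, one_off_executors):
    """Model of the described rule: a flyweight executor in use, or any busy
    regular executor, means the agent is not idle."""
    if one_off_executors:
        return False
    return all(e["idle"] for e in executors)


# No flyweight executor, both regular executors free.
print(computer_is_idle([{"idle": True}, {"idle": True}], []))
# Regular executors free, but one flyweight executor in use.
print(computer_is_idle([{"idle": True}, {"idle": True}], [{"idle": False}]))
# One regular executor busy.
print(computer_is_idle([{"idle": False}, {"idle": True}], []))
```

Under this model only the first call reports the agent as idle.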

A pipeline that does not explicitly use a node / agent will only consume a flyweight executor,
e.g. this will not consume a regular executor:

pipeline {
  agent none
  stages {
    stage("test") {
      steps {
        echo "Hello World"
      }
    }
  }
}

This will only consume an executor during the second stage

pipeline {
  agent none
  stages {
    stage("initialize") {
      steps {
        echo "Hello World"
      }
    }
    // stage names must be unique, so the second stage gets its own name
    stage("build") {
      agent { label 'build' }
      steps {
        echo "Hello World"
        sh 'echo do something'
      }
    }
  }
}

Similarly, in a scripted pipeline only the node step leads to the usage of a regular executor:

node('build') {
  sh 'echo do something'
}
// for the following part no regular executor is required
build 'other'

Hello mawinter69! Thank you for your quick reply!

This does align with what I would expect. Unfortunately I am still confused as to why Jenkins then reports an agent to be idle. Let me rephrase the situation we are encountering:

  1. pipelineA starts on agentA
  2. agentA is marked as “offline” as we don’t want new jobs to be scheduled on it.
  3. pipelineA starts pipelineB, which runs on agentB
  4. Jenkins reports agentA to be idle … but pipelineB is still underway (and presumably also pipelineA, as it is configured to wait for pipelineB)
  5. we look at agentA’s configuration and because Jenkins flags it as idle (and both executors as not executing anything), we deregister and terminate it
  6. pipelineA aborts because the node was terminated with the following error message:
Agent {agentName} was deleted; cancelling node body

Am I missing something obvious here? :confused:

Kind regards,

This is the implementation for the idle field when you call /computer/<computername>/api/json.
Are you calling /computer/api/json or the agent-specific URL? If the former, are you then extracting the correct agent for the idle check?

I would usually expect those flyweight executors to be found on the controller only.
It seems you do:

// pipelineA
node('agentA') {
  sh 'do something'
  build 'pipelineB'
}

If you do this instead:

node('agentA') {
  sh 'do something'
}
build 'pipelineB'

Then agentA would no longer be in use by pipelineA while pipelineB is running.

PS: an optimized query is /computer/api/json?tree=computer[displayName,idle,oneOffExecutors[*]]
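
A small sketch of how such a `tree` query could be assembled on the client side. The helper name `computer_tree_url` and the example host are hypothetical; the field list is the one from the query above. Percent-encoding the square brackets is a choice made here for strict HTTP clients and is not something the thread requires.

```python
from urllib.parse import quote, unquote


def computer_tree_url(base_url, fields=("displayName", "idle", "oneOffExecutors[*]")):
    """Build a /computer/api/json URL restricted to the given fields via `tree`."""
    tree = f"computer[{','.join(fields)}]"
    # Keep ',' and '*' readable; '[' and ']' are percent-encoded.
    return f"{base_url.rstrip('/')}/computer/api/json?tree={quote(tree, safe=',*')}"


url = computer_tree_url("https://jenkins.example.com/")  # hypothetical host
print(url)
print(unquote(url))
```

The second line printed is the decoded form, which matches the query quoted above apart from the host.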

Hello mawinter69!

Thank you again for your kind help!

  • we use the agent-specific URL
  • we use the correct agent, I double-checked in the payload the API returns. The agent name is returned in the payload (as a label) and we log that payload as part of our telemetry.
  • I can confirm that oneOffExecutors is empty.

Here is the cleansed payload returned from said event:

{'_class': 'hudson.slaves.SlaveComputer', 'actions': [{'_class': 'hudson.plugins.jobConfigHistory.ComputerConfigHistoryAction'}], 'assignedLabels': [{'name': 'linux-legacy'}, {'name': 'autoscaled-legacy-i-0a761a94d4ff0ccda'}], 'description': 'autoscaled-legacy-i-0a761a94d4ff0ccda stage legacy autoScaled agent 10.86.2.16', 'displayName': 'autoscaled-legacy-i-0a761a94d4ff0ccda', 'executors': [{}, {}], 'icon': 'symbol-computer-disconnected', 'iconClassName': 'symbol-computer-disconnected', 'idle': True, 'jnlpAgent': True, 'launchSupported': False, 'loadStatistics': {'_class': 'hudson.model.Label$1'}, 'manualLaunchAllowed': True, 'monitorData': {'hudson.node_monitors.SwapSpaceMonitor': None, 'hudson.node_monitors.TemporarySpaceMonitor': None, 'hudson.node_monitors.DiskSpaceMonitor': None, 'hudson.node_monitors.ArchitectureMonitor': None, 'hudson.node_monitors.ResponseTimeMonitor': None, 'hudson.node_monitors.ClockMonitor': None}, 'numExecutors': 2, 'offline': True, 'offlineCause': {'_class': 'hudson.slaves.OfflineCause$UserCause'}, 'offlineCauseReason': 'Toggled Offline by Autoscaling', 'oneOffExecutors': [], 'temporarilyOffline': True, 'absoluteRemotePath': None}
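
The payload above is logged as a Python dict repr (single quotes, `True`, `None`) rather than as JSON, so a JSON parser would reject it. A minimal sketch of reading such a log line back, assuming it is trusted telemetry; the helper name `summarize_logged_payload` and the summary keys are hypothetical, and the sample string is an abbreviated copy of the payload above.

```python
import ast


def summarize_logged_payload(text):
    """Parse a payload logged as a Python dict repr and pull out the fields discussed here."""
    payload = ast.literal_eval(text)  # json.loads would reject single quotes, True and None
    return {
        "displayName": payload.get("displayName"),
        "idle": payload.get("idle"),
        "offline": payload.get("offline"),
        "temporarilyOffline": payload.get("temporarilyOffline"),
        "numExecutors": payload.get("numExecutors"),
        # Keep only executor entries that actually carry details.
        "executorDetails": [e for e in payload.get("executors", []) if e],
        "oneOffExecutors": len(payload.get("oneOffExecutors", [])),
    }


# Abbreviated copy of the payload above (only a subset of its keys).
logged = (
    "{'_class': 'hudson.slaves.SlaveComputer', "
    "'displayName': 'autoscaled-legacy-i-0a761a94d4ff0ccda', "
    "'executors': [{}, {}], 'idle': True, 'numExecutors': 2, 'offline': True, "
    "'oneOffExecutors': [], 'temporarilyOffline': True, 'absoluteRemotePath': None}"
)
summary = summarize_logged_payload(logged)
print(summary)
```

Note that the two entries under `executors` are empty objects, so this payload by itself carries no per-executor detail; only the top-level `idle` flag and the empty `oneOffExecutors` list can be read from it.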

While we don’t use the node specification, I believe this boils down to the very same thing:

 agent { label 'linux-legacy' }

I have a hunch.
Could it be that this is some kind of race condition where the agent is in fact temporarily idle, let’s say for 5 seconds, until Jenkins evaluates the next step/stage and actually uses up an executor? I.e. if we have something like script {cmd1 cmd2 cmd3} within steps {}, could it be that between cmd1 and cmd2 there is simply a brief moment when Jenkins “switches” from one to the next, during which it actually does consider the agent to be idle because no executor is taken at this point (not even a oneOffExecutor)?
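
If a short idle window like this did exist (which is only a hunch at this point), a consumer of the API could guard against it by acting only after several consecutive polls report idle. A minimal sketch; the function name `stable_idle` and the threshold of three polls are arbitrary assumptions, and this neither confirms nor explains the suspected race.

```python
def stable_idle(observations, required=3):
    """Return True only if the last `required` polls all reported idle."""
    if required < 1:
        raise ValueError("required must be at least 1")
    recent = observations[-required:]
    return len(recent) == required and all(recent)


# Each entry is the `idle` value seen by one poll, oldest first.
print(stable_idle([False, True, True, True]))
print(stable_idle([True, False, True, True]))
print(stable_idle([True, True]))
```

With the default threshold only the first history qualifies: the second contains a busy poll among the last three, and the third is too short.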

Kind regards,

Valentin

In declarative, the executor should stay occupied for the whole pipeline execution when the agent is defined globally; when you define it per stage, it stays occupied for as long as that stage is executing.
For scripted, the executor is occupied for the whole node step.
It might help if you can share the actual pipeline script.

Here is the scaffold of the pipeline in question; I tried to make it a minimal example so it’s easier to see the overall structure. The original one has more stages, but they all follow the pattern of one of the two stages below:

#!/usr/bin/env groovy
pipeline
{

	agent{  label 'linux-legacy' }
	triggers{ cron('H 16 * * *') }

	options
	{
		buildDiscarder(logRotator(daysToKeepStr: '30', numToKeepStr: '10', artifactNumToKeepStr: '3'))
		disableConcurrentBuilds()
	}

	stages
	{
		stage('Check Out GIT, build and upload ')
		{
			steps
			{
			    script
			    {
                        cleanWs()
                        git ....
                        sh ("mvn clean deploy")
			    }
			}
		}
        stage('Functional Test')
		{
            parallel
			{
				stage('Functional Test1')
				{
					steps
					{
						script
						{
								build propagate: false, job: 'JOB1', parameters: []
								build propagate: false, job: 'JOB2', parameters: []
						}
					}
				}
				stage('Functional Test2')
				{
					steps
					{
						script
						{
								build propagate: false, job: 'JOB3', parameters: []
								build propagate: false, job: 'JOB4', parameters: []
								build propagate: false, job: 'JOB5', parameters: []
						}
					}
				}
            }
        }
    }
}

If I remember correctly, there were optimizations for steps which do not actually carry out work (like the build step only polling the controller for the downstream build to finish) to run in flyweight executor on the controller. So technically the node is idle…

The optimization was probably that they run without requiring a workspace, so you can use them outside of a node step or with agent none in declarative.

Do you need the agent after the first stage at a later point in time? If not, you could do the following:

#!/usr/bin/env groovy
pipeline
{

	agent none
	triggers{ cron('H 16 * * *') }

	options
	{
		buildDiscarder(logRotator(daysToKeepStr: '30', numToKeepStr: '10', artifactNumToKeepStr: '3'))
		disableConcurrentBuilds()
	}

	stages
	{
		stage('Check Out GIT, build and upload ')
		{
			agent{  label 'linux-legacy' }
			steps
			{
			    script
			    {
                        cleanWs()
                        git ....
                        sh ("mvn clean deploy")
			    }
			}
		}
        stage('Functional Test')
		{
            parallel
			{
				stage('Functional Test1')
				{
					steps
					{
						script
						{
								build propagate: false, job: 'JOB1', parameters: []
								build propagate: false, job: 'JOB2', parameters: []
						}
					}
				}
				stage('Functional Test2')
				{
					steps
					{
						script
						{
								build propagate: false, job: 'JOB3', parameters: []
								build propagate: false, job: 'JOB4', parameters: []
								build propagate: false, job: 'JOB5', parameters: []
						}
					}
				}
            }
        }
    }
}

I tested this pipeline and for me the agent is in use all the time until the job finishes.
I verified this with the Pipeline Agent Build History plugin.

The initial assumption was that our understanding of idle was wrong; however, after looking further into the telemetry, I can confirm that both executors as well as oneOffExecutors are empty. Judging by what mawinter69 said, I assume that oneOffExecutors are the flyweight executors you are referring to. This means that idle may be correct (as in: no executors are running), but is not the root cause of our problem.

Neither of these two executor types is “busy”, even though the pipeline has not finished.

Technically not, but I would still prefer a solution that does not encompass pushing down agent specifications into individual stages.
Disclosure: Platform Team speaking here. If I am to go down this route, it means telling DevOps teams to refactor hundreds of pipelines. I can already hear cries of joy. :slight_smile:

Thank you for suggesting this plugin! I installed it and will see whether I can get more data out with it. It is noteworthy that this problem only ever occurs with long-running jobs. We are talking 12, 18 or 20 hours of overall execution time. We did not observe this problem for pipelines that run single-digit hours.

I will try to do some more digging and ideally reproduce this while producing a reliable timeline with telemetry to give you something to work with.
Are there some flags I should be setting on Jenkins to capture relevant scheduling decisions as well?

Kind regards,

One thing from your earlier posts:

If you set an agent temporarily offline, the controller releases this agent immediately, I think. So it sees it as “idle”, as it does not communicate with this agent at all. If you want to drain the agent, I guess you need to find a better route.

When I set an agent temporarily offline, it is still shown as busy while a job consuming a regular executor is running there.
This is what I get when calling api/json?tree=executors[*],idle on such a temporarily disabled agent:

{
  "_class": "hudson.slaves.SlaveComputer",
  "executors": [
    {
      "currentExecutable": {
        "_class": "org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask$PlaceholderExecutable"
      },
      "idle": false,
      "likelyStuck": false,
      "number": 0,
      "progress": 72
    }
  ],
  "idle": false
}
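
A response of this shape can be checked per executor rather than only through the top-level flag. A minimal sketch, assuming the response text is exactly the JSON shown above; the helper name `busy_executor_numbers` is hypothetical. Executors that come back as empty objects (no `idle` key) are deliberately not counted either way.

```python
import json


def busy_executor_numbers(payload):
    """Return the numbers of executors that explicitly report idle == false."""
    return [e["number"] for e in payload.get("executors", []) if e.get("idle") is False]


response_text = """
{
  "_class": "hudson.slaves.SlaveComputer",
  "executors": [
    {
      "currentExecutable": {
        "_class": "org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask$PlaceholderExecutable"
      },
      "idle": false,
      "likelyStuck": false,
      "number": 0,
      "progress": 72
    }
  ],
  "idle": false
}
"""
payload = json.loads(response_text)
print(busy_executor_numbers(payload), payload["idle"])
```

For the response above this reports executor 0 as busy, consistent with the top-level `idle` being false.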