Jobs aborted with "Agent was removed"

I’m having issues with a node running a Python program through Jenkins.
The app runs fine for 5-10 minutes, then it gets aborted at a random point.

The only information I get about it is a line in the console log (and a status on the job itself) that says “Agent was removed”.

How can I troubleshoot this and pinpoint the specific problem? I have tried enabling a global logger at FINE level, but it doesn’t seem to catch any relevant information.

Thanks

Jenkins setup:

Jenkins: 2.479.3
OS: Linux - 5.15.0-144-generic
Java: 17.0.13 - Eclipse Adoptium (OpenJDK 64-Bit Server VM)

Office-365-Connector:5.1.0
ant:511.v0a_a_1a_334f41b_
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-269.vfa_2321039a_83
asm-api:9.8-135.vb_2239d08ee90
authorize-project:1.7.2
bootstrap5-api:5.3.3-1
bouncycastle-api:2.30.1.78.1-248.ve27176eb_46cb_
branch-api:2.1178.v969d9eb_c728e
build-timeout:1.33
caffeine-api:3.1.8-133.v17b_1ff2e0599
calendar-view:0.3.4
checks-api:2.2.0
cloudbees-folder:6.942.vb_43318a_156b_2
commons-lang3-api:3.17.0-84.vb_b_938040b_078
commons-text-api:1.12.0-129.v99a_50df237f7
credentials:1405.vb_cda_74a_f8974
credentials-binding:687.v619cb_15e923f
dark-theme:479.v661b_1b_911c01
display-url-api:2.209.v582ed814ff2f
durable-task:581.v299a_5609d767
echarts-api:5.5.1-1
eddsa-api:0.3.0-4.v84c6f0f4969e
email-ext:1844.v3ea_a_b_842374a_
emailext-template:1.5
folder-auth:1.4
font-awesome-api:6.6.0-1
generic-webhook-trigger:2.3.1
git:5.7.0
git-client:6.1.0
github:1.40.0
github-api:1.321-468.v6a_9f5f2d5a_7e
github-branch-source:1797.v86fdb_4d57d43
gitlab-plugin:1.9.7
gradle:2.12.1
gson-api:2.12.1-113.v347686d6729f
instance-identity:201.vd2a_b_5a_468a_a_6
ionicons-api:74.v93d5eb_813d5f
jackson2-api:2.17.0-379.v02de8ec9f64c
jakarta-activation-api:2.1.3-1
jakarta-mail-api:2.1.3-1
javax-activation-api:1.2.0-7
javax-mail-api:1.6.2-10
jaxb:2.3.9-1
jersey2-api:2.44-151.v6df377fff741
jjwt-api:0.11.5-112.ve82dfb_224b_a_d
joda-time-api:2.13.0-93.v9934da_29b_a_e9
jquery3-api:3.7.1-2
json-api:20240303-41.v94e11e6de726
json-path-api:2.9.0-148.v22a_7ffe323ce
junit:1307.vdd5b_2646279e
ldap:725.v3cb_b_711b_1a_ef
mail-watcher-plugin:1.20
mailer:489.vd4b_25144138f
matrix-auth:3.2.2
matrix-project:840.v812f627cb_578
metrics:4.2.21-451.vd51df8df52ec
mina-sshd-api-common:2.14.0-138.v6341ee58e1df
mina-sshd-api-core:2.14.0-138.v6341ee58e1df
okhttp-api:4.11.0-172.vda_da_1feeb_c6e
pam-auth:1.11
pipeline-build-step:540.vb_e8849e1a_b_d8
pipeline-github-lib:61.v629f2cc41d83
pipeline-graph-analysis:216.vfd8b_ece330ca_
pipeline-graph-view:332.vb_232ced67fa_9
pipeline-groovy-lib:730.ve57b_34648c63
pipeline-input-step:495.ve9c153f6067b_
pipeline-milestone-step:119.vdfdc43fc3b_9a_
pipeline-model-api:2.2218.v56d0cda_37c72
pipeline-model-definition:2.2218.v56d0cda_37c72
pipeline-model-extensions:2.2218.v56d0cda_37c72
pipeline-rest-api:2.34
pipeline-stage-step:312.v8cd10304c27a_
pipeline-stage-tags-metadata:2.2218.v56d0cda_37c72
pipeline-stage-view:2.34
plain-credentials:183.va_de8f1dd5a_2b_
plugin-util-api:4.1.0
resource-disposer:0.25
role-strategy:743.v142ea_b_d5f1d3
scm-api:704.v3ce5c542825a_
script-security:1369.v9b_98a_4e95b_2d
snakeyaml-api:2.3-123.v13484c65210a_
ssh-credentials:349.vb_8b_6b_9709f5b_
ssh-slaves:3.1021.va_cc11b_de26a_e
sshd:3.330.vc866a_8389b_58
structs:338.v848422169819
theme-manager:262.vc57ee4a_eda_5d
thinBackup:2.1.1
timestamper:1.27
token-macro:444.v52de7e9c573d
trilead-api:2.147.vb_73cc728a_32e
variant:60.v7290fc0eb_b_cd
workflow-aggregator:600.vb_57cdd26fdd7
workflow-api:1371.ve334280b_d611
workflow-basic-steps:1058.vcb_fc1e3a_21a_9
workflow-cps:4007.vd705fc76a_34e
workflow-durable-task-step:1398.vf6c9e89e5988
workflow-job:1508.v9cb_c3a_a_89dfd
workflow-multibranch:795.ve0cb_1f45ca_9a_
workflow-scm-step:427.v4ca_6512e7df1
workflow-step-api:700.v6e45cb_a_5a_a_21
workflow-support:968.v8f17397e87b_8
ws-cleanup:0.48

Can you provide an excerpt from the build log from when the error happens? The message “Agent was removed” implies that Jenkins not only lost its connection to the machine running the build, but that the Jenkins configuration itself was changed to remove that machine from the agent list. Can you verify whether the agent was actually removed?

Is there anything in your Jenkins setup that might be managing your agent lifetime for you? I might expect such behavior from some cloud provider like Azure or AWS, but I do not see anything related in your plugins list; so maybe some external automation?
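
If it helps, here is a minimal sketch (in Python, since that is what your job runs) of one way to check whether the node ever disappears from the controller’s agent list. It polls the /computer/api/json endpoint and prints a line whenever the node’s state changes. JENKINS_URL, NODE_NAME, USER and API_TOKEN are placeholders I made up, and I am assuming the node’s displayName matches its configured name and that the account is allowed to read the agent list.

```python
import base64
import json
import time
import urllib.request
from datetime import datetime

# All four values are hypothetical placeholders; replace them with your own.
JENKINS_URL = "https://jenkins.example.com"
NODE_NAME = "my-agent"
USER = "some-user"
API_TOKEN = "some-api-token"


def node_state(payload, node_name):
    """Classify node_name as 'online', 'offline' or 'missing' in a /computer/api/json payload."""
    for computer in payload.get("computer", []):
        if computer.get("displayName") == node_name:
            return "offline" if computer.get("offline") else "online"
    return "missing"


def fetch_computers(base_url, user, token):
    """Fetch the agent list from the controller using HTTP basic auth."""
    request = urllib.request.Request(f"{base_url}/computer/api/json")
    credentials = base64.b64encode(f"{user}:{token}".encode()).decode()
    request.add_header("Authorization", f"Basic {credentials}")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def watch(interval_seconds=5, max_polls=None):
    """Print a timestamped line each time the node's state changes.

    max_polls=None polls until interrupted; a number stops after that many polls.
    """
    previous = None
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            state = node_state(fetch_computers(JENKINS_URL, USER, API_TOKEN), NODE_NAME)
        except OSError as error:  # network and HTTP errors from urllib derive from OSError
            state = f"error: {error}"
        if state != previous:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(f"{stamp} {NODE_NAME}: {state}")
            previous = state
        polls += 1
        time.sleep(interval_seconds)
```

Leaving watch() running while the test job executes should show whether the node goes missing around the time of the abort. A short interval helps, since a removal followed by an immediate re-add could otherwise fall between two polls.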

Does your program keep producing log output for the build console? I know Jenkins has some mechanisms to stop a build it considers “stuck” (that is, a build step that produces no stdout/stderr lines for some time), but I very much doubt this could lead to an agent removal.

Does it happen only in a single job, or would the same occur if you create another one doing something like sh "while true; do date; sleep 60; done"?

Hey Artalis, thank you for the help. I’ll try to provide useful information. This has been happening for a few days, and every time I get the “Agent was removed” response, the agent is not actually removed (that is, I don’t need to launch agent.jar again on the node machine to build another job).

I don’t recall anything that could act as external automation. Could you elaborate on this? The original jobs I’m having issues with are RPA apps that automate interactions with an ERP installed on the node machine and with a web app through Selenium. Would that count as external automation?

Regarding the shell script you provided, I ran the following code on the node, which I believe serves the same purpose.

pipeline {
    agent {
        label '[[REDACTED]]'
    }
    stages {
        stage('Monitor Agent Connection') {
            steps {
                bat '''
                    @echo off
                    :loop
                    echo %date% %time%
                    ping 127.0.0.1 -n 61 >nul
                    goto loop
                '''
            }
        }
    }
}

I ran the job a few times with similar results: the job gets aborted at a random point around 5 minutes in.
This is the output of a single execution:

Started by user [[REDACTED]]
[Pipeline] Start of Pipeline
[Pipeline] node
Running on [[REDACTED_NODE]] in [[REDACTED]]test_agente
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Monitor Agent Connection)
[Pipeline] bat
08-08-2025 9:58:38,09
08-08-2025 9:59:38,77
08-08-2025 10:00:39,39
08-08-2025 10:01:40,08
08-08-2025 10:02:40,74
ERROR: also cancelling shell steps running on [[REDACTED_NODE]]
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Agent was removed
org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: 4717dc77-952b-45b8-9f4a-f773d9ef7364
Finished: ABORTED

Thanks again for any help or suggestions about how to proceed.

How do you connect your agents? Are they inbound or outbound?
To investigate further, I would check the following on the test job that also fails:

  • Is the actual bat step script still running?
  • Has the agent’s Java process been restarted?
  • Do the logs of the agent in Jenkins indicate that the agent lost its connection in between?

Maybe there was some change in a firewall so that the connection from the controller to the agent is dropped. Normally, though, I would expect Jenkins to be able to reconnect to the agent in such a case and detect that the job is still running.
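
To rule the network in or out, a rough sketch like the following could run on the agent machine while the test job is active. It repeatedly opens a TCP connection to the controller and logs every change in reachability. The host and port in the example comment are placeholders. Note the limitation: this only detects the controller becoming unreachable for new connections, so it would not notice a firewall silently dropping an idle long-lived connection.

```python
import socket
import time
from datetime import datetime


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(host, port, interval_seconds=10, attempts=None):
    """Print a timestamped line whenever reachability changes.

    attempts=None keeps probing until interrupted; a number stops after that many checks.
    """
    previous = None
    count = 0
    while attempts is None or count < attempts:
        reachable = can_connect(host, port)
        if reachable != previous:
            stamp = datetime.now().isoformat(timespec="seconds")
            status = "reachable" if reachable else "UNREACHABLE"
            print(f"{stamp} {host}:{port} {status}")
            previous = reachable
        count += 1
        if attempts is None or count < attempts:
            time.sleep(interval_seconds)


# Example with a hypothetical host and port:
# probe("jenkins.example.com", 8080)
```

If the output never shows UNREACHABLE around the time a build is aborted, a plain connectivity outage becomes a less likely explanation.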

Hi, thank you for helping with this issue.
I believe the connection is inbound, because I launch agent.jar on the node machine, as opposed to connecting via SSH, which would be outbound? I’m not sure that’s what you meant by the question, though.

  • The test job described above launches a cmd.exe in the background, and it is terminated instantly when the job gets aborted.
  • I looked at the agent’s Java console output, and when jobs are aborted the last lines appear (the ones that start with hudson.remoting.RemoteInvocationHandler). I looked up what these mean, but I’m still unsure whether they indicate an exception or an underlying problem, or whether they are just stats being logged to the console. This is the full output:
ago 08, 2025 11:59:05 A. M. org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using \j\remoting as a remoting work directory
ago 08, 2025 11:59:05 A. M. org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to \j\remoting
ago 08, 2025 11:59:05 A. M. hudson.remoting.Launcher createEngine
INFO: Setting up agent: REDACTED_NODE
ago 08, 2025 11:59:05 A. M. hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3206.vb_15dcf73f6a_9
ago 08, 2025 11:59:05 A. M. org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using \j\remoting as a remoting work directory
ago 08, 2025 11:59:06 A. M. hudson.remoting.Launcher$CuiListener status
INFO: WebSocket connection open
ago 08, 2025 11:59:06 A. M. hudson.remoting.Launcher$CuiListener status
INFO: Connected
ago 08, 2025 12:54:11 P. M. hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
INFO: rate(1min) = 104,0±49,7/sec; rate(5min) = 102,0±42,9/sec; rate(15min) = 100,1±64,7/sec; rate(total) = 102,5±94,6/sec; N = 650
ago 08, 2025 12:55:11 P. M. hudson.remoting.RemoteInvocationHandler$Unexporter reportStats
INFO: rate(1min) = 132,0±134,2/sec; rate(5min) = 107,8±72,3/sec; rate(15min) = 102,0±72,9/sec; rate(total) = 103,0±96,2/sec; N = 652
ago 08, 2025 12:56:11 P. M. hudson.remoting.RemoteInvocationHandler$Unexporter reportStats

  • Agent logs in the Jenkins UI do not show any indication of a lost connection. Full output:

Inbound agent connected from REDACTED_IP_ADDRESS
Remoting version: 3206.vb_15dcf73f6a_9
Launcher: JNLPLauncher
Communication Protocol: WebSocket
This is a Windows agent
Agent successfully connected and online

I don’t believe it can be a firewall-related issue, because I have another node physically next to the one with the problem, under the same network configuration, and it doesn’t show this “Agent was removed” behaviour.

Thanks again, everyone, for your time. I’d greatly appreciate any further tips or pointers to troubleshoot this.