Jenkins agents don't reconnect on Jenkins controller restart

Hi,

Our Jenkins setup is built with the following Docker images:

Jenkins Controller based on jenkins/jenkins:lts-jdk17
Jenkins Agent based on jenkins/agent:latest-jdk17 or jenkins/inbound-agent:latest

The agents were configured as described in the README: docker-agent/README_inbound-agent.md at 8c7824058ad86f7f18950cab18ecd4377c1e3872 · jenkinsci/docker-agent · GitHub

We are running three agents. If we restart our Jenkins controller, we get the following log messages, but the agents don't reconnect:

Log from the image based on jenkins/agent:latest-jdk17:

May 19, 2025 2:02:16 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/jenkins/remoting as a remoting work directory
May 19, 2025 2:02:17 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /opt/jenkins/remoting
May 19, 2025 2:02:17 PM hudson.remoting.Launcher createEngine
INFO: Setting up agent: jenkins-node2
May 19, 2025 2:02:17 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3309.v27b_9314fd1a_4
May 19, 2025 2:02:17 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/jenkins/remoting as a remoting work directory
May 19, 2025 2:02:17 PM hudson.remoting.Launcher$CuiListener status
INFO: WebSocket connection open
May 19, 2025 2:02:17 PM hudson.remoting.Launcher$CuiListener status
INFO: Connected
May 19, 2025 2:03:37 PM hudson.remoting.Launcher$CuiListener status
INFO: Read side closed
May 19, 2025 2:03:37 PM hudson.remoting.Launcher$CuiListener status
INFO: Terminated
May 19, 2025 2:03:37 PM hudson.remoting.Launcher$CuiListener status
INFO: Performing onReconnect operation.
May 19, 2025 2:03:37 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$EngineListenerAdapterImpl onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@5e4f2e5f

Log from the image based on jenkins/inbound-agent:latest:

May 19, 2025 12:03:10 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/jenkins/remoting as a remoting work directory
May 19, 2025 12:03:10 PM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
INFO: Both error and output logs will be printed to /opt/jenkins/remoting
May 19, 2025 12:03:10 PM hudson.remoting.Launcher createEngine
INFO: Setting up agent: jenkins-node4
May 19, 2025 12:03:10 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3309.v27b_9314fd1a_4
May 19, 2025 12:03:10 PM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
INFO: Using /opt/jenkins/remoting as a remoting work directory
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Locating server among [http://jenkins.domain.org:8080/]
May 19, 2025 12:03:11 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver resolve
INFO: Remoting server accepts the following protocols: [JNLP4-connect, Ping]
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Agent discovery successful
  Agent address: jenkins.domain.org
  Agent port:    50000
  Identity:      f6:1a:33:01:00:0a:9b:27:4b:a5:e2:04:1f:e7:0e:64
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Handshaking
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Connecting to jenkins.domain.org:50000
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Server reports protocol JNLP4-connect-proxy not supported, skipping
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Trying protocol: JNLP4-connect
May 19, 2025 12:03:11 PM org.jenkinsci.remoting.protocol.impl.BIONetworkLayer$Reader run
INFO: Waiting for ProtocolStack to start.
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Remote identity confirmed: f6:1a:33:01:00:0a:9b:27:4b:a5:e2:04:1f:e7:0e:64
May 19, 2025 12:03:11 PM hudson.remoting.Launcher$CuiListener status
INFO: Connected
May 19, 2025 12:03:37 PM hudson.remoting.Launcher$CuiListener status
INFO: Terminated
May 19, 2025 12:03:47 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFO: Failed to connect to http://jenkins.domain.org:8080/. Will try again: java.net.ConnectException Connection refused
May 19, 2025 12:04:02 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFO: Failed to connect to http://jenkins.domain.org:8080/. Will try again: java.net.SocketTimeoutException Read timed out
May 19, 2025 12:04:12 PM org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver waitForReady
INFO: Controller isn't ready to talk to us on http://jenkins.domain.org:8080/tcpSlaveAgentListener/. Will try again: response code=503
May 19, 2025 12:04:22 PM hudson.remoting.Launcher$CuiListener status
INFO: Performing onReconnect operation.
May 19, 2025 12:04:22 PM jenkins.slaves.restarter.JnlpSlaveRestarterInstaller$EngineListenerAdapterImpl onReconnect
INFO: Restarting agent via jenkins.slaves.restarter.UnixSlaveRestarter@30914c32

We think the problem is that the UnixSlaveRestarter is called, but the agent inside the Docker container doesn't actually restart.

Do you have any hints on how we can update and restart our Jenkins controller and have the agents reconnect automatically?

I have found a workaround. I added the “-noReconnect” parameter to our Jenkins agents. As a result, the Jenkins agents will terminate, and because of our “restart=always” parameter in the docker run command, the container will automatically restart afterward.
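
To illustrate why this works, here is a minimal Python sketch of the idea behind it: the process exits instead of trying to reconnect by itself, and something outside of it starts it again. This is not how Docker implements its restart policy, and the supervise function and the fake agent command are hypothetical names used only for this illustration. In the real setup, Docker plays the supervisor role and the Jenkins agent started with -noReconnect is the supervised process.

```python
import subprocess
import sys
import time


def supervise(command, max_launches, delay_seconds=0.0):
    """Relaunch `command` each time it exits, up to `max_launches` times.

    Returns the exit codes, one per launch. A container restart policy
    loops indefinitely; the limit here only keeps the sketch finite.
    """
    exit_codes = []
    for _ in range(max_launches):
        # Blocks until the process exits, like an agent that runs until
        # it loses its connection to the controller and then terminates.
        result = subprocess.run(command)
        exit_codes.append(result.returncode)
        # Wait a little before starting the process again.
        time.sleep(delay_seconds)
    return exit_codes


if __name__ == "__main__":
    # Hypothetical stand-in for the agent: a process that exits right away.
    fake_agent = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(fake_agent, max_launches=3))
```

The point of the design is that every launch is a completely fresh start of the process, so it does not depend on the in-process restart that seems to fail in the logs above.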


This workaround solves the issue for the moment. However, I do have other agents, running in both the same and different networks, that reconnect perfectly fine, so I am wondering whether there is a proper solution to this problem.