EC2 build instance intermittently disconnects from controller

Hello All,
I’m trying to troubleshoot an intermittent connectivity issue between my Jenkins controller and a specific build node. I currently have three EC2 instances: the Jenkins controller (version 2.332.1), an ARM build agent (the instance that keeps disconnecting from the controller), and a 64-bit x86 build agent (which always maintains a good connection). The controller is hosted on Amazon Linux 2, and both build agents run Ubuntu 20.04. I have been seeing the intermittent connectivity issue for the last two months, and since last week the ARM instance has been disconnecting daily.

The Java versions are the same. I’m not seeing any errors in the logs that point me towards a fix. Any help is greatly appreciated.

Thank you

Johnny M

Agent Configs

13:07:33  Started by user administrator
13:07:33  [Pipeline] Start of Pipeline
13:07:33  [Pipeline] echo
13:07:33  ====================
13:07:34  [Pipeline] echo
13:07:34  Name: <REMOVED>
13:07:34  [Pipeline] echo
13:07:34  getLabelString: <REMOVED>
13:07:34  [Pipeline] echo
13:07:34  getNumExectutors: 4
13:07:34  [Pipeline] echo
13:07:34  getRemoteFS: /home/ubuntu
13:07:34  [Pipeline] echo
13:07:34  getMode: NORMAL
13:07:34  [Pipeline] echo
13:07:34  getRootPath: /home/ubuntu
13:07:34  [Pipeline] echo
13:07:34  getDescriptor: hudson.slaves.DumbSlave$DescriptorImpl@<REMOVED>
13:07:34  [Pipeline] echo
13:07:34  getComputer: hudson.slaves.SlaveComputer@<REMOVED>
13:07:34  [Pipeline] echo
13:07:34  	computer.isAcceptingTasks: true
13:07:34  [Pipeline] echo
13:07:34  	computer.isLaunchSupported: true
13:07:34  [Pipeline] echo
13:07:34  	computer.getConnectTime: 1647283561016
13:07:34  [Pipeline] echo
13:07:34  	computer.getDemandStartMilliseconds: 9223372036854775807
13:07:34  [Pipeline] echo
13:07:34  	computer.isOffline: false
13:07:34  [Pipeline] echo
13:07:34  	computer.countBusy: 0
13:07:34  [Pipeline] echo
13:07:34  	computer.getLog: SSHLauncher{host='<REMOVED>', port=22, credentialsId='<REMOVED>', jvmOptions='', javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', launchTimeoutSeconds=60, maxNumRetries=10, retryWaitTime=15, sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.KnownHostsFileKeyVerificationStrategy, tcpNoDelay=true, trackCredentials=true}
13:07:34  [03/14/22 11:02:45] [SSH] Opening SSH connection to <REMOVED>:22.
13:07:34  The kexTimeout (65000 ms) expired.
13:07:34  SSH Connection failed with IOException: "The kexTimeout (65000 ms) expired.", retrying in 15 seconds. There are 10 more retries left.
13:07:34  The kexTimeout (65000 ms) expired.
13:07:34  SSH Connection failed with IOException: "The kexTimeout (65000 ms) expired.", retrying in 15 seconds. There are 9 more retries left.
13:07:34  The kexTimeout (65000 ms) expired.
13:07:34  SSH Connection failed with IOException: "The kexTimeout (65000 ms) expired.", retrying in 15 seconds. There are 8 more retries left.
13:07:34  connect timed out
13:07:34  SSH Connection failed with IOException: "connect timed out", retrying in 15 seconds. There are 7 more retries left.
13:07:34  connect timed out
13:07:34  SSH Connection failed with IOException: "connect timed out", retrying in 15 seconds. There are 6 more retries left.
13:07:34  Searching for <REMOVED> in /var/lib/jenkins/.ssh/known_hosts
13:07:34  Searching for <REMOVED>:22 in /var/lib/jenkins/.ssh/known_hosts
13:07:34  [03/14/22 11:09:15] [SSH] SSH host key matches key in Known Hosts file. Connection will be allowed.
13:07:34  [03/14/22 11:09:15] [SSH] Authentication successful.
13:07:34  [03/14/22 11:09:17] [SSH] The remote user's environment is:
13:07:34  BASH=/usr/bin/bash
13:07:34  BASHOPTS=checkwinsize:cmdhist:complete_fullquote:extquote:force_fignore:globasciiranges:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath
13:07:34  BASH_ALIASES=()
13:07:34  BASH_ARGC=([0]="0")
13:07:34  BASH_ARGV=()
13:07:34  BASH_CMDS=()
13:07:34  BASH_EXECUTION_STRING=set
13:07:34  BASH_LINENO=()
13:07:34  BASH_SOURCE=()
13:07:34  BASH_VERSINFO=([0]="5" [1]="0" [2]="17" [3]="1" [4]="release" [5]="aarch64-unknown-linux-gnu")
13:07:34  BASH_VERSION='5.0.17(1)-release'
13:07:34  DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
13:07:34  DIRSTACK=()
13:07:34  EUID=1000
13:07:34  GROUPS=()
13:07:34  HOME=/home/ubuntu
13:07:34  HOSTNAME=<REMOVED>
13:07:34  HOSTTYPE=aarch64
13:07:34  IFS=$' \t\n'
13:07:34  LANG=C.UTF-8
13:07:34  LOGNAME=ubuntu
13:07:34  MACHTYPE=aarch64-unknown-linux-gnu
13:07:34  MOTD_SHOWN=pam
13:07:34  OPTERR=1
13:07:34  OPTIND=1
13:07:34  OSTYPE=linux-gnu
13:07:34  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
13:07:34  PIPESTATUS=([0]="0")
13:07:34  PPID=1118
13:07:34  PS4='+ '
13:07:34  PWD=/home/ubuntu
13:07:34  SHELL=/bin/bash
13:07:34  SHELLOPTS=braceexpand:hashall:interactive-comments
13:07:34  SHLVL=1
13:07:34  SSH_CLIENT='<REMOVED>'
13:07:34  SSH_CONNECTION='<REMOVED>'
13:07:34  TERM=dumb
13:07:34  UID=1000
13:07:34  USER=ubuntu
13:07:34  XDG_RUNTIME_DIR=/run/user/1000
13:07:34  XDG_SESSION_CLASS=user
13:07:34  XDG_SESSION_ID=1
13:07:34  XDG_SESSION_TYPE=tty
13:07:34  _=']'
13:07:34  Checking Java version in the PATH
13:07:34  openjdk version "1.8.0_312"
13:07:34  OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~20.04-b07)
13:07:34  OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
13:07:34  [03/14/22 11:09:17] [SSH] Checking java version of /home/ubuntu/jdk/bin/java
13:07:34  Couldn't figure out the Java version of /home/ubuntu/jdk/bin/java
13:07:34  bash: /home/ubuntu/jdk/bin/java: No such file or directory
13:07:34  
13:07:34  [03/14/22 11:09:17] [SSH] Checking java version of java
13:07:34  [03/14/22 11:09:18] [SSH] java -version returned 1.8.0_312.
13:07:34  [03/14/22 11:09:18] [SSH] Starting sftp client.
13:07:34  [03/14/22 11:09:18] [SSH] Copying latest remoting.jar...
13:07:34  Source agent hash is <REMOVED>. Installed agent hash is <REMOVED>
13:07:34  Verified agent jar. No update is necessary.
13:07:34  Expanded the channel window size to 4MB
13:07:34  [03/14/22 11:09:18] [SSH] Starting agent process: cd "/home/ubuntu" && java  -jar remoting.jar -workDir /home/ubuntu -jar-cache /home/ubuntu/remoting/jarCache
13:07:34  Mar 14, 2022 11:09:19 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
13:07:34  INFO: Using /home/ubuntu/remoting as a remoting work directory
13:07:34  Mar 14, 2022 11:09:19 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
13:07:34  INFO: Both error and output logs will be printed to /home/ubuntu/remoting
13:07:34  <===[JENKINS REMOTING CAPACITY]===>channel started
13:07:34  Remoting version: 4.12
13:07:34  This is a Unix agent
13:07:34  Evacuated stdout
13:07:34  Agent successfully connected and online
13:07:34  
13:07:34  [Pipeline] echo
13:07:34  	computer.getBuilds: []
13:07:34  [Pipeline] echo
13:07:34  ====================
13:07:34  [Pipeline] echo
13:07:34  Name: <REMOVED>
13:07:34  [Pipeline] echo
13:07:34  getLabelString: <REMOVED>
13:07:34  [Pipeline] echo
13:07:34  getNumExectutors: 4
13:07:34  [Pipeline] echo
13:07:34  getRemoteFS: /home/ubuntu
13:07:34  [Pipeline] echo
13:07:34  getMode: NORMAL
13:07:34  [Pipeline] echo
13:07:34  getRootPath: /home/ubuntu
13:07:34  [Pipeline] echo
13:07:34  getDescriptor: hudson.slaves.DumbSlave$DescriptorImpl@<REMOVED>
13:07:34  [Pipeline] echo
13:07:34  getComputer: hudson.slaves.SlaveComputer@<REMOVED>
13:07:34  [Pipeline] echo
13:07:34  	computer.isAcceptingTasks: true
13:07:34  [Pipeline] echo
13:07:34  	computer.isLaunchSupported: true
13:07:34  [Pipeline] echo
13:07:34  	computer.getConnectTime: 1647284131658
13:07:34  [Pipeline] echo
13:07:34  	computer.getDemandStartMilliseconds: 9223372036854775807
13:07:34  [Pipeline] echo
13:07:34  	computer.isOffline: false
13:07:34  [Pipeline] echo
13:07:34  	computer.countBusy: 0
13:07:34  [Pipeline] echo
13:07:34  	computer.getLog: SSHLauncher{host='<REMOVED>', port=22, credentialsId='<REMOVED>', jvmOptions='', javaPath='', prefixStartSlaveCmd='', suffixStartSlaveCmd='', launchTimeoutSeconds=60, maxNumRetries=10, retryWaitTime=15, sshHostKeyVerificationStrategy=hudson.plugins.sshslaves.verifiers.KnownHostsFileKeyVerificationStrategy, tcpNoDelay=true, trackCredentials=true}
13:07:34  [03/14/22 10:17:15] [SSH] Opening SSH connection to <REMOVED>:22.
13:07:34  Searching for <REMOVED> in /var/lib/jenkins/.ssh/known_hosts
13:07:34  Searching for <REMOVED>:22 in /var/lib/jenkins/.ssh/known_hosts
13:07:34  [03/14/22 10:17:15] [SSH] SSH host key matches key in Known Hosts file. Connection will be allowed.
13:07:34  [03/14/22 10:17:16] [SSH] Authentication successful.
13:07:34  [03/14/22 10:17:17] [SSH] The remote user's environment is:
13:07:34  BASH=/usr/bin/bash
13:07:34  BASHOPTS=checkwinsize:cmdhist:complete_fullquote:extquote:force_fignore:globasciiranges:hostcomplete:interactive_comments:progcomp:promptvars:sourcepath
13:07:34  BASH_ALIASES=()
13:07:34  BASH_ARGC=([0]="0")
13:07:34  BASH_ARGV=()
13:07:34  BASH_CMDS=()
13:07:34  BASH_EXECUTION_STRING=set
13:07:34  BASH_LINENO=()
13:07:34  BASH_SOURCE=()
13:07:34  BASH_VERSINFO=([0]="5" [1]="0" [2]="17" [3]="1" [4]="release" [5]="x86_64-pc-linux-gnu")
13:07:34  BASH_VERSION='5.0.17(1)-release'
13:07:34  DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
13:07:34  DIRSTACK=()
13:07:34  EUID=1000
13:07:34  GROUPS=()
13:07:34  HOME=/home/ubuntu
13:07:34  HOSTNAME=<REMOVED>
13:07:34  HOSTTYPE=x86_64
13:07:34  IFS=$' \t\n'
13:07:34  LANG=C.UTF-8
13:07:34  LOGNAME=ubuntu
13:07:34  MACHTYPE=x86_64-pc-linux-gnu
13:07:34  MOTD_SHOWN=pam
13:07:34  OPTERR=1
13:07:34  OPTIND=1
13:07:34  OSTYPE=linux-gnu
13:07:34  PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
13:07:34  PIPESTATUS=([0]="0")
13:07:34  PPID=2642295
13:07:34  PS4='+ '
13:07:34  PWD=/home/ubuntu
13:07:34  SHELL=/bin/bash
13:07:34  SHELLOPTS=braceexpand:hashall:interactive-comments
13:07:34  SHLVL=1
13:07:34  SSH_CLIENT='<REMOVED>'
13:07:34  SSH_CONNECTION='<REMOVED>'
13:07:34  TERM=dumb
13:07:34  UID=1000
13:07:34  USER=ubuntu
13:07:34  XDG_RUNTIME_DIR=/run/user/1000
13:07:34  XDG_SESSION_CLASS=user
13:07:34  XDG_SESSION_ID=7339
13:07:34  XDG_SESSION_TYPE=tty
13:07:34  _=']'
13:07:34  Checking Java version in the PATH
13:07:34  openjdk version "1.8.0_312"
13:07:34  OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~20.04-b07)
13:07:34  OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
13:07:34  [03/14/22 10:17:18] [SSH] Checking java version of /home/ubuntu/jdk/bin/java
13:07:34  Couldn't figure out the Java version of /home/ubuntu/jdk/bin/java
13:07:34  bash: /home/ubuntu/jdk/bin/java: No such file or directory
13:07:34  
13:07:34  [03/14/22 10:17:18] [SSH] Checking java version of java
13:07:34  [03/14/22 10:17:18] [SSH] java -version returned 1.8.0_312.
13:07:34  [03/14/22 10:17:18] [SSH] Starting sftp client.
13:07:34  [03/14/22 10:17:18] [SSH] Copying latest remoting.jar...
13:07:34  Source agent hash is <REMOVED>. Installed agent hash is <REMOVED>
13:07:34  Verified agent jar. No update is necessary.
13:07:34  Expanded the channel window size to 4MB
13:07:34  [03/14/22 10:17:18] [SSH] Starting agent process: cd "/home/ubuntu" && java  -jar remoting.jar -workDir /home/ubuntu -jar-cache /home/ubuntu/remoting/jarCache
13:07:34  Mar 14, 2022 10:17:19 AM org.jenkinsci.remoting.engine.WorkDirManager initializeWorkDir
13:07:34  INFO: Using /home/ubuntu/remoting as a remoting work directory
13:07:34  Mar 14, 2022 10:17:19 AM org.jenkinsci.remoting.engine.WorkDirManager setupLogging
13:07:34  INFO: Both error and output logs will be printed to /home/ubuntu/remoting
13:07:34  <===[JENKINS REMOTING CAPACITY]===>channel started
13:07:34  Remoting version: 4.12
13:07:34  This is a Unix agent
13:07:34  Evacuated stdout
13:07:34  Agent successfully connected and online
13:07:35  
13:07:35  [Pipeline] echo
13:07:35  	computer.getBuilds: []
13:07:35  [Pipeline] End of Pipeline
13:07:35  Finished: SUCCESS
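
To make the launch log above easier to compare between the two agents, here is a minimal Python sketch that counts the SSH retry failures by cause and converts the `computer.getConnectTime` values (epoch milliseconds) to UTC. The function names and the `sample` string are hypothetical and only for illustration; the regular expression assumes the retry lines look exactly like the ones pasted above.

```python
import re
from collections import Counter
from datetime import datetime, timedelta, timezone

# Matches the retry lines printed during the SSH launch, e.g.
# SSH Connection failed with IOException: "connect timed out", retrying in 15 seconds. ...
RETRY_RE = re.compile(r'SSH Connection failed with IOException: "([^"]+)", retrying')


def count_ssh_failures(log_text):
    """Return a Counter mapping each IOException message to the number of retries it caused."""
    return Counter(RETRY_RE.findall(log_text))


def connect_time_utc(epoch_ms):
    """Convert an epoch-milliseconds value (as printed by getConnectTime) to a UTC datetime."""
    seconds, millis = divmod(epoch_ms, 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# Hypothetical sample: the five retry lines from the ARM agent's launch log above.
sample = '''\
SSH Connection failed with IOException: "The kexTimeout (65000 ms) expired.", retrying in 15 seconds. There are 10 more retries left.
SSH Connection failed with IOException: "The kexTimeout (65000 ms) expired.", retrying in 15 seconds. There are 9 more retries left.
SSH Connection failed with IOException: "The kexTimeout (65000 ms) expired.", retrying in 15 seconds. There are 8 more retries left.
SSH Connection failed with IOException: "connect timed out", retrying in 15 seconds. There are 7 more retries left.
SSH Connection failed with IOException: "connect timed out", retrying in 15 seconds. There are 6 more retries left.
'''

if __name__ == "__main__":
    for message, count in count_ssh_failures(sample).items():
        print(f"{count} x {message}")
    # The two getConnectTime values from the log above (first agent, second agent).
    for epoch_ms in (1647283561016, 1647284131658):
        print(epoch_ms, "->", connect_time_utc(epoch_ms).isoformat())
```

For the log above, this reports three kexTimeout failures and two connect timeouts for the ARM agent, and the two connect times convert to 18:46:01 and 18:55:31 UTC on 2022-03-14. The x86 agent's launch log shows no retry lines.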

Remote logs from the Jenkins ARM build agent, located at /remoting/logs

WARNING: Process leaked file descriptors. See https://www.jenkins.io/redirect/troubleshooting/process-leaked-file-descriptors for more information
java.lang.Exception
	at hudson.Proc$LocalProc.join(Proc.java:342)
	at hudson.Proc.joinWithTimeout(Proc.java:174)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2664)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2601)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommandIn(CliGitAPIImpl.java:2597)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.launchCommand(CliGitAPIImpl.java:1968)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.reset(CliGitAPIImpl.java:684)
	at org.jenkinsci.plugins.gitclient.CliGitAPIImpl.clean(CliGitAPIImpl.java:1058)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at hudson.remoting.RemoteInvocationHandler$RPCRequest.perform(RemoteInvocationHandler.java:924)
	at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:902)
	at hudson.remoting.RemoteInvocationHandler$RPCRequest.call(RemoteInvocationHandler.java:853)
	at hudson.remoting.UserRequest.perform(UserRequest.java:211)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:376)
	at hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)