I’m a relative newb to Jenkins but have had to take over our instance after we let go of the engineer who previously managed it. He documented nothing and used personal keys/tokens for everything, despite being instructed to switch to deploy keys and service accounts, so it all broke when I disabled his accounts. I replaced his keys with centrally managed ones, but now I can’t get any jobs to run. The only thing that changed was the credentials, and I have verified that each of them has the requisite permissions.
We use an agent running on an EC2 instance to launch additional nodes from AMIs, based on a cloud config in Jenkins. Here is the config of that agent:
When I run one of our jobs, I get the following console output:
00:00:00.001 Running as SYSTEM
00:00:00.002 [EnvInject] - Loading node environment variables.
00:00:00.018 Building remotely on dinky-launcher (launcher) in workspace /home/ubuntu/workspace/ProcessShapefileToDB
00:00:00.018 [WS-CLEANUP] Deleting project workspace...
00:00:00.018 [WS-CLEANUP] Deferred wipeout is used...
00:00:00.034 [WS-CLEANUP] Done
00:00:00.038 The recommended git tool is: NONE
00:00:00.070 using credential grant-github-auth
00:00:00.081 Cloning the remote Git repository
00:00:00.086 Cloning repository https://github.com/Roadway-Management-Technologies/RoadwayManagement
00:00:00.098 > git init /home/ubuntu/workspace/ProcessShapefileToDB # timeout=10
00:00:00.098 Fetching upstream changes from https://github.com/Roadway-Management-Technologies/RoadwayManagement
00:00:00.099 > git --version # timeout=10
00:00:00.099 > git --version # 'git version 2.34.1'
00:00:00.100 using GIT_ASKPASS to set credentials testing auth issues
00:00:00.101 > git fetch --tags --force --progress -- https://github.com/Roadway-Management-Technologies/RoadwayManagement +refs/heads/*:refs/remotes/origin/* # timeout=10
00:00:02.249 > git config remote.origin.url https://github.com/Roadway-Management-Technologies/RoadwayManagement # timeout=10
00:00:02.254 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # timeout=10
00:00:02.265 Avoid second fetch
00:00:02.267 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
00:00:02.276 Checking out Revision 7dc47ce8e465fc4ae8410b9de0bc682557bfd3d5 (refs/remotes/origin/master)
00:00:02.278 > git config core.sparsecheckout # timeout=10
00:00:02.283 > git checkout -f 7dc47ce8e465fc4ae8410b9de0bc682557bfd3d5 # timeout=10
00:00:02.490 Commit message: "Merge pull request #35 from Roadway-Management-Technologies/Shapefile-Preservation-Effort-PCI-Import"
00:00:02.493 > git rev-list --no-walk 7dc47ce8e465fc4ae8410b9de0bc682557bfd3d5 # timeout=10
00:00:02.518 Triggering ProcessShapefileToDB » 38,mdi-v4j-c6a2xl
00:00:07.520 Configuration ProcessShapefileToDB » 38,mdi-v4j-c6a2xl is still in the queue: All nodes of label ‘mdi-v4j-c6a2xl’ are offline
00:00:22.524 Configuration ProcessShapefileToDB » 38,mdi-v4j-c6a2xl is still in the queue: ‘EC2 (mindevimage-ec2-jenkins-agents) - parallel processor mdi-v4j (c6a.2xl) (i-08bfd217367a70aba)’ is offline
00:01:57.546 Configuration ProcessShapefileToDB » 38,mdi-v4j-c6a2xl is still in the queue: ‘EC2 (mindevimage-ec2-jenkins-agents) - parallel processor mdi-v4j (c6a.2xl) (i-08bfd217367a70aba)’ is offline
00:01:57.546 ‘dinky-launcher’ doesn’t have label ‘mdi-v4j-c6a2xl’
00:02:37.555 ProcessShapefileToDB » 38,mdi-v4j-c6a2xl completed with result FAILURE
00:02:37.576 Finished: FAILURE
The child EC2 instance spawns just fine and I can SSH into the box without issue. The Jenkins log itself outputs this after successfully connecting to the spawned instance via SSH.
Waiting for SSH to come up. Sleeping 5.
Jul 23, 2024 3:23:26 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to 172.31.47.240 on port 22, with timeout 10000.
Jul 23, 2024 3:23:27 PM INFO hudson.plugins.ec2.EC2Cloud log
The SSH key ssh-ed25519 ff:5a:a6:22:76:c4:e7:da:17:6b:06:05:77:c3:c4:47 has been successfully checked against the instance console for connections to EC2 (mindevimage-ec2-jenkins-agents) - parallel processor mdi-v4j (c6a.2xl) (i-08bfd217367a70aba)
Jul 23, 2024 3:23:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Connected via SSH.
Jul 23, 2024 3:23:27 PM INFO hudson.plugins.ec2.EC2Cloud log
connect fresh as root
Jul 23, 2024 3:23:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Connecting to 172.31.47.240 on port 22, with timeout 10000.
Jul 23, 2024 3:23:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Connection allowed after the host key has been verified
Jul 23, 2024 3:23:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Connected via SSH.
Jul 23, 2024 3:23:27 PM INFO hudson.plugins.ec2.EC2Cloud log
Creating tmp directory (/tmp) if it does not exist
Jul 23, 2024 3:23:30 PM INFO hudson.plugins.ec2.EC2Cloud log
Verifying: java -fullversion
Jul 23, 2024 3:23:33 PM INFO hudson.plugins.ec2.EC2Cloud log
Verifying: which scp
Jul 23, 2024 3:23:33 PM INFO hudson.plugins.ec2.EC2Cloud log
Copying remoting.jar to: /tmp
Jul 23, 2024 3:23:33 PM INFO hudson.plugins.ec2.EC2Cloud log
Launching remoting agent (via Trilead SSH2 Connection): java -jar /tmp/remoting.jar -workDir /home/ubuntu/
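In case it helps anyone reading along, here is a minimal Python sketch of how I have been picking out the last step the EC2 plugin logged before going quiet. It assumes the log keeps the two-line shape shown above (a timestamp/level/logger header line followed by the message); the names `parse_entries` and `SAMPLE` are made up for illustration and are not part of Jenkins or the plugin.

```python
import re
from datetime import datetime

# Header lines look like:
#   Jul 23, 2024 3:23:33 PM INFO hudson.plugins.ec2.EC2Cloud log
HEADER = re.compile(
    r"^(?P<ts>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) "
    r"(?P<level>[A-Z]+) (?P<logger>\S+) (?P<method>\S+)$"
)


def parse_entries(text):
    """Pair each header line with the message line(s) that follow it."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            # Month names and AM/PM are parsed with the default (English) locale.
            stamp = datetime.strptime(match.group("ts"), "%b %d, %Y %I:%M:%S %p")
            current = {"time": stamp, "level": match.group("level"), "message": ""}
            entries.append(current)
        elif current is not None:
            # Anything that is not a header belongs to the most recent entry.
            current["message"] = (current["message"] + " " + line).strip()
    return entries


# Hypothetical sample: the last two entries copied from the log above.
SAMPLE = """\
Jul 23, 2024 3:23:33 PM INFO hudson.plugins.ec2.EC2Cloud log
Copying remoting.jar to: /tmp
Jul 23, 2024 3:23:33 PM INFO hudson.plugins.ec2.EC2Cloud log
Launching remoting agent (via Trilead SSH2 Connection): java -jar /tmp/remoting.jar -workDir /home/ubuntu/
"""

entries = parse_entries(SAMPLE)
last = entries[-1]
print(last["time"].isoformat(), last["level"], last["message"])
```

Run against the full log above, the last entry is the "Launching remoting agent" line, and nothing is logged after it for this instance.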
I’m at a loss here. Anyone have any suggestions for things to modify/investigate? I took over this team about 3 months ago and have no one else left with any knowledge of how this system was configured. Fun, I know.
EDIT: server info, if it’s helpful:
Jenkins: 2.414.2
OS: Linux - 6.5.0-1023-aws
Java: 17.0.11 - Ubuntu (OpenJDK 64-Bit Server VM)