Hi community,
Our team has identified a test scenario that can cause the Jenkins agent to crash or reboot. However, even when the agent crashes or reboots during execution, the Jenkins build itself does not fail or exit. To reproduce this, I created a simple pipeline that simulates the scenario:
Pipeline:
node("192.168.111.134") {
    sh '''
        i=1
        while [ $i -le 600 ]
        do
            echo " $i"
            ((i++))
            sleep 5
        done
    '''
}
While the pipeline was running, I rebooted the agent “192.168.111.134”. My question: how can I ensure that the Jenkins build fails or exits immediately when the agent crashes? Here is the pipeline output:
15:34:41 Running on 192.168.111.134 in /home/tcnsh/k8s-workspace/workspace/test
15:34:41 [Pipeline] {
15:34:42 [Pipeline] sh
15:34:43 + i=1
15:34:43 + '[' 1 -le 600 ']'
15:34:43 + echo ' 1'
15:34:43 1
15:34:43 + (( i++ ))
15:34:43 + sleep 5
15:34:49 + '[' 2 -le 600 ']'
15:34:49 + echo ' 2'
15:34:49 2
15:34:49 + (( i++ ))
15:34:49 + sleep 5
15:34:54 + '[' 3 -le 600 ']'
15:34:54 + echo ' 3'
15:34:54 3
15:34:54 + (( i++ ))
15:34:54 + sleep 5
15:34:58 + '[' 4 -le 600 ']'
15:34:58 + echo ' 4'
15:34:58 4
15:34:58 + (( i++ ))
15:34:58 + sleep 5
15:35:00 Cannot contact 192.168.111.134: hudson.remoting.ChannelClosedException: Channel "hudson.remoting.Channel@777efc9f:192.168.111.134": Remote call on 192.168.111.134 failed. The channel is closing down or has closed down
15:45:01 wrapper script does not seem to be touching the log file in /home/tcnsh/k8s-workspace/workspace/test@tmp/durable-85599b58
15:45:01 (JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
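For reference, one possible mitigation can be sketched with the built-in `timeout` step and its `activity: true` option, which aborts the enclosed block when no new log output appears within the given window. This is only a sketch under assumptions, not a confirmed fix: it reacts to log inactivity rather than to the agent crash itself, the 1-minute window is an arbitrary value chosen for illustration, and how quickly the build actually ends after the timeout fires may depend on the installed plugin versions.

```groovy
node("192.168.111.134") {
    // Assumption: 1 minute without any log output means the agent is gone.
    // The value is arbitrary; it must be longer than the longest quiet
    // period the script can legitimately have (here, the 5-second sleep).
    timeout(time: 1, unit: 'MINUTES', activity: true) {
        sh '''
            i=1
            while [ $i -le 600 ]
            do
                echo " $i"
                ((i++))
                sleep 5
            done
        '''
    }
}
```

In the log above, the last output before the long gap is the "Cannot contact" message at 15:35:00, so with this setting the timeout would be expected to fire about a minute after that message instead of the build sitting idle until 15:45:01.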