Jenkins Agent should wait for pipeline to complete on SIGTERM

I’m using the Kubernetes plugin to run some jobs, but the Jenkins agent does not behave as expected during maintenance on the Kubernetes cluster itself.

Here is the scenario.

In the pipeline below I have two stages, plus the jnlp container, which is supposed to stay connected to Jenkins for as long as the pod is running.

If I delete the pod while the first stage is running, both containers keep running for an additional 350 seconds while the pod sits in the Terminating state.

Here is what happens in the containers inside the pod:

  • In the cicd container, the loop in the first stage keeps printing as expected, all the way to 120.
  • The jnlp agent disconnects from Jenkins and never gets to run the second stage: the agent container is still in the Running state, but the agent process inside it has already died and disconnected from Jenkins.
    Stream closed EOF for jenkins-deployment-k8s-agents/dev-jenkins-agent-8-1cm3b-84sxp-rx5bw (jnlp)
  • The second stage does not run at all because the agent process is already dead; the agent container also has no idea what is happening in the first stage after receiving the SIGTERM.
pipeline {
    agent {
        kubernetes {
            yaml """
                apiVersion: v1
                kind: Pod
                spec:
                  terminationGracePeriodSeconds: 350
                  containers:
                  - name: cicd
                    image: testimage
            """
        }
    }
    stages {
        stage('stage1') {
            steps {
                container('cicd') {
                    sh '''
                    x=1; while [ $x -le 120 ]; do date && sleep 1 && echo $(( x++ )) >> /tmp/num; done
                    '''
                }
            }
        }
        stage('stage2') {
            steps {
                container('cicd') {
                    sh '''
                    x=200; while [ $x -le 320 ]; do date && sleep 1 && echo $(( x++ )) >> /tmp/num; done
                    '''
                }
            }
        }
    }
}

"The same behaviour impacts us during any cluster node upgrades which terminates the pods and interrupts the whole pipeline. I don’t want jenkins agent pods to be removed whenever there is a terminate (SIGTERM) command, rather the agent should remain connected to controller(controller) and let my jenkins finish the pipeline since I already defined a long enough termination grace period.

Can anyone suggest a solution?

My advice here FWIW: do not rely on terminationGracePeriodSeconds, and do not try to recover if a Pod is deleted. Instead either retry the whole block, where feasible, or, if not, arrange for the Pod not to get deleted prematurely in the first place.
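For the retry route, recent versions of the Kubernetes plugin can re-run the whole agent block when the pod disappears mid-build. A minimal sketch, assuming your plugin version supports the retries option on the kubernetes agent directive (check your installed version); note that a retry re-runs every stage from the top, so this only fits builds whose steps are safe to repeat:

pipeline {
    agent {
        kubernetes {
            // Reschedule the pod and re-run the stages if the agent pod is
            // terminated mid-build, e.g. by a node drain during an upgrade.
            retries 2
            yaml """
                apiVersion: v1
                kind: Pod
                spec:
                  containers:
                  - name: cicd
                    image: testimage
            """
        }
    }
    stages {
        stage('stage1') {
            steps {
                container('cicd') {
                    // Only idempotent work belongs here, since a retry
                    // starts this stage again from scratch.
                    sh 'date'
                }
            }
        }
    }
}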

I fixed the problem by trapping SIGTERM in the agent pod spec. The trap keeps the agent process alive and connected to the Jenkins controller for the duration of the grace period.

apiVersion: v1
kind: Pod
spec:
  terminationGracePeriodSeconds: 120
  containers:
  - name: jnlp
    command:
      - sh
      - -xc
      - |
        # The container runtime delivers SIGTERM only to PID 1 (this shell).
        # With a trap installed, the shell survives the signal, and the
        # jenkins-agent child keeps running until the grace period expires.
        trap 'echo "Pod received SIGTERM, pipeline will continue until terminationGracePeriodSeconds ends; $(date +"%d-%m-%Y %T.%N %Z")"' SIGTERM
        /usr/local/bin/jenkins-agent
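
For completeness, here is the same jnlp override plugged into the declarative pipeline from the question. This is a sketch: testimage is the placeholder image from above, and /usr/local/bin/jenkins-agent assumes the entrypoint script shipped in the stock inbound agent image, so adjust the path if your agent image differs. Trapping TERM instead of SIGTERM is slightly more portable across shells.

pipeline {
    agent {
        kubernetes {
            yaml """
                apiVersion: v1
                kind: Pod
                spec:
                  terminationGracePeriodSeconds: 350
                  containers:
                  - name: cicd
                    image: testimage
                  - name: jnlp
                    command:
                      - sh
                      - -xc
                      - |
                        # Keep PID 1 alive on SIGTERM so the agent child keeps
                        # running through the grace period.
                        trap 'echo "SIGTERM received, letting the pipeline finish"' TERM
                        /usr/local/bin/jenkins-agent
            """
        }
    }
    stages {
        stage('stage1') {
            steps {
                container('cicd') {
                    sh 'date'
                }
            }
        }
    }
}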