Jenkins job consumes all of the inodes and then crashes the controller

I have a Jenkins job that is pretty simple. It gets some data from an API and then runs some shell commands. It doesn’t check out code and it doesn’t create any files.

The job does run about 100 tasks in parallel, but it works through them in batches of 3 at a time.

The task pretty much looks like this:

node() {
  sh('aws s3 ls')
  sh('aws s3 sync')
}
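
The batching itself is roughly like this (a simplified sketch of the real job; the task names and the loop are placeholders for what the job actually does):

def tasks = []
for (int i = 1; i <= 100; i++) {
  tasks << ('task-' + i)
}

// Work through the ~100 tasks three at a time.
for (int start = 0; start < tasks.size(); start += 3) {
  def branches = [:]
  for (int j = start; j < start + 3 && j < tasks.size(); j++) {
    def taskName = tasks[j]   // local copy so each closure captures its own name
    branches[taskName] = {
      node() {
        sh('aws s3 ls')
        sh('aws s3 sync')
      }
    }
  }
  parallel(branches)
}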

But for some reason it’s creating a large number of files on every run… and I run this job hourly on a cron.

[root@jenkins builds]# find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -nr
   4756 122
   4756 118
   4756 117
   4756 116
   4756 115
   4756 114
   4756 113
   4756 112
   4756 111

So it’s creating 4756 files per build?

All of the files are in $BUILD_NUMBER/workflow, and it’s a ton of XML files.

When I look inside these files, it appears they’re the result of the CPS execution of my pipeline:

[root@jenkins workflow]# cat 4155.xml
<?xml version='1.1' encoding='UTF-8'?>
<Tag plugin="workflow-support@3.5">
  <node class="cps.n.StepAtomNode" plugin="workflow-cps@2.82">
    <parentIds>
      <string>4154</string>
    </parentIds>
    <id>4155</id>
    <descriptorId>org.jenkinsci.plugins.workflow.steps.durable_task.ShellStep</descriptorId>
  </node>
  <actions>
    <cps.a.ArgumentsActionImpl plugin="workflow-cps@2.82">
      <arguments>
        <entry>
          <string>script</string>
          <string>    #!/usr/bin/env bash
    aws --region us-east-2 s3 ls
    </string>
        </entry>
      </arguments>
      <isUnmodifiedBySanitization>true</isUnmodifiedBySanitization>
    </cps.a.ArgumentsActionImpl>
    <wf.a.TimingAction plugin="workflow-api@2.40">
      <startTime>1637034210307</startTime>
    </wf.a.TimingAction>
    <s.a.LogStorageAction/>
  </actions>

It probably doesn’t help that the Jenkins I’m using is outdated (2.235.5), as are all of the plugins it’s running. We are currently testing 2.303.3 with all plugins updated, so within the next two weeks I should be able to tell whether it has something to do with that.

In the meantime, what is the best way to fix this for now? The job history really isn’t that important, so I could restrict it to keep only 2 or 3 builds, but maybe there is a better way to stop this from happening?
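
(For reference, restricting the history would just be something like this at the top of the Jenkinsfile; the same limit can also be set in the job configuration UI.)

properties([
  // Keep only the last 3 builds on disk
  buildDiscarder(logRotator(numToKeepStr: '3'))
])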

If the plugin versions that you are using will support it, you could adjust the speed / durability settings of that job in hopes that it will write fewer files. See Scaling Pipelines for more information on the speed / durability settings that can be adjusted.
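
For a scripted Pipeline like the snippet above, the per-job setting would look roughly like this (a sketch, assuming the workflow plugins in use are recent enough to support the durabilityHint job property; the hint can also be chosen in the job configuration UI or as a global default in the Jenkins system configuration):

properties([
  // PERFORMANCE_OPTIMIZED writes far less flow state to disk per step
  durabilityHint('PERFORMANCE_OPTIMIZED')
])

node() {
  sh('aws s3 ls')
  sh('aws s3 sync')
}

The trade-off is durability: with the performance-optimized hint, a running build may not be able to resume cleanly if the controller restarts or crashes mid-build.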


Excellent link @MarkEWaite. I had seen the durability setting before and knew roughly what it did in terms of CPS, but I had never read about it in detail.

In particular, this part of the page stood out:

“Yes, if your Pipeline stores large files or complex data to variables in the script, keeps that variable in scope for future use, and then runs steps. This sounds oddly specific but happens more than you’d expect.”

:eyes:

This is exactly what I am doing… brb let me try this fix.
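
(For context, that pattern in my job looks roughly like this. This is a simplified sketch: the API URL is a placeholder, and readJSON is from the Pipeline Utility Steps plugin, which may not match what the real job uses.)

node() {
  // The API response is parsed into a variable that stays in scope while the
  // later sh steps run, so every durability checkpoint has to persist it.
  def raw = sh(script: 'curl -s https://api.example.com/targets', returnStdout: true)
  def targets = readJSON(text: raw)

  for (int i = 0; i < targets.size(); i++) {
    def name = targets[i].name
    sh("aws --region us-east-2 s3 ls s3://${name}")
  }
}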

Yup, switching it to the performance-optimized mode did the trick.

[root@jenkins builds]# find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -nr
   4756 124
   4756 123
   4756 122
     47 127
     47 126
     47 125
      1 permalinks
      1 legacyIds

Builds 127, 126, and 125 are running with the lower durability setting, and there is a huge decrease in the number of files created.
