Git plugin checkout behaviour

I’m using the following checkout config in my Jenkins pipeline.

checkout(
    [$class: 'GitSCM',
     branches: [[name: "*/${GIT_BRANCH}"]],
     doGenerateSubmoduleConfigurations: true,
     extensions: scm.extensions + [[$class: 'SubmoduleOption', parentCredentials: true]],
     userRemoteConfigs: [[credentialsId: 'creds',
                          url: 'git@github.com:repo.git']]]
)

It then clones the entire repository for a specific branch. If I run the job from the same branch again, it just pulls the changes, which is fine. However, if I run a different branch, it clones the entire repo again, so my workspace looks something like this:

636M    <job>_main
4.0K    <job>_main@tmp
636M    <job>_branch1
4.0K    <job>_branch1@tmp
636M    <job>_branch2
4.0K    <job>_branch2@tmp

This isn’t sustainable because the workspace just keeps growing. How can I change this so that new branches just do a pull on an existing clone (e.g. main) and save the time and disk space needed to clone the repo every time?

That looks a lot like a multibranch Pipeline. Multibranch Pipelines automatically create jobs when new branches containing a Jenkinsfile are detected, and automatically delete those jobs when the branches are deleted. While the branches exist, each has its own folder that stores the job and its build results.

Reusing the same folder for two different branches will confuse users and confuse Jenkins. The history displayed will be incorrect. There will be other surprises. You generally want to let multibranch Pipelines handle the workspaces (directories) for you and accept that each branch gets its own.

If you’d like to save disk space and reduce clone time, you could use the hints from the “Git in the Large” talk.

That talk guides you to use the following techniques to reduce disk use and improve clone performance:

  • Narrow refspecs to clone only the branches that you need
  • Reference repository to reduce the number of copies of history on the disc
  • Shallow clone to reduce the amount of history copied

Those techniques can be applied to any Jenkins job that uses the git plugin.
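
As a rough sketch, the three techniques could map onto the checkout step from the question along these lines. The option names come from the git plugin’s CloneOption extension; the reference path /var/cache/git/repo.git is a hypothetical location that would have to exist on the agent already, and the refspec assumes the remote is named origin.

```groovy
checkout(
    [$class: 'GitSCM',
     branches: [[name: "*/${GIT_BRANCH}"]],
     extensions: scm.extensions + [
         [$class: 'CloneOption',
          honorRefspec: true,                    // use the narrow refspec for the initial clone too
          shallow: true,                         // shallow clone ...
          depth: 1,                              // ... limited to the most recent commit
          noTags: true,                          // skip tags to reduce the data transferred
          reference: '/var/cache/git/repo.git'], // hypothetical reference repository on the agent
         [$class: 'SubmoduleOption', parentCredentials: true]
     ],
     userRemoteConfigs: [[
         credentialsId: 'creds',
         name: 'origin',
         // Narrow refspec: fetch only the branch being built
         refspec: "+refs/heads/${GIT_BRANCH}:refs/remotes/origin/${GIT_BRANCH}",
         url: 'git@github.com:repo.git'
     ]]]
)
```

Each option can be used independently of the others, so it may be easiest to enable them one at a time and compare workspace size and clone time after each change.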

Thanks for the tips!

Would any of those features interfere with the deletion of deleted branches’ workspaces?

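No, they would not. When git uses a reference repository, it creates a pointer from the new workspace to the reference repository (an entry in the workspace’s .git/objects/info/alternates file). Deleting a workspace removes only that pointer and leaves the reference repository untouched.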


Interestingly, with a shallow clone (depth=1) and the refspec limited to one branch, the size of the local repository hasn’t changed at all. I also tried the reference path, but got a “Cannot update the submodule” error every time I added Advanced Submodule Options to the config in the console. Is it possible to add it in the pipeline? I couldn’t find it in the docs.

Update:
The size didn’t change because most of it is in the submodule. I’m trying to apply a shallow clone to the submodule, but I’m getting the above error.

Update 2:
Got it working in the end. Thanks again for the help!
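
The final configuration is not recorded here. As a minimal sketch of how the Advanced Submodule Options might be expressed in Pipeline, the git plugin’s SubmoduleOption extension accepts shallow, depth and reference settings. The reference path below is a hypothetical location on the agent, and the values are illustrative rather than the ones actually used.

```groovy
checkout(
    [$class: 'GitSCM',
     branches: [[name: "*/${GIT_BRANCH}"]],
     extensions: scm.extensions + [
         [$class: 'SubmoduleOption',
          parentCredentials: true,  // reuse the parent repository's credentials
          shallow: true,            // shallow clone of each submodule ...
          depth: 1,                 // ... limited to the most recent commit
          // Hypothetical reference repository for the submodule on the agent
          reference: '/var/cache/git/submodule.git']
     ],
     userRemoteConfigs: [[credentialsId: 'creds',
                          url: 'git@github.com:repo.git']]]
)
```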

One last question. When I provide a reference path to a local repository then all the history and branches will stay there. Therefore, I don’t need extra options like shallow clone or custom refspec as they won’t make any difference. Am I correct to assume that?

The reference repository includes the history of the repository up to the point in time when it was last updated. Most reference repositories are not updated as frequently as the workspace repository. As more and more commits, tags, and branches arrive in the workspace that are not in the reference repository, those commits, tags, and branches will be transferred from the remote git repository to the local workspace. Shallow clone and custom refspec can reduce that data transfer. Updating the reference repository with the most recent content can also reduce that data transfer.
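
One way to keep the reference repository current is a small scheduled job that fetches into it. The following is only a sketch under several assumptions: the reference repository lives at the hypothetical path /var/cache/git/repo.git, the agent label linux is made up, and the agent user already has credentials that let a plain git fetch succeed.

```groovy
pipeline {
    agent { label 'linux' }          // hypothetical agent label
    triggers { cron('H 2 * * *') }   // once a night; adjust to how quickly the remote changes
    stages {
        stage('Refresh reference repository') {
            steps {
                // Fetch new commits and branches into the reference repository,
                // pruning remote-tracking refs that were deleted on the remote
                sh 'git -C /var/cache/git/repo.git fetch --all --prune'
            }
        }
    }
}
```

If agents each keep their own copy of the reference repository, a job like this would need to run on every agent that hosts one.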
