Jenkins issue: FATAL: command execution failed

Hi @all,
Hope you are doing great!
I’m getting the error below while deploying a microservice using Jenkins.
If anyone has experience with this error, could you please share your insights or point me in the right direction?

/bin/sh -xe /tmp/jenkins46756789098765.sh
FATAL: command execution failed
java.io.IOException: error=0, Failed to exec spawn helper: pid: 4907, signal: 11
	at java.lang.ProcessImpl.forkAndExec(Native Method)
	at java.lang.ProcessImpl.<init>(ProcessImpl.java:314)
	at java.lang.ProcessImpl.start(ProcessImpl.java:244)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1110)
Also:   hudson.remoting.Channel$CallSiteStackTrace: Remote call to EC2 
		at hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1800)
		at hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:356)
		at hudson.remoting.Channel.call(Channel.java:1001)
		at hudson.Launcher$RemoteLauncher.launch(Launcher.java:1121)
		at hudson.Launcher$ProcStarter.start(Launcher.java:508)
		at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:144)
		at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
		at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:20)
		at hudson.model.AbstractBuild$AbstractBuildExecution.perform(AbstractBuild.java:803)
		at hudson.model.Build$BuildExecution.build(Build.java:197)
		at hudson.model.Build$BuildExecution.doRun(Build.java:163)
		at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:513)
		at hudson.model.Run.execute(Run.java:1906)
		at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
		at hudson.model.ResourceController.execute(ResourceController.java:97)
		at hudson.model.Executor.run(Executor.java:429)
Caused: java.io.IOException: Cannot run program "/bin/sh" (in directory "/home/ec2-user/workspace/employee-ms"): error=0, Failed to exec spawn helper: pid: 4907, signal: 11
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1143)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1073)
	at hudson.Proc$LocalProc.<init>(Proc.java:252)
	at hudson.Proc$LocalProc.<init>(Proc.java:221)
	at hudson.Launcher$LocalLauncher.launch(Launcher.java:996)
	at hudson.Launcher$ProcStarter.start(Launcher.java:508)
	at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:1390)
	at hudson.Launcher$RemoteLaunchCallable.call(Launcher.java:1333)
	at hudson.remoting.UserRequest.perform(UserRequest.java:211)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:376)
	at hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78)
	at java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.lang.Thread.run(Thread.java:833)
Build step 'Execute shell' marked build as failure
Finished: FAILURE

Signal 11 means segmentation fault: https://www.cyberciti.biz/tips/segmentation-fault-on-linux-unix.html
It is hard to tell what the problem is, but getting a segmentation fault in /bin/sh is strange.
Is it reproducible?
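
For reference, the signal number in the error message can be mapped to its name from a shell. This is a minimal sketch; the helper name signal_name is made up for illustration and is not part of Jenkins or the JDK.

```shell
#!/bin/sh
# signal_name is a hypothetical helper: it maps a signal number to its
# name using the shell's built-in `kill -l`.
signal_name() {
    kill -l "$1"
}

# Example: `signal_name 11` prints SEGV on Linux.
```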

Yes, it is reproducible.

A very similar issue has occurred on our Linux agents since some unattended Linux updates today (in this example, when trying to execute println "whoami".execute().text in the Script Console of the node):

java.io.IOException: error=0, Failed to exec spawn helper: pid: 1357112, signal: 11
	at java.base/java.lang.ProcessImpl.forkAndExec(Native Method)
	at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:314)
	at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:244)
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1110)
Caused: java.io.IOException: Cannot run program "whoami": error=0, Failed to exec spawn helper: pid: 1357112, signal: 11
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1143)
	at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1073)
	at java.base/java.lang.Runtime.exec(Runtime.java:594)
	at java.base/java.lang.Runtime.exec(Runtime.java:418)
	at java.base/java.lang.Runtime.exec(Runtime.java:315)
	at org.codehaus.groovy.runtime.ProcessGroovyMethods.execute(ProcessGroovyMethods.java:544)
	at org.codehaus.groovy.runtime.dgm$895.invoke(Unknown Source)
	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
	at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:120)
	at Script1.run(Script1.groovy:1)
	at groovy.lang.GroovyShell.evaluate(GroovyShell.java:574)
	at groovy.lang.GroovyShell.evaluate(GroovyShell.java:612)
	at groovy.lang.GroovyShell.evaluate(GroovyShell.java:583)
	at hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:149)
	at hudson.util.RemotingDiagnostics$Script.call(RemotingDiagnostics.java:118)
	at hudson.remoting.UserRequest.perform(UserRequest.java:211)
	at hudson.remoting.UserRequest.perform(UserRequest.java:54)
	at hudson.remoting.Request$2.run(Request.java:377)
	at hudson.remoting.InterceptingExecutorService.lambda$wrap$0(InterceptingExecutorService.java:78)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:840)

I was able to fix it with this option: -Djdk.lang.Process.launchMechanism=vfork
(as mentioned here)

Where can we add it in Jenkins?
I need this config and saw the post. Is there an env section or something similar?

Hi @Dregos13, it depends on your setup.
My Jenkins instance is running as a Linux service,
so I used sudo systemctl --full edit jenkins.service to get to the config file and added it to the JAVA_OPTS:

Environment="JAVA_OPTS=-Djava.awt.headless=true  \
        -Djdk.lang.Process.launchMechanism=vfork \
        -Djenkins.model.Jenkins.buildsDir=/mydata/JenkinsJobsBuilds/\\${ITEM_FULL_NAME}/builds  \
        -Djenkins.model.Jenkins.workspacesDir=/mydata/JenkinsJobs/workspace/\\${ITEM_FULL_NAME} "

Seeing the same thing in our Jenkins infrastructure.
Some agents ran unattended updates and picked up the security updates to OpenJDK 17 (17.0.10) that were made public today, and I suspect that is related.
However, I cannot reliably reproduce the spawn failure (yet).

What version of the JDK are your agents running?

I think that I just figured out what is provoking this issue.
If the JDK on the agent is updated via the unattended updates, but the Jenkins agent java process is not restarted, the spawn error occurs.
Disconnecting the agent and re-launching allows the script console snippet to run.
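
One way to spot an agent in this state is to look at /proc/<pid>/exe for the running java process: Linux appends " (deleted)" to the link target once the file the process was started from has been removed or replaced. The sketch below assumes the package upgrade replaced the java binary itself; the helper names exe_link_state and check_java_processes are made up for illustration.

```shell
#!/bin/sh
# exe_link_state is a hypothetical helper. It classifies the target of
# /proc/<pid>/exe: a trailing " (deleted)" means the on-disk binary was
# removed or replaced after the process started.
exe_link_state() {
    case "$1" in
        *" (deleted)") echo stale ;;
        *)             echo current ;;
    esac
}

# check_java_processes walks all running processes named "java".
# It needs pgrep and permission to read their /proc entries (usually root).
check_java_processes() {
    for pid in $(pgrep -x java); do
        target=$(readlink "/proc/$pid/exe") || continue
        echo "$pid $(exe_link_state "$target") $target"
    done
}
```

Running check_java_processes on an agent host lists each java PID with its state; a "stale" entry suggests the agent should be disconnected and relaunched.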

In my case, I run a Jenkins controller and agents on AWS with the EC2 plugin.

On the Jenkins controller, I set JAVA_OPTS=-Djava.awt.headless=true
For the Jenkins agent, I created a new golden image from the agent on which the jobs had failed, and I now use it as the image for launching new agents.
It is working well at the moment.

Thank you, that solved my pain!

Yes, there was (among others) a change in the version of the package openjdk-17-jre-headless:amd64:
from 17.0.9+9-1~22.04 to 17.0.10+7-1~22.04.1
and the issue didn’t occur on the nodes that were still on 17.0.9+9-1~22.04
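
To find out when such a package change happened on a given node, the dpkg log can be searched for OpenJDK upgrades. This is a minimal sketch that assumes the usual /var/log/dpkg.log location and line layout; the helper name jdk_upgrades is made up for illustration.

```shell
#!/bin/sh
# jdk_upgrades is a hypothetical helper. It prints date, time, package,
# old version and new version for every OpenJDK upgrade found in a dpkg log.
# Assumed line layout:
#   <date> <time> upgrade <package>:<arch> <old-version> <new-version>
jdk_upgrades() {
    log=${1:-/var/log/dpkg.log}
    awk '$3 == "upgrade" && $4 ~ /^openjdk-/ { print $1, $2, $4, $5, $6 }' "$log"
}
```

Comparing the printed timestamps with the start time of the agent process shows whether the JDK changed underneath a running agent.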

We had this unwelcome behaviour this morning for no obvious reason - the Jenkins Controller was simply trying to execute git to retrieve a JenkinsFile from the repository.
After much digging around and confirming that the jspawn file exists etc etc… I simply restarted Jenkins and now it’s working fine.
You can do this via the web interface or do:
sudo systemctl stop jenkins
sudo systemctl start jenkins

This problem cropped up for my team yesterday. We use Jenkins on Ubuntu 22.04. Many builds ran fine, and then several failed quickly with this error yesterday morning. We restarted Jenkins and builds ran fine for about 24 hours until they started failing with the same error. We have not manually done updates recently, but I’m curious what folks mean by “Some agents ran unattended updates” - does Ubuntu 22.04 automatically update itself?

Our openjdk-17-jre-headless version is 17.0.10+7-1~22.04.1

We followed instructions here:

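  1. Run sudo systemctl --full edit jenkins.service
  2. Find the line with JAVA_OPTS
  3. Add the option on a new line inside the quoted JAVA_OPTS value (end the preceding line with \ and keep the closing " after the last option): -Djdk.lang.Process.launchMechanism=vfork
  4. Stop the Jenkins service: sudo systemctl stop jenkins
  5. Start the Jenkins service: sudo systemctl start jenkins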
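
To confirm that the option actually reached the running JVM after the restart, its command line can be inspected. This is a minimal sketch, assuming the service is named jenkins and that its main process is the JVM; the helper name has_launch_mechanism is made up for illustration.

```shell
#!/bin/sh
# has_launch_mechanism is a hypothetical helper. It reads a NUL-separated
# argument list (the format of /proc/<pid>/cmdline) on stdin and succeeds
# if the jdk.lang.Process.launchMechanism property is among the arguments.
has_launch_mechanism() {
    tr '\0' '\n' | grep -q -- '^-Djdk\.lang\.Process\.launchMechanism='
}

# Example, assuming the systemd unit is called "jenkins":
#   pid=$(systemctl show -p MainPID --value jenkins)
#   has_launch_mechanism < "/proc/$pid/cmdline" && echo "option is active"
```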

I’m surprised there’s not a new Jenkins package to incorporate this fix.

Modern versions of Ubuntu, by default, will perform unattended updates for packages with security fixes.
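
Whether unattended upgrades are switched on for a particular host can be read from the APT configuration. This is a minimal sketch that only understands numeric values of APT::Periodic::Unattended-Upgrade; the helper name unattended_upgrades_enabled is made up for illustration.

```shell
#!/bin/sh
# unattended_upgrades_enabled is a hypothetical helper. It reads APT
# configuration in `apt-config dump` format on stdin and succeeds if
# APT::Periodic::Unattended-Upgrade is set to a number greater than zero.
# Non-numeric values are treated as disabled in this sketch.
unattended_upgrades_enabled() {
    awk -F'"' '$1 ~ /^APT::Periodic::Unattended-Upgrade[ \t]*$/ { v = $2 }
               END { exit (v + 0 > 0) ? 0 : 1 }'
}

# Example on an Ubuntu host:
#   apt-config dump | unattended_upgrades_enabled && echo "enabled"
```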

We have a fleet of agents that auto-scale. The AMI that they launch had OpenJDK 17.0.9. If the agent was up and in use long enough, the unattended update would update the OpenJDK package. This had the effect of deleting the existing files/directories, but left the agent process running. When that agent process attempts to use jspawnhelper, it is no longer present where it thinks it should be, and boom!

Our solution was to update our AMI to apt-mark hold the OpenJDK package so that the unattended updates should no longer affect it. We have a regular schedule of updating the AMI to pick up security updates and such, so we are not concerned about that aspect.
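
The hold described above could look roughly like the sketch below. The package name is an assumption taken from earlier in this thread and should be adjusted to whatever JDK packages the image installs; the helper name is_held is made up for illustration.

```shell
#!/bin/sh
# Package list is an assumption; adjust it to the JDK packages in the image.
JDK_PACKAGES="openjdk-17-jre-headless"

# is_held is a hypothetical helper. It reads `apt-mark showhold` output
# (one package name per line) on stdin and succeeds if $1 is listed.
is_held() {
    grep -q -x -F -- "$1"
}

# While building the image (as root):
#   apt-mark hold $JDK_PACKAGES
#   apt-mark showhold | is_held openjdk-17-jre-headless && echo "held"
```

A held package stays at its installed version until apt-mark unhold is run, so the image has to be rebuilt regularly to pick up JDK security fixes.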

After updating another machine I can confirm what was mentioned in this post: simply disconnecting and reconnecting the node fixes the issue.

This time the issue occurred although -Djdk.lang.Process.launchMechanism=vfork had already been in place, so it effectively doesn’t fix it. So I removed this setting and everything is still working fine.

For the record, see also:
Bug #2055280 “openjdk-17-jre-headless 17.0.10+7-1~22.04.1: segfa...” : Bugs : openjdk-17 package : Ubuntu

Thanks. Just restarting the Jenkins service on the Linux agent resolved the issue.

How will this work if Jenkins is run as a Docker container? We don’t run jobs on the controller, only on build slaves, and I am getting this error. I asked the question on Stack Overflow.

If you’re running the jobs on a static agent that is having its Java version upgraded while the agent is running, then that is the likely cause of the problem. Changing the Java property seems to be hiding the danger that you’re upgrading the Java installation of a running agent.

If you’re running the jobs on an agent defined with a container, then something seems wrong in the agent. A containerized agent should not have the issue because the version of Java in the containerized agent should not change while the agent is running.