Issue: java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached

Hey Experts,
We are using
Jenkins: 2.426.3
Java: 11.0.23
Apache Tomcat: 9.0.85

We have seen our Jenkins randomly lose JNLP connections due to

07-Aug-2024 07:58:20.453 SEVERE [TCP agent listener port=50000] hudson.TcpSlaveAgentListener.run Failed to accept TCP connections
	java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
		at java.base/java.lang.Thread.start0(Native Method)
		at java.base/java.lang.Thread.start(Thread.java:803)
		at hudson.TcpSlaveAgentListener.run(TcpSlaveAgentListener.java:188)
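
From what we have read, this error usually means the OS refused to create another native thread (process/PID limits or native memory), not that the Java heap is exhausted. Below is a rough sketch of the kernel-side checks we run on the VM when it happens; these are plain OL8 commands and nothing here is specific to our Jenkins install.

# Sketch: kernel-wide limits that cap native thread creation
cat /proc/sys/kernel/threads-max   # total threads the kernel allows
cat /proc/sys/kernel/pid_max       # PIDs/TIDs available system-wide
cat /proc/sys/vm/max_map_count     # each thread stack adds memory mappings
ps -eLf | wc -l                    # rough count of threads running on the whole VM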

Our JVM options are currently set to:

export JAVA_OPTS="-DJENKINS_HOME=$JENKINS_HOME \
-XX:ReservedCodeCacheSize=1024m -XX:+UseCodeCacheFlushing -Xms$MIN_HEAP_SIZE -Xmx$MAX_HEAP_SIZE \
-XX:+UseG1GC -XX:G1ReservePercent=20 \
-Xlog:gc:/usr/local/tomcat/logs/jenkins.gc-%t.log \
-Xlog:gc \
-Xlog:age*=debug \
-Dhudson.ClassicPluginStrategy.useAntClassLoader=true \
-Dkubernetes.websocket.ping.interval=20000 \
-Dhudson.slaves.SlaveComputer.allowUnsupportedRemotingVersions=true \
-Djava.awt.headless=true -Dhudson.slaves.ChannelPinger.pingIntervalSeconds=300"
export JAVA_OPTS="$JAVA_OPTS -Dhudson.model.DirectoryBrowserSupport.CSP=\"default-src 'none' netdna.bootstrapcdn.com; img-src 'self' 'unsafe-inline' data:; style-src 'self' 'unsafe-inline' https://www.google.com ajax.googleapis.com netdna.bootstrapcdn.com; script-src 'self' 'unsafe-inline' 'unsafe-eval' https://www.google.com ajax.googleapis.com netdna.bootstrapcdn.com cdnjs.cloudflare.com; child-src 'self';\" -Dcom.cloudbees.hudson.plugins.folder.computed.ThrottleComputationQueueTaskDispatcher.LIMIT=100"

with Max Heap Size = 200 GB and Min Heap Size = ~12 GB.
We are running on an OL8 VM with 80 OCPUs and 1280 GB of memory.
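
Since the heap is nowhere near the 1280 GB on the VM, we are also starting to look at native memory (thread stacks live outside the heap). A sketch of how we plan to track it, assuming we can add one more flag and restart, and that jcmd from the same JDK is on the path; Native Memory Tracking adds a small runtime overhead, and the pgrep pattern is an assumption about our install.

# Sketch: enable Native Memory Tracking, then inspect the "Thread" section.
export JAVA_OPTS="$JAVA_OPTS -XX:NativeMemoryTracking=summary"

# After a restart, query the running JVM:
PID=$(pgrep -f catalina | head -1)
jcmd "$PID" VM.native_memory summary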

Our Jenkins ulimits are set to:

cat >>/etc/security/limits.conf <<EOF
<user>    soft   nofile    16384
<user>     hard   nofile    65536

<user>     soft   nproc    16384
<user>     hard   nproc    32768
EOF
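
One thing we still want to confirm is whether those limits are actually applied to the Tomcat process: limits.conf is applied through pam_limits for login sessions, so if Tomcat is started as a systemd service the unit's LimitNPROC/LimitNOFILE (and TasksMax) settings are what actually count. A quick sketch of the check; the pgrep pattern and the unit name "tomcat" are assumptions about our install.

# Sketch: what limits did the running process actually get?
PID=$(pgrep -f catalina | head -1)
grep -E 'Max processes|Max open files' /proc/$PID/limits

# If Tomcat is a systemd service, the unit settings override limits.conf:
systemctl show tomcat -p LimitNPROC -p LimitNOFILE -p TasksMax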

Also attaching the thread dump collected:
jstack_2024-08-07_699666.log (1.2 MB)

At runtime we are seeing:

Threads on xxx-jenkins@<IP>: Number = 522, Maximum = 1,317, Total started = 860,454

We have a good number of threads in WAITING or TIMED_WAITING state.
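
For a quick overview of where those threads sit, this is roughly how we summarise the attached jstack file (a plain grep/sort sketch over the file named above):

# Sketch: count threads per state in the collected dump
grep -o 'java.lang.Thread.State: [A-Z_]*' jstack_2024-08-07_699666.log \
  | sort | uniq -c | sort -rn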

Can someone assist in analysing the issue further and suggest steps we can take to avoid it in the future?

Regards

Note: Restarting the services fixes the issue.

We have configured the Apache Tomcat server.xml with the following maxThreads values:

    <Connector port="8080" protocol="HTTP/1.1"
               connectionTimeout="30000"
               redirectPort="8484" />
    <Connector port="8181" protocol="HTTP/1.1"
               maxThreads="300"
               connectionTimeout="30000"
               redirectPort="8484" />
    <Connector port="8484" protocol="org.apache.coyote.http11.Http11NioProtocol"
               maxThreads="450" maxHttpHeaderSize="1048576" SSLEnabled="true" keepAliveTimeout="300000">

Since the issue is seen on port 50000, which is the default Jenkins JNLP (inbound agent) port, is there a way we can control the threads spun up for port 50000?
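
In the meantime, this crude sketch is how we check which pool owns the most threads in the dump, so we can tell whether it is the Tomcat connector pools, the remoting pool, or the TCP agent listener that keeps growing (we expect name prefixes such as "http-nio-8484-exec" or "TCP agent connection handler", but thread names can differ between versions):

# Sketch: group threads in the dump by name prefix.
# Truncating each name at the first digit is crude, but it is enough
# to see which pool dominates.
grep '^"' jstack_2024-08-07_699666.log \
  | sed -e 's/^"\([^"]*\)".*/\1/' -e 's/[0-9].*$//' \
  | sort | uniq -c | sort -rn | head -20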

Hi @hpriya, we’re encountering the same issue, with random ‘Out of Memory’ errors in completely unrelated places. (I’ve never had a problem like this in my 3 years of working with Jenkins.)

I’m not sure, but my guess is that some plugin or Java-related memory leak appeared after the last update.

@lucasdj Plugins, yes. Analysing the heap dump certainly helps with memory-leak issues.

We are trying to upgrade the plugins to the latest versions compatible with our Jenkins core version. Right now, we are on Jenkins 2.462.3 and Java 11.0.26.

We also have some additional JAVA_OPTS we are testing (verification sketch below the list):

-Djava.util.concurrent.locks.ReentrantReadWriteLock.timeout
-Djenkins.remoting.pooledThreadCount
-Djenkins.remoting.maxThreadCount
-XX:MaxPermSize
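
We still need to confirm the JVM actually picks these up; as far as we know -XX:MaxPermSize is ignored on Java 11, since PermGen was removed in Java 8. A sketch of the check with jcmd, where the pgrep pattern is again an assumption about our setup:

# Sketch: confirm which properties and flags the running JVM accepted
PID=$(pgrep -f catalina | head -1)
jcmd "$PID" VM.system_properties | grep -E 'jenkins.remoting|ReentrantReadWriteLock'
jcmd "$PID" VM.flags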

Hopefully this helps!