The Jenkins service runs version 2.426.1 in production and is used by multiple teams.
The service intermittently throws the error "java.lang.OutOfMemoryError: unable to create native thread" and stops working as expected. After a restart it runs normally for a few days (typically 4 to 5) before hitting the same error again.
Initial investigation shows that the service fails when it reaches its maximum allowed number of tasks (Java threads), which is 4915.
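To watch how close the process gets to that limit between restarts, the thread count can be sampled periodically. The following is a minimal sketch, assuming a Linux host where `/proc/<pid>/status` is readable; the function names are hypothetical, and the limit of 4915 is simply taken from the observation above.

```python
from pathlib import Path

THREAD_LIMIT = 4915  # limit reported in this post


def thread_count_from_status(status_text: str) -> int:
    """Return the value of the 'Threads:' field from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no 'Threads:' field found")


def headroom(pid: int, limit: int = THREAD_LIMIT) -> int:
    """Return how many more threads the process can create before reaching limit.

    Assumes Linux: reads /proc/<pid>/status of the Jenkins JVM.
    """
    text = Path(f"/proc/{pid}/status").read_text()
    return limit - thread_count_from_status(text)
```

Logging `headroom(pid)` from a scheduled task every few minutes should show whether the count climbs steadily or jumps when particular jobs run.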
A thread dump was collected and analyzed; the most frequent stack frames, with their occurrence counts, are listed below.
Before restart:
Count | Stack frame |
---|---|
4849 | java.base@11.0.16/java.lang.Thread.run(Thread.java:829) |
4595 | java.base@11.0.16/java.lang.Thread.sleep(Native Method) |
4583 | org.tmatesoft.svn.core.internal.io.svn.StreamLogger$$Lambda$896/0x00007f000b4e5960.run(Unknown Source) |
4583 | org.tmatesoft.svn.core.internal.io.svn.StreamLogger.lambda$new$0(StreamLogger.java:58) |
229 | java.base@11.0.16/jdk.internal.misc.Unsafe.park(Native Method) |
160 | java.base@11.0.16/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234) |
After restart:
Count | Stack frame |
---|---|
143 | java.base@11.0.16/java.lang.Thread.run(Thread.java:829) |
99 | java.base@11.0.16/jdk.internal.misc.Unsafe.park(Native Method) |
76 | java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) |
76 | java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1114) |
52 | java.base@11.0.16/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:234) |
47 | java.base@11.0.16/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1054) |
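Counts like those in the tables above can be reproduced from a plain-text thread dump with a few lines of Python. This is a minimal sketch, assuming the dump uses the usual `jstack` layout in which every stack frame is on its own line starting with `at `; the function names are hypothetical and this is not the tool that produced the tables.

```python
from collections import Counter


def count_frames(dump_text: str) -> Counter:
    """Count how often each stack frame line occurs in a thread dump."""
    counts = Counter()
    for line in dump_text.splitlines():
        line = line.strip()
        # Frame lines look like: "at java.lang.Thread.run(Thread.java:829)"
        if line.startswith("at "):
            counts[line[3:]] += 1
    return counts


def top_frames(dump_text: str, n: int = 10):
    """Return the n most frequent frames as (frame, count) pairs."""
    return count_frames(dump_text).most_common(n)
```

Running `top_frames` on dumps taken a day apart makes it easy to see which frames grow over time.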
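The sleeping threads account for almost all threads before the restart: 4583 of them are in StreamLogger.lambda$new$0 from SVNKit (org.tmatesoft.svn) and sit in Thread.sleep. After the restart these threads are gone.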
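The comparison between the two tables can also be done mechanically. The sketch below uses the counts from this post (frame names shortened for readability) and lists the frames whose count dropped most after the restart. Note the assumption: frames missing from the "after" table are treated as 0, although they may only have fallen out of the top entries. The function name and threshold are hypothetical.

```python
def frames_dropped(before: dict, after: dict, min_delta: int = 1000):
    """Return (frame, before, after) rows whose count fell by at least min_delta."""
    rows = []
    for frame, n_before in before.items():
        n_after = after.get(frame, 0)  # assumption: absent means 0
        if n_before - n_after >= min_delta:
            rows.append((frame, n_before, n_after))
    # Largest drop first
    rows.sort(key=lambda r: r[1] - r[2], reverse=True)
    return rows


# Counts copied from the tables in this post
before = {
    "Thread.run": 4849,
    "Thread.sleep": 4595,
    "StreamLogger.lambda$new$0": 4583,
    "Unsafe.park": 229,
    "LockSupport.parkNanos": 160,
}
after = {
    "Thread.run": 143,
    "Unsafe.park": 99,
    "LockSupport.parkNanos": 52,
}

for frame, n_before, n_after in frames_dropped(before, after):
    print(f"{frame}: {n_before} -> {n_after}")
```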
Help is needed to track down this thread leak and to determine which build or job is causing it. Thank you in advance.