Jenkins controller <---> agent communication with Docker Swarm

js3344 · November 22, 2024, 4:41pm

Jenkins setup:

Jenkins 2.440.2-lts w/ JDK17
Agents using JDK11
Docker Swarm cluster w/ 3 masters and 6 workers
Everything runs in AWS on the same VPC and connectivity has been successfully tested

Please help! My last post was hidden for some reason. We have a serious production issue: our controller node(s) are unable to communicate with our worker nodes since we had to restore the system after a botched upgrade of the controller to 2.479.1. We had to create 3 new controller nodes and were able to get the controller up and running, but whenever a job is scheduled, it reports that the agents are offline. This is the error:

[4:36:23 PM] Creating Service with Name : agt-_portal_PR_69_2-1246
java.net.SocketException: Broken pipe
at java.base/sun.nio.ch.NioSocketImpl.implWrite(Unknown Source)
at java.base/sun.nio.ch.NioSocketImpl.write(Unknown Source)
at java.base/sun.nio.ch.NioSocketImpl$2.write(Unknown Source)
at java.base/java.net.Socket$SocketOutputStream.write(Unknown Source)
at java.base/sun.security.ssl.SSLSocketOutputRecord.encodeChangeCipherSpec(Unknown Source)
at java.base/sun.security.ssl.OutputRecord.changeWriteCiphers(Unknown Source)
at java.base/sun.security.ssl.ChangeCipherSpec$T10ChangeCipherSpecProducer.produce(Unknown Source)
at java.base/sun.security.ssl.Finished$T12FinishedProducer.onProduceFinished(Unknown Source)

poddingue · November 26, 2024, 11:05am

The java.net.SocketException: Broken pipe error means the other side closed the connection while Jenkins was still writing to it, and your stack trace shows this happening during the TLS handshake. Here are some steps to narrow down and possibly fix the issue:

Check Network Connectivity: Let’s start simple. Ensure the network connections between the Jenkins controller and agents are solid. Use ping for basic reachability and telnet to confirm that a specific TCP port accepts connections.
Firewall and Security Groups: Double-check that your firewall settings and, since you’re on AWS, your security groups allow traffic on the necessary ports in both directions. For instance, inbound (JNLP) agents typically connect to the controller on port 50000.
Agent Configuration: Make sure that all agents are set up correctly to link up with the controller nodes. If there’ve been any changes in your setup, you might need to update these configurations.
Search in Jenkins Logs: If you’re still stuck, the Jenkins controller and agent logs are good places to dig deeper. They can sometimes tell you more about what’s causing the communication mishaps.
Docker Swarm Configuration: If you’re using Docker Swarm, ensure all your configurations are spot on, and that services are running as expected. It’s crucial that Swarm nodes can talk to each other without any hitches.
SSL/TLS Configuration: Since the stack trace goes through the SSL/TLS classes, check that your SSL/TLS certificates are properly set up on both ends of the failing connection. A misconfigured certificate or a broken chain can also cause these errors.
Restart Jenkins Services: When all else fails, a restart can sometimes do the trick. Try restarting your Jenkins controller and agents to see if that clears up any transient issues.
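
If you’d rather script the connectivity and TLS checks above than run ping and telnet by hand, here is a minimal Python sketch. It is only an illustration, not something taken from your setup: the function names check_tcp and check_tls are made up for this example, and you would have to supply your own hosts and ports.

```python
import socket
import ssl


def check_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def check_tls(host: str, port: int, timeout: float = 3.0, verify: bool = True) -> str:
    """Attempt a TLS handshake and describe the outcome as a short string.

    Returns "ok: <protocol>" on success or "failed: <error>" otherwise.
    Set verify=False only for diagnosis, e.g. to tell a certificate
    problem apart from a connection that is dropped mid-handshake.
    """
    context = ssl.create_default_context()
    if not verify:
        # Skip certificate validation (diagnostic use only).
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                return f"ok: {tls.version()}"
    except OSError as exc:
        # ssl.SSLError is a subclass of OSError, so handshake errors land here too.
        return f"failed: {exc.__class__.__name__}: {exc}"
```

As a usage example, run it from a controller node and from a Swarm node with placeholder values such as check_tcp("jenkins.example.internal", 50000) for the agent port, or check_tls("swarm-manager.example.internal", 2376) if the Docker API is exposed over TLS on the conventional port 2376. Those hostnames and ports are assumptions, so substitute whatever your cloud configuration actually points at. If check_tcp succeeds but check_tls fails, the problem is more likely in the certificates or TLS settings than in the security groups.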

Hope these steps help you get back on track!
