Hello everyone! I’m Daniele Caldarigi, a second-year CS student from Italy. I’m very excited about the AI Chatbot project idea and I’m currently drafting a detailed technical proposal.
I have a strong background in Java, Python (FastAPI), and React, which I believe fits the requirements for this project perfectly. I’m particularly interested in implementing a RAG-based architecture to handle Jenkins documentation.
I have a few questions to better refine my proposal:
Persistence: Should the plugin support multiple persistent chat sessions (like ChatGPT), or is a single ephemeral session preferred for specific workflow assistance?
Tech Stack: Do you have specific preferences or constraints for the AI orchestration layer (e.g., LangChain, LlamaIndex) and the vector database?
Communication: Would you prefer to discuss the proposal details here on the forum or on a shared Google Doc later on?
Regarding contributions, I am already looking into some issues to familiarize myself with the Jenkins ecosystem. Are there any specific areas or plugins related to this project that you would recommend exploring?
Looking forward to your feedback! Best regards, Daniele
Welcome on board then. These days we do discuss AI-related topics in Jenkins, and as you mention, several of those efforts are inactive at the moment. Some of the people are at FOSDEM, but I think they will show up.
For the GSoC 2026 proposal we prefer to use a running Google Doc for discussion. As per your few questions:
I think it would be best if we could support multiple persistent chat sessions.
I prefer LangChain for the AI orchestration layer, no preference on the vector database though.
We could follow up here as well as on Gitter/Matrix.
You should try to experiment with Jenkins as a user first, try to use a few of our more popular plugins. For plugin popularity maybe you could refer to https://plugins.jenkins.io/.
Thank you for your feedback! I’ve been experimenting with the Jenkins ecosystem and exploring the core plugins to find a suitable first issue to tackle.
Regarding the AI Chatbot project, I have a couple of questions to better align my proposal with the organization’s vision:
LLM Strategy: Does the Jenkins org have a preference between Open Source/Self-hosted LLMs (e.g., Llama 3 via Ollama/EC2) versus Cloud-based APIs (e.g., OpenAI/Gemini)? I’m considering a provider-agnostic approach using LlamaIndex, but knowing your preference for the final production environment would be very helpful.
Proposal Sharing: I’m currently drafting the Google Doc. What is the preferred way to share it for initial feedback? Should I post the link here in the public channel for community review, or is there a specific process for draft submissions?
We do prefer open-source / self-hosted LLMs over cloud-based and proprietary solutions.
We currently do not have a dedicated channel for sharing your draft as a Google Doc with us right now, but later on we will have a Google Form you can use to share it with us discreetly once Jenkins has been accepted as a GSoC 2026 mentoring org. We will not know until late February, though.
After getting familiar with Jenkins, I’ve made my first contribution and am looking forward to working on other issues. In the meantime, I’m refining the architecture for the AI Chatbot project proposal. I’m currently evaluating two different approaches for the AI backend and would love to hear the community’s preference:
Local-first (Subprocess/Launcher): Running Python scripts directly on the Jenkins Controller. While it’s more “compact,” it couples the heavy LLM workload with Jenkins’ core processes.
Decoupled (REST API/FastAPI): Communication via HTTP with an external backend. This allows for both self-hosted configurations (e.g., via Docker/Ollama) and cloud-based options (AWS/OpenAI), ensuring Jenkins’ stability and scalability.
Regarding the models, I have one question: is an open-weight model (like Llama or Mistral) acceptable, or does the community strictly prefer a fully open-source model (including training data and code)?
I have some updates regarding the proposal and the plugin’s architectural design. I’ve developed a plugin prototype to test the core integrations and have made progress in the following areas:
Context Awareness: I have identified the key Jenkins APIs and extension points needed to fetch useful context for the assistant (e.g., pipeline details, run logs, Jenkins version, and installed plugins).
Frontend: I successfully integrated a React build that renders a floating action button and a chat sidebar directly within the Jenkins UI.
Backend: I have decided on a decoupled approach using FastAPI. This ensures process isolation for the LLM (protecting Jenkins’ stability) and allows for easier scalability, giving users the flexibility to decide where to host the backend service.
I am currently working on the backend architecture and selecting the specific tools it will rely on. As we discussed, the core tech stack will be completely open-source so the plugin won’t strictly depend on external proprietary parties. However, the backend will be LLM-agnostic, allowing users to plug in either locally hosted open-source models or third-party APIs (like Claude, Gemini, OpenAI, etc.).
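To illustrate the LLM-agnostic idea, here is a minimal sketch of how the backend could abstract over providers. All names here (`LLMProvider`, `OllamaProvider`, `get_provider`) are hypothetical, and the Ollama call is stubbed out; a real implementation would POST to the provider's HTTP API.

```python
# Hypothetical sketch: an LLM-agnostic provider interface for the backend.
# Class and function names are illustrative, not part of any real API.
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """Common interface so the backend never depends on a single vendor."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        ...


class OllamaProvider(LLMProvider):
    """Locally hosted open-source model (stubbed here for brevity)."""

    def __init__(self, model: str = "llama3"):
        self.model = model

    def complete(self, prompt: str) -> str:
        # A real implementation would POST to the local Ollama HTTP API.
        return f"[{self.model}] response to: {prompt}"


def get_provider(name: str) -> LLMProvider:
    """Resolve a provider by name from configuration."""
    providers = {"ollama": OllamaProvider}
    return providers[name]()
```

Adding a cloud provider (OpenAI, Gemini, etc.) would then just mean registering another `LLMProvider` subclass, without touching the rest of the backend.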
Current Architecture Flow:
Plugin & Frontend: The Java plugin extracts the necessary contextual information (current screen, focused pipeline/project, run/build configs, and logs). This data is passed to the React frontend, which manages the chat interface and sends the compiled context alongside the user’s prompt to the backend.
(In the image above, the plugin is reading the context information about a specific failed run.)
Backend: I am currently working on the chat history management to ensure the assistant maintains conversational context. The actual chat history will be stored in a local DB (e.g., SQLite) to be quickly retrieved when the user opens the chat sidebar.
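For the SQLite-backed history, a minimal sketch using Python's standard `sqlite3` module could look like this. The table layout is an assumption on my part; a session-scoped table keyed by `session_id` would support the multiple persistent sessions discussed earlier.

```python
# Minimal sketch of SQLite chat-history storage (schema is an assumption).
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the messages table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               session_id TEXT NOT NULL,
               role       TEXT NOT NULL,
               content    TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )


def save_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )


def load_history(conn: sqlite3.Connection, session_id: str) -> list[dict]:
    """Return the session's messages in insertion order, ready for the LLM."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY rowid",
        (session_id,),
    ).fetchall()
    return [{"role": role, "content": content} for role, content in rows]
```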
I would greatly appreciate any feedback or suggestions you might have regarding this architecture.
I also have a couple of specific questions about the backend implementation I’d love your input on:
Streaming Protocol: For the communication between Jenkins and the FastAPI backend, would you prefer an approach based on SSE (Server-Sent Events), which is standard for chatbot streaming, or do you see any specific advantages in using WebSockets for this use case?
Log Chunking Strategy: Given that Jenkins logs can be extremely voluminous, which strategy do you think is most appropriate to avoid saturating the LLM’s context window? Should I prioritize feeding the LLM just the tail of the log (where errors usually appear), or would it be better to implement a Selective Retrieval system using LlamaIndex to semantically search the logs?
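As a concrete baseline for the tail-only option, here is a sketch of trimming a log to a character budget while keeping whole lines (a character budget is a rough stand-in for a token budget; the function name and limit are my own):

```python
# Sketch of the tail-only log chunking strategy: keep the end of the log
# (where errors usually appear) within a rough character budget.
def tail_log(log: str, max_chars: int = 4000) -> str:
    """Return the tail of a build log, trimmed to whole lines."""
    if len(log) <= max_chars:
        return log
    tail = log[-max_chars:]
    # Drop the possibly truncated first line so the excerpt starts cleanly.
    newline = tail.find("\n")
    return tail[newline + 1:] if newline != -1 else tail
```

Selective retrieval over log chunks would likely outperform this on very long logs where the root cause appears far from the end, so the two strategies could also be combined: always include the tail, and retrieve additional chunks semantically.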