Proposal Discussion: Local-First Workflow Assistant

meetgoti07 · February 17, 2026, 11:14am

Hi everyone,

I am researching the “AI Chatbot to Guide User Workflow” project for GSoC 2026. My vision is to build a Local-First Agent that doesn’t just answer documentation questions but actively assists users by diagnosing build failures and suggesting configuration fixes.

To ensure this is viable without cloud APIs (OpenAI/Claude), I have drafted an architecture based on a Java Plugin + Python Sidecar pattern running a quantized local model (e.g., Llama-3-8B or Phi-3).

Before finalizing my proposal, I would love feedback from the mentors on a few architectural assumptions:

1. Deployment & Architecture

Question: To run the Python-based AI stack (llama-cpp-python), would you prefer a Managed Process approach (where the plugin manages a local venv and subprocess) or a Docker Sidecar approach?
- My Preference: Managed Process, as it allows the plugin to work on instances where Docker might not be available or the socket is restricted.

2. Hardware Constraints

Question: What is the “Minimum Viable Hardware” I should target for a standard Jenkins Controller?
- Assumption: I am designing for instances with at least 4GB of spare RAM available for the AI. If the detected RAM is lower, I plan to disable the feature or fallback to a tiny model (e.g., Qwen-1.5B).

3. Agent Autonomy Level

Question: For “Workflow Guidance,” do we want the agent to be purely advisory (Read-Only), or can it propose actions (e.g., “Trigger build with clean parameters”)?
- Proposal: “Human-in-the-Loop” execution. The agent proposes a tool call, but the UI requires explicit user confirmation to execute it.

4. Binary Distribution

Question: Since LLM models are large (GBs), I plan to make the plugin a “Loader” that downloads the GGUF model from HuggingFace on the first launch, rather than bundling it in the .hpi file. Is this acceptable?

Any guidance would be greatly appreciated!

krisstern · February 19, 2026, 11:37am

Hi Meet @meetgoti07, thank you for your interest in the project! I think the main focus would be to have a prototype of the plugin to have the capability to answer user questions regarding the various possible workflows first, before we can add a local-first agent environment to the plugin to enhance the chatbot. I think we could have a way for the user to provide their own API key for cloud-based LLM’s like Gemini and Claude, but a local LLM is a must for the default mode of operation.

For the rest of your questions, once the list of GSoC 2026 mentoring orgs has been confirmed by Google we will set up a Google form for you to interact with the mentoring team and get feedback, so now maybe a bit early for that.

meetgoti07 · February 19, 2026, 4:22pm

Thank you @krisstern for the clarification.

Giving feature of Bring Your Own Keys is a great idea.

sharma-sugurthi · March 7, 2026, 7:50am

hey Meet, really cool idea with the local-first approach.

I was looking at points 2 and 4 and had a quick question. since Jenkins controllers notoriously eat up whatever Java heap space they can get, running the LLM inference directly on the controller might risk OOM crashes if there isn’t actually 4GB to spare(I actually recently raised a PR in the repo regarding developer experience and resource overhead, so this kind of performance also i thought of). also, for the model download on first launch-how would this handle enterprise Jenkins instances that are air-gapped and block outbound internet access to places like HuggingFace?

curious to hear your thoughts and ideas on handling those environments!!!

meetgoti07 · March 28, 2026, 5:12am

Hey,

These are valid concerns, and I’ve already shifted to a sidecar architecture to handle this.

LLM inference runs in a separate Python process, not inside the Jenkins JVM. The JVM just proxies requests over localhost, and the sidecar has a memory cap (~1GB), so it won’t impact controller heap. Even if it crashes, Jenkins stays unaffected.

Also, default mode is BYO-LLM, so inference happens on external providers. Local models are optional, not core.

For air-gapped setups, admins can point to internal LLM endpoints, and models can be pre-provisioned locally via config instead of downloading at runtime.

Would love to check out your PR as well.

Topic		Replies	Views
GSoC 2026 Proposal Discussion: AI Chatbot to guide user workflow GSoC	8	312	March 31, 2026
GSoC'26 Discussion : AI Chatbot to guide user workflow GSoC gsoc	14	308	March 10, 2026
[GSOC 2026] Bringing Context-Aware, Sovereign AI to Jenkins Workflows GSoC docker , pipeline	0	30	March 26, 2026
[GSoC 2026] Meet Goti \| Proposal Discussion \| AI Chatbot to Guide User Workflow GSoC gsoc	0	84	March 28, 2026
[GSoC 2026 INTRODUCTION] Kammari Ashritha - AI Chatbot to Guide User Workflow GSoC gsoc	2	50	March 23, 2026

Proposal Discussion: Local-First Workflow Assistant

Related topics