Proposal Discussion: Local-First Workflow Assistant

Hi everyone,

I am researching the “AI Chatbot to Guide User Workflow” project for GSoC 2026. My vision is to build a Local-First Agent that doesn’t just answer documentation questions but actively assists users by diagnosing build failures and suggesting configuration fixes.

To ensure this is viable without cloud APIs (OpenAI/Claude), I have drafted an architecture based on a Java Plugin + Python Sidecar pattern running a quantized local model (e.g., Llama-3-8B or Phi-3).

Before finalizing my proposal, I would love feedback from the mentors on a few architectural assumptions:

1. Deployment & Architecture

  • Question: To run the Python-based AI stack (llama-cpp-python), would you prefer a Managed Process approach (where the plugin manages a local venv and subprocess) or a Docker Sidecar approach?

    • My Preference: Managed Process, as it allows the plugin to work on instances where Docker might not be available or the socket is restricted.
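To make the Managed Process idea a bit more concrete, here is a rough Python sketch of the argv the plugin-side bootstrap could assemble to start the sidecar from its managed virtualenv. The directory layout, script name, and `--port` flag are all placeholders I made up for illustration, not a final design:

```python
import os
from pathlib import Path

def sidecar_command(venv_dir: Path, server_script: Path, port: int) -> list[str]:
    """Assemble the argv the Java plugin would hand to ProcessBuilder to
    launch the Python sidecar from its managed virtualenv (hypothetical layout)."""
    # Windows virtualenvs put the interpreter under Scripts/, POSIX under bin/.
    bin_dir = "Scripts" if os.name == "nt" else "bin"
    python = venv_dir / bin_dir / "python"
    return [str(python), str(server_script), "--port", str(port)]
```

The plugin would create the venv once with `python -m venv` and pin `llama-cpp-python` inside it, so no Docker daemon or socket access is ever required.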

2. Hardware Constraints

  • Question: What is the “Minimum Viable Hardware” I should target for a standard Jenkins Controller?

    • Assumption: I am designing for instances with at least 4GB of spare RAM available for the AI. If the detected RAM is lower, I plan to disable the feature or fall back to a tiny model (e.g., Qwen-1.5B).

3. Agent Autonomy Level

  • Question: For “Workflow Guidance,” do we want the agent to be purely advisory (Read-Only), or can it propose actions (e.g., “Trigger build with clean parameters”)?

    • Proposal: “Human-in-the-Loop” execution. The agent proposes a tool call, but the UI requires explicit user confirmation to execute it.
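A minimal sketch of the confirmation gate I have in mind. The tool name and registry shape are purely illustrative; the real implementation would live behind the plugin's UI:

```python
from dataclasses import dataclass, field

@dataclass
class ProposedAction:
    """A tool call the agent wants to make; it stays inert until approved."""
    tool: str
    args: dict = field(default_factory=dict)
    approved: bool = False

def execute(action: ProposedAction, registry: dict) -> object:
    """Run the tool only after explicit user confirmation from the UI."""
    if not action.approved:
        raise PermissionError(f"user has not confirmed tool call: {action.tool}")
    return registry[action.tool](**action.args)
```

The agent only ever constructs `ProposedAction` objects; the UI flips `approved` when (and only when) the user clicks Confirm, so the model itself never holds execution authority.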

4. Binary Distribution

  • Question: Since LLM models are large (GBs), I plan to make the plugin a “Loader” that downloads the GGUF model from HuggingFace on the first launch, rather than bundling it in the .hpi file. Is this acceptable?

Any guidance would be greatly appreciated!

Hi Meet @meetgoti07, thank you for your interest in the project! I think the main focus should first be a prototype of the plugin capable of answering user questions about the various possible workflows, before we add a local-first agent environment to enhance the chatbot. We could also let users provide their own API keys for cloud-based LLMs like Gemini and Claude, but a local LLM is a must for the default mode of operation.

For the rest of your questions: once the list of GSoC 2026 mentoring orgs has been confirmed by Google, we will set up a Google Form for you to interact with the mentoring team and get feedback, so it may be a bit early for that now.

Thank you @krisstern for the clarification.

Offering a “Bring Your Own Key” feature is a great idea.

Hey Meet, really cool idea with the local-first approach.

I was looking at points 2 and 4 and had a quick question. Since Jenkins controllers notoriously eat up whatever Java heap space they can get, running LLM inference directly on the controller could risk OOM crashes if there isn’t actually 4GB to spare (I recently raised a PR in the repo about developer experience and resource overhead, which is what got me thinking about this kind of performance cost). Also, for the model download on first launch: how would that handle enterprise Jenkins instances that are air-gapped and block outbound internet access to hosts like HuggingFace?

Curious to hear your thoughts and ideas on handling those environments!