[GSoC 2026] Continue AI-Powered Chatbot - Contributor Introduction + Questions

Hi Jenkins community,

I’m Guna Palanivel, applying for GSoC 2026: Continue AI-Powered Chatbot for Quick Access to Jenkins Resources (to clarify, this is the resources-ai-chatbot-plugin continuation, not the user workflow guidance project).

Background

I’ve been contributing to jenkinsci/resources-ai-chatbot-plugin since January 2026:

  • 5 merged PRs: Crawler fix (#62), WebSocket streaming (#68), file upload (#61), auth cleanup (#158), TBD
  • 4 PRs in review: Jenkins auth (#105), streaming UI (#91), pipeline config (#113), E2E tests (#261)
  • 13 issues filed: Memory serialization (#207), dead reformulation loop (#191), config routing (#221), and more

Full history: https://github.com/jenkinsci/resources-ai-chatbot-plugin/pulls?q=is:pr+author:GunaPalanivel

Proposal Focus

My proposal addresses the incomplete GSoC 2025 work in three phases:

Phase 1 (Weeks 1-4): Stabilize Core

  • Fix 9 identified bugs (StackOverflow stub, relevance scoring, dead code)
  • Add E2E test framework (currently 0% → 80%+ coverage)
  • Weekly measurement: test coverage (60% → 85%) and P95 latency tracking (a measurement sketch follows this list)
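
To make the weekly latency measurement concrete, here is a minimal Python sketch of how P95 could be computed from recorded per-request timings; the timings.json file name and its format are hypothetical placeholders for whatever the chatbot API ends up logging.

```python
import json
import statistics

# Hypothetical input: a JSON list of per-request latencies in milliseconds,
# e.g. [812.4, 1033.9, 955.1, ...] collected from the chatbot API.
with open("timings.json") as f:
    latencies_ms = json.load(f)

# statistics.quantiles with n=100 returns the 99 percentile cut points;
# index 94 is the 95th percentile, i.e. the latency 95% of requests stay under.
p95 = statistics.quantiles(latencies_ms, n=100)[94]
print(f"P95 latency: {p95:.1f} ms over {len(latencies_ms)} requests")
```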

Phase 2 (Weeks 5-8): Agentic Mode + Multi-Turn

  • Reflection-based retrieval (fix issue #191: dead query reformulation loop)
  • Sliding window memory (fix issue #207: unbounded growth); both mechanisms are sketched after this list
  • Weekly measurement: Reflection convergence rate, memory efficiency
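
To make the Phase 2 mechanisms concrete, here is a minimal Python sketch of a bounded conversation memory and a reflection loop with a hard iteration cap. All names here (SlidingWindowMemory, reflective_retrieve, and the retrieve/is_relevant/reformulate callables) are hypothetical illustrations for this post, not existing code in the plugin.

```python
from collections import deque


class SlidingWindowMemory:
    """Keep only the last `max_turns` exchanges so conversation state
    cannot grow without bound (the concern behind issue #207)."""

    def __init__(self, max_turns: int = 10):
        self.turns = deque(maxlen=max_turns)  # oldest turns are dropped automatically

    def add(self, user_msg: str, bot_msg: str) -> None:
        self.turns.append((user_msg, bot_msg))

    def as_context(self) -> str:
        return "\n".join(f"User: {u}\nBot: {b}" for u, b in self.turns)


def reflective_retrieve(query, retrieve, is_relevant, reformulate, max_rounds=3):
    """Reflection-based retrieval with a hard iteration bound, so a bad
    reformulation cannot spin forever (the concern behind issue #191)."""
    docs = []
    for round_no in range(1, max_rounds + 1):
        docs = retrieve(query)
        if is_relevant(query, docs):      # judge step: are these chunks good enough?
            return docs, round_no
        query = reformulate(query, docs)  # otherwise rewrite the query and retry
    return docs, max_rounds               # best effort after the cap
```

The round count returned here is what I would aggregate into the weekly reflection convergence rate, and deque(maxlen=...) is the simplest way to keep memory bounded; summarizing evicted turns instead of dropping them is a possible refinement.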

Phase 3 (Weeks 9-12): Evaluation Pipeline

  • Dataset: 100+ Q/A pairs (JSON + CSV) covering Jenkins Core, Plugins, and Errors categories
  • Framework: Ragas (Faithfulness >0.85, Context Recall >0.80, Answer Relevance >0.75); a wiring sketch follows this list
  • Judge LLM: Mistral 7B Instruct Q5_K_M (avoids self-evaluation bias relative to the chatbot’s Q4_K_M model)
  • CI trigger: run-eval label on PRs (not every push)
  • Jenkins auth integration (#78)
  • Dataset versioning: weekly CI check for stale URLs, quarterly refresh
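
To show how the Phase 3 gate could be wired, here is a minimal sketch against the Ragas 0.1-style Python API (evaluate plus the faithfulness, context_recall, and answer_relevancy metrics). The dataset path, record fields, and threshold handling are my own assumptions, and the judge-LLM plumbing (local Mistral Q5_K_M, or the feature-flagged Groq fallback) is omitted because it depends on how the local model gets wrapped.

```python
import json

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, context_recall, faithfulness

# Hypothetical dataset file: a list of records with the fields Ragas expects, e.g.
# [{"question": "...", "answer": "...", "contexts": ["..."], "ground_truth": "..."}, ...]
with open("eval/dataset.json") as f:
    records = json.load(f)

dataset = Dataset.from_list(records)

# By default Ragas scores with an OpenAI judge; pointing it at a local
# Mistral judge requires passing an explicit llm= wrapper (not shown here).
result = evaluate(dataset, metrics=[faithfulness, context_recall, answer_relevancy])

# Aggregate per-metric means and compare them against the proposal's gates.
scores = result.to_pandas()[["faithfulness", "context_recall", "answer_relevancy"]].mean()
thresholds = {"faithfulness": 0.85, "context_recall": 0.80, "answer_relevancy": 0.75}
failed = [name for name, gate in thresholds.items() if scores[name] < gate]
if failed:
    raise SystemExit(f"Evaluation gate failed for: {', '.join(failed)}")
print("All evaluation thresholds met:", scores.round(3).to_dict())
```

In CI, this script would only run on PRs carrying the run-eval label, so the judge-LLM cost is paid only when a reviewer opts in.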

Questions for Mentors

  1. Phase-by-phase measurement: I saw @berviantoleo's feedback about evaluating in each phase. I’ve added weekly metrics (coverage %, latency, reflection quality). Does this meet expectations?

  2. Judge LLM quality: Is a local Mistral Q5_K_M sufficient as the judge, or should I target 13B+ parameter models? The Groq API is feature-flagged as a fallback for labeled PRs.

  3. Dataset validation: I’m treating 100 queries as the minimum. Is mentor validation of a subset (e.g., 20 queries) feasible during community bonding?

  4. Stretch goal: Issue #69 (log analysis agent) involves knowledge graph construction. Should I defer it to post-GSoC, or attempt it in Week 13 if I’m ahead of schedule?

Looking forward to feedback from @krisstern as well.

Draft proposal: submitted via a Google Docs link

Thanks,
Guna

I won’t push any candidates to achieve very high test coverage. I consider 80% nice to have, but not mandatory. It’s not easy to ensure all of the functionality is testable.

I can’t answer the other questions right now. For detailed feedback, I will review the draft proposal once I receive the link you submitted.

Hi @berviantoleo, thanks for the reply! That makes total sense. Testing LLM/RAG pipelines is definitely tricky, so I’ll focus on getting the core features stable and reliable in each phase rather than chasing a perfect coverage number.

No rush at all on the proposal :slight_smile: