Hi everyone,
I’m Kanaga Abishek B, a software developer based in Chennai, India (IST, UTC+5:30), and I’m applying for the “Use OpenTelemetry for Jenkins Jobs on ci.jenkins.io” project for GSoC 2026.
I have already built and shipped an end-to-end OpenTelemetry tracing system from scratch: OTLP ingestion, Cassandra storage with multi-index support (trace ID, service name, tags), a CLI query tool, and Docker/Helm packaging for self-hosting. This means I have implemented both sides of the OTel pipeline: the SDK instrumentation side and the collector/storage side. I am not approaching this project to learn OTel; I am applying existing hands-on knowledge to a production-scale problem.
What I have done so far
Local stack running: Jenkins, the OTel Collector (contrib), Jaeger v2, and Prometheus, all started via Docker Compose. I have traced real multi-stage pipelines and verified span export to Jaeger and metric scraping via the Prometheus exporter endpoint.
Plugin codebase studied: I have reviewed the four open issues blocking production deployment:
- Issue #1170, NPE in `GitCheckoutStepHandler`: the null guard is missing on span context before span creation
- Issue #1174, queue metrics not registered: the `UpDownCounter`/`Histogram` instruments are never registered with the global `MeterProvider` at plugin init
- Issue #1202, TRACEPARENT frozen in `env` context: `EnvironmentContributor.buildEnvironmentFor()` is called once at Phase:Run and never re-injected at stage boundaries. I have reviewed PR #1219, which targets this issue, and verified the root cause against the scripted and declarative pipeline outputs in the bug report
- Issue #1161, ES 10k log limit: the current query has no pagination; the `search_after` API resolves this
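To make the pagination idea behind Issue #1161 concrete, here is a minimal sketch of the `search_after` pattern in Python. It is not the plugin's code and it does not call a real Elasticsearch client: `search_page` is a hypothetical in-memory stand-in for one search request, and the field names `timestamp` and `id` are assumptions chosen for illustration. The part that carries over is the loop in `iter_logs`: sort on a stable key with a tiebreaker, then pass the last hit's sort values as the cursor for the next page instead of using a growing `from` offset.

```python
from typing import Any, Dict, Iterator, List, Optional

Hit = Dict[str, Any]


def search_page(index: List[Hit], size: int, search_after: Optional[list]) -> List[Hit]:
    """Hypothetical stand-in for one search request (not a real client call).

    Sorts by (timestamp, id) and returns up to `size` hits that come
    strictly after the cursor, mimicking search_after semantics.
    """
    ordered = sorted(index, key=lambda h: (h["timestamp"], h["id"]))
    if search_after is not None:
        cursor = tuple(search_after)
        ordered = [h for h in ordered if (h["timestamp"], h["id"]) > cursor]
    # Elasticsearch returns each hit's sort values in a "sort" field;
    # the stand-in attaches the same shape so the loop below looks familiar.
    return [{**h, "sort": [h["timestamp"], h["id"]]} for h in ordered[:size]]


def iter_logs(index: List[Hit], page_size: int = 1000) -> Iterator[Hit]:
    """Yield every log line, one page at a time, using a search_after cursor."""
    cursor: Optional[list] = None
    while True:
        page = search_page(index, page_size, cursor)
        if not page:
            return  # an empty page means the result set is exhausted
        yield from page
        cursor = page[-1]["sort"]  # resume after the last hit of this page


if __name__ == "__main__":
    logs = [{"id": i, "timestamp": i // 3, "message": f"line {i}"} for i in range(25)]
    print(sum(1 for _ in iter_logs(logs, page_size=10)))
```

The `id` tiebreaker matters because several log lines can share a timestamp; without a unique final sort key, a cursor could skip or repeat lines at a page boundary.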
My proposal
I have written a full proposal following the official Jenkins GSoC template covering:
- 4-phase breakdown (bug fixes → enhancements → canary deployment → dashboards)
- Proposal/Fallback/Validation structure for each of the 4 bug fixes
- Tail-based sampling strategy for Jenkins scale (100% failed traces, 20% successful)
- Metric cardinality controls via OTel Collector relabeling
- Canary rollout plan with explicit rollback strategy
- Success metrics table with midterm and final checkpoints
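As a rough illustration of the sampling policy in the list above (keep every failed trace, keep about 20% of successful ones), here is a small Python sketch of the decision itself. In a real deployment this decision would be made by the OTel Collector's tail sampling, not by custom code; the function name `keep_trace` and the hash-based bucketing are my own assumptions for illustration only.

```python
import hashlib


def keep_trace(trace_id: str, has_error: bool, success_rate: float = 0.20) -> bool:
    """Decide whether to keep a completed trace (illustrative sketch only).

    Failed traces are always kept. Successful traces are kept when a hash
    of the trace ID falls in the lowest `success_rate` fraction of the
    hash space, so the same trace ID always gets the same decision.
    """
    if has_error:
        return True
    digest = hashlib.sha256(trace_id.encode("ascii")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64  # uniform in [0, 1)
    return bucket < success_rate


if __name__ == "__main__":
    ids = [format(i, "032x") for i in range(10_000)]
    kept = sum(keep_trace(t, has_error=False) for t in ids)
    print(f"kept {kept} of {len(ids)} successful traces")
```

Hashing the trace ID rather than drawing a random number keeps the decision deterministic, which helps when more than one collector instance may see spans from the same trace.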
Questions for the mentors
- I notice the plugin already has `withSpanAttribute` and `setSpanAttributes` pipeline steps. My Phase 2 proposes a `withOpenTelemetry` step targeting resource-level attributes (`service.name`, `service.instance.id`) rather than span-level attributes. Is this the right level of abstraction for a multi-team use case, or is the infra team’s need better served by a different approach?
- Is the primary deployment target Grafana Tempo for traces, or is there an existing backend already configured/preferred by the Jenkins infra team?
I am actively working on a PR for Issue #1170 and will have it submitted before the application deadline.
Thank you — looking forward to working with this community.
Kanaga Abishek B
GitHub: kanagaabishek (Kanaga abishek.B)