Attendees
- @dduportal (Damien Duportal)
- @smerle33 (Stéphane Merle)
- @poddingue (Bruno Verachten)
- @kmartens27 (Kevin Martens)
- Akash Mishra
Announcements
- Weekly:
- 2.444 => Failed due to unforeseen consequences of [get.jenkins.io/mirrors/mirrorbit - Azure] High costs due to usage of Azure File Storage · Issue #3917 · jenkins-infra/helpdesk · GitHub (a file storage was removed during migration to premium storage)
- Fixed by re-creating the missing file storage, documenting it and filling it with data from pkg.origin.jenkins.io VM.
- Thanks @lemeurherve and @en3hD3iMRx6_6IXLNY0Rag for handling this release!
- 2.445
- Release is out (WAR)
- Packaging job failed due to a transient network error: restarted, almost finished
- Docker image is out
- Changelog merged \o/
Upcoming Calendar
- Next Weekly: 20 Feb. 2024: 2.446
- Next LTS: 21 Feb.: 2.440.1
- Next Security Release as per jenkinsci-advisories: N.A.
- Next major event:
- SCalex 13/14th March 2024
Notes
Done:
- Add "deprecated" topic to job-fan-in-plugin GitHub repository
- Datadog plugin 6.x failures
- Jenkins plugin datadog 6.0.0 made controllers corrupt build data during a stack overflow error.
- Corruption fixed by 6.0.1, stack overflow error fixed by 6.0.2, applied everywhere we use it
- ci.jenkins.io incorrectly shows build dates in Dec 1969 (the epoch)
- (Random ?) 404 pages on ci.jenkins.io
- Jenkins controllers logs aren’t collected by Datadog
- Incrementals Publisher is down: 400
- Relies on RPU archived artifacts which were corrupted by datadog (see above)
- Fixed as RPU ran again with success on ci.jenkins.io
- Next step: let’s start using reports.jenkins.io instead (issue to write @dduportal)
- Note: thanks @lemeurherve and @timja for the incremental service upgrades!
- Create discourse (community.jenkins.io) service dropdown at helpdesk
- community.jenkins.io page views exceed OSS plan
- Increase jenkinsci organization’s seats
- [get.jenkins.io/mirrors/mirrorbit - Azure] High costs due to usage of Azure File Storage
- Caused https://updates.jenkins.io/current/latest/jenkins.war to return 404 due to the missing file storage (see Announcements section)
- Massive re-downloads of the same files
- [get.jenkins.io] Track storage usage and provisioning
- Agents aren’t spawning on infra.ci
- AKS autoscaling from zero does NOT work when using spot instances for nodes :facepalm:
- Budget is OK with always keeping 1 spot instance (infra-reports is run quite often for infra.ci)
- In reference to issue #3183
- Export download mirrors list to a textual representation
- Deploys mirrors metas to https://reports.jenkins.io/infrastructure/v1/index.json
- But there still is an unexpected bug: list is empty except for fallback
- Update Jira LTS from 9.4.x to 9.12.x
- Waiting for LF to give us an operation date
- Unexpected delays building small plugin on linux agent
- DigitalOcean Kubernetes agents are still disabled on ci.jenkins.io
- Requires Kubernetes 1.27 (see below in new items)
- [Jenkins Agents] Clean up deprecated JNLP arguments
- Failed for Azure VM + Azure Containers, both Linux and Windows. Need to investigate more
- Revoke an OpenVPN cert for NotMyFault
- Nothing done here. Let’s check if we can diagnose this this week or we will drop it.
- Most probably here, but it is a question of scope: we can revoke a user, but can we revoke a former cert of a valid user (easily?)
- To host versioned jenkins.io docs on docs.jenkins.io
- Need to continue working on this one (post-FOSDEM)
- [uplink] Download failing for JavaSystemProperties with error: missing chunk number 0 for toast value xx in pg_toast_xxx
- Database records are still corrupted; slow-running requests are in progress to find them all
- Intermittent out of memory for Java 21 builds of Jenkins core on ci.jenkins.io
- Nothing done, still needs to be investigated
- Migration left over from publicK8s to arm64
- Next step is LDAP: need to plan migration of persistent data (as arm64 VMs are in a different AZ than x86) to a zone-replicated PV and then migrate LDAP to arm64
- Next candidate: keycloak
- Still no news from mirrorbits maintainer
- infra.ci.jenkins.io on arm64 (controller and agents)
- WiP on the “all in one” image (tools: ruby/bundler) to migrate jenkins-infra/infra-reports to the arm64 image (was using docker-builder image)
- Tricky issues around PATH order and ASDF vs. system installations
- Next candidate: other consumers of docker-builder and Puppet (jenkins-infra/jenkins-infra)
- Long term: spin up a new AKS cluster only for infra.ci agents on the Azure sponsored subscription: see Add a new private kubernetes cluster in the new sponsored azure subscription · Issue #3923 · jenkins-infra/helpdesk · GitHub
- Past Release sites are taking a long time to load
- Related to [get.jenkins.io/mirrors/mirrorbit - Azure] High costs due to usage of Azure File Storage
- Tried NFSv4 (premium only) instead of SMB/CIFS in the AKS CSI driver, but it failed on the first 2 attempts
- @lemeurherve raised a long-running question (also raised by @en3hD3iMRx6_6IXLNY0Rag 2 years ago): why not generate these HTML pages on each core release?
- Check if we could replace blobxfer by azcopy
- Good news, @lemeurherve was able to find a way to generate a short-lived SAS token using an Azure Service Principal
- WiP on contributors.jenkins.io
- Next step: update-jenkins-io (and the others)
- Long term: pkg.origin.jenkins.io VM
- Blocks [INFRA-3100] Migrate updates.jenkins.io to another Cloud
- Updatecli: Use separated pipelines + organization scanning for all updatecli processes in jenkins-infra
- Was making usage of graph view hard for us (terraform and packer jobs on infra.ci)
- Long running tasks: 1 or 2 jobs max per week
- WiP: azure (terraform) and packer
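Side note on the "Dec 1969 (the epoch)" build dates listed above: a minimal sketch of why a lost timestamp can render that way. It assumes (not confirmed in these notes) that a corrupted build record falls back to timestamp 0; the function name and the UTC-5 offset are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a build record whose timestamp was lost falls back to 0 ms.
missing_timestamp_ms = 0


def render_build_date(timestamp_ms, utc_offset_hours):
    """Format a millisecond Unix timestamp in a fixed-offset timezone."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=tz)
    return moment.strftime("%Y-%m-%d %H:%M")


# Timestamp 0 is midnight on 1 Jan 1970 in UTC...
utc_view = render_build_date(missing_timestamp_ms, 0)
# ...but any viewer west of UTC sees the previous day, in Dec 1969.
us_east_view = render_build_date(missing_timestamp_ms, -5)

print(utc_view)
print(us_east_view)
```

Any timezone west of UTC shifts the zero timestamp into 31 Dec 1969, which matches the symptom reported on ci.jenkins.io.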
ToDo (next milestone) (infra-team-sync-2024-02-20 Milestone · GitHub)
- Upgrade to Kubernetes 1.27
- Digital Ocean drops 1.26.x end of Feb.
- WiP: Let’s roll for kubectl and DigitalOcean clusters (as not used: risk is zero)
- (Positive) side effect: kubectl version on our all-in-one image was incorrect: fixed thanks to this!
- Next candidate: AWS EKS clusters \o/
- Update updatecli manifest on shared-tools to add a condition that the golang version matches the one in packer-images
- Golang version is tracked today BUT we only want the version present in the all-in-one image used in production, otherwise we break builds for shared-tools and docker-openvpn when upgrading the image
- removing the central cache has seemingly broken dependabot (sometimes!) · Issue #3919 · jenkins-infra/helpdesk · GitHub => need to diagnose (closable?)
- Add a new private kubernetes cluster in the new sponsored azure subscription · Issue #3923 · jenkins-infra/helpdesk · GitHub => @dduportal drives with @smerle as secondary