Attendees
- @dduportal (Damien Duportal)
- @MarkEWaite (Mark Waite)
- @smerle33 (Stéphane Merle)
- @poddingue (Bruno Verachten)
- @jayfranc999 (Jay Reddy)
Announcements
- Weekly 2.467 (09 July 2024)
- Failed during the test phase due to flakiness
- Restarted and went well
- Weekly 2.468 (16 July 2024)
- No errors, handled by and thanks!
- Weekly 2.469 (23 July 2024)
- Started on time, we are watching it
- LTS 2.452.3 (10 July 2024)
- JDK11 ok
- Next meetings:
- Next week OK (30 July 2024)
- The 6 August 2024 meeting is cancelled, though
- CVE-2024-6387 (OpenSSH)
- We (still) have to check our SSH restrictions on VMs.
- Draft PR (WiP): "feat: restrict SSH access to VMs" by dduportal (Pull Request #556, jenkins-infra/aws)
- Need to work with VPN routing (both client side and server side) as per feedback from Tim and Daniel.
- Need an issue for the new milestone as it is bigger than expected
Upcoming Calendar
- Next Weekly: 2.470 (30 July 2024)
- Next LTS: 2.462.1 Mark Waite leads, Jeremie Playout shadows
- RC tomorrow (24 July 2024)
- Final Release (7 August 2024)
- Next Security Release as per jenkinsci-advisories: N.A.
- Upcoming credentials expirations (~3 weeks):
- Azure FileShare token for docs.jenkins.io already done (sync. with other) => calendar to update
- RPU user token expires on 13 August
- Next major event:
- CD Mini Summit in Vienna - 19 Sept. 2024
Cloud Budgets
- Azure (CDF paid)
- April: $4,550 (invoice)
- May: $4,339 (invoice)
- June: $4,287 (estimated, $4187 of billing + $100 support)
- July (current): $3,195 consumed (forecast at ~$4.4k)
- Issue to create to migrate privatek8s (to decrease bill)
- Issue to create to migrate cert.ci and trusted.ci VMs (3) - (to decrease bill)
- Azure Sponsorship (Microsoft Credits) - Remaining: $71,331 (+60k :party:) until May 2025 (instead of August 2024)
- April: $2k
- May: $5k consumed
- June: $7.3k consumed
- July (current): $6,885 consumed (forecast at ~$9k, mostly Spring Security related workload)
- Issue to create about using spot instances to decrease costs (at least ask for quotas to Azure Support)
- DigitalOcean
- April: $840
- May: $648
- June: $165.32
- July (current): $123 consumed (Forecast at $165)
- Issue to create for a “reference mirror”
- AWS:
- CloudBees:
- April: $9,782
- May: $8,281
- June: $5,862
- July (current): $4.7k consumed (forecast at $6.6k)
- Sponsored account
- Global Status:
- Credits left: $60,000 until 31 January 2025
- Untouched
Notes
Done:
- Windows agents stopped allocating on ci.jenkins.io Monday about 02:30 AM UTC
- caused by "[ci.jenkins.io] Service Principal used by ci.jenkins.io to spawn Azure agents expires on 2024-07-22"
- But fixed immediately
- [cert.ci] adding myself @smerle33 as administrator on cert.ci
- Need to work on the authN for cert and trusted
- [infra.ci.jenkins.io] Azure Client credential for deploying docs.jenkins.io expires on 2024-08-08
- [infra.ci.jenkins.io] Azure Client credential for deploying plugins.jenkins.io expires on 2024-07-27
- [trusted.ci.jenkins.io] Azure Client credential for deploying javadoc.jenkins.io expires on 2024-07-28
- [trusted.ci.jenkins.io] Azure Client credential for deploying jenkins.io expires the 2024-07-23
- [packer-images] Datadog public GPG rotated for 2024
- Set up an updatecli process to open a PR with the new GPG key, if any (to avoid the same problem next year)
- Archive jenkins-infra/docker-plugins-self-service
- Update ci.jenkins.io, trusted.ci, cert.ci and release.ci to latest LTS version 2.452.3
- [infra.ci,weekly.ci] split repository jenkins-infra/docker-jenkins-weekly
- [trusted.ci, cert.ci] Missing JDK17 and JDK21 tools
- [Post Mortem] Jenkins Core release 2.465 failed and was replaced by 2.466
- [Terraform datadog] Move Terraform state to a managed Azure bucket
- [Terraform] Azure Client passwords of terraform states expire 16 July 2024
- [Terraform Projects] Bump to Terraform 1.9.x
- Design an automated update mechanism for the private repository jenkins-infra/terraform-states
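The credential expirations tracked above lend themselves to a small reminder script. The following is a minimal sketch, not the team's actual tooling (the notes only mention a calendar): the dictionary layout, the `expiring_within` helper and the 21-day window (taken from the "~3 weeks" horizon in the Upcoming Calendar) are assumptions. The dates are copied from the issue titles for illustration, even though those credentials have already been renewed, and the year of the RPU token expiry is assumed to be 2024.

```python
from datetime import date

# Expiry dates copied from the notes; names and layout are illustrative assumptions.
EXPIRATIONS = {
    "docs.jenkins.io deploy credential (infra.ci.jenkins.io)": date(2024, 8, 8),
    "plugins.jenkins.io deploy credential (infra.ci.jenkins.io)": date(2024, 7, 27),
    "javadoc.jenkins.io deploy credential (trusted.ci.jenkins.io)": date(2024, 7, 28),
    "jenkins.io deploy credential (trusted.ci.jenkins.io)": date(2024, 7, 23),
    "RPU user token": date(2024, 8, 13),  # year assumed to be 2024
}


def expiring_within(expirations, today, window_days=21):
    """Return (name, days_left) pairs expiring within window_days, soonest first."""
    soon = []
    for name, expires_on in expirations.items():
        days_left = (expires_on - today).days
        if days_left <= window_days:
            soon.append((name, days_left))
    return sorted(soon, key=lambda item: item[1])


if __name__ == "__main__":
    # Meeting date used as "today" so the output is reproducible.
    for name, days_left in expiring_within(EXPIRATIONS, date(2024, 7, 23)):
        print(f"{days_left:3d} day(s) left: {name}")
```

Run on a schedule with `date.today()` instead of a fixed date, a script like this could open a helpdesk issue or post a reminder before each expiry, rather than relying on the calendar alone.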
Work in Progress (infra-team-sync-2024-07-23 Milestone · GitHub):
- [INFRA-3100] Migrate updates.jenkins.io to another Cloud
- UC2 new publication has been merged
- We deliver parallelized publications to 2 different Cloudflare regions (EU and US)
- We also cleaned up and finished deployment on the HTTPD server (.htaccess only)
- crawler is also up to date with both new and old UC
- Next steps:
- Validate the new UC features
- Stress test this new UC
- Start using it ourselves: Docker images (weekly.ci/release.ci and infra.ci)
- Add JDK21 agents (build)
- WiP on ci.jenkins.io: Linux VM inbound agents (JDK17 + JDK21) by @jay and
- Vagrant tested
- PR opened: next step deliver to production
- Next steps
- trusted.ci SSH Linux VM agents (requires a bit of pre-setup but testable end to end)
- Windows (both inbound and SSH) later
- [cert.ci.jenkins.io] Jobs failing with "TransportException: github.com:443 failed to respond" when using git pipeline step with JGit tool
- JGit with (this specific) private repository failed to clone with TCP errors in a parallel context
- Patch: using native git CLI on cert.ci
- Nice to have: set up the git tool to use the native git CLI as primary tool for all, with JGit as fallback
- Request by Mark in the comments will be treated somewhere else (most probably diagnose / reproduce the bug)
- new.stats.jenkins.io slow to load
- We have better performance since enabling gzip and brotli compression
- Current status is ok for the Infra team
- Remove 999999-SNAPSHOT version of Remoting from Artifactory
- Under Tim’s hands
- [infra.ci.jenkins.io] Builds stuck due to GH API rate limit
- Still needs to be worked on
- [Plugin Health Score] Scores not computed - Getting logs from plugin-health.jenkins.io
- Adrien is working on it (but overloaded) as we identified the problematic records in database
- Migration left over from publicK8s to arm64
- No more work on it
- Stephane checked the LDAP part and has some clues => but holidays incoming
- We have a link to a similar error (related to LDIF indexing)
- We identified and fixed the LDAP build for arm64
- Updatecli: Use separated pipelines + organization scanning for all updatecli processes in jenkins-infra
- WiP, next batch will be Terraform Jobs
- Herve already created all jobs on infra.ci and prepared a big batch of PRs on repositories
- Next batch will be Jenkins Infra Docker images
- GHA → Jenkins
- case by case due to need for contributors to see the updatecli builds (at least the diff)
- run diff on ci.jio and apply on infra.ci? (or another controller)
- run all on infra.ci and only publish GH checks on PRs
- Funny side effects by moving from GHA VMs to Jenkins pod agents
- Need another issue for these cases
ToDo (next milestone) (infra-team-sync-2024-07-30 Milestone · GitHub)
- Issue to create about SSH restriction through VPN (AWS and DO)
- Temurin JDK upgrade July 2024
- All JDKs seem available
- Need to clean up leftovers of EA versions for Jenkins tool on ci.jio for s390x agent
- Damien will pair with Jay on s390x JDK CLI tools to update (in anticipation of Ansible setup later this year)
- Reminder by Mark that we have 3 batches:
- Jenkins JDK tool installers
- Jenkins Agent JDK installed CLI tool
- Jenkins Controller and Agents runtimes
- Dockerhub rate limit broke the www.jenkins.io CI build
- Looks like it was a one-time issue on DockerHub. Fixed by a manual replay in the short term
- However, the jenkins.io pipeline, on ci.jenkins.io, does not have a docker login authentication so it is anonymously pulling. As it uses the official ruby image, the rate limit might apply on one of the 3 outbound IPs so this problem might happen again (we moved to NAT gateway a few months ago so it is “recent”).
- We should improve the pipeline
- Also need to diagnose the ATH failure to see if there are improvements we could make (like for jenkins.io)
- ci.j.io plugin jobs don’t trigger on branch scan
- Daniel gave all required information in the issue and it is a legitimate request
- Needs to be announced (as it might create a 1 hour “Scan repo” overload on ci.jio when deploying the new config)
- https://github.com/jenkins-infra/helpdesk/issues/4194
- New mirrors to add:
- New mirror in Japan ready to roll
- Adding a new mirror in Taiwan: ready to roll
- Adding an OSSPlanet mirror: they acknowledged it, but the initial rsync (from OSUOSL) is slow => on hold until they give us news
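
For the Docker Hub rate limit item above, anonymous pulls are counted per outbound IP, so it can help to check how much quota an agent's IP has left before deciding how to improve the pipeline. The sketch below assumes the `ratelimitpreview/test` check that Docker documents for this purpose, and the `ratelimit-limit` and `ratelimit-remaining` response headers it returns. The function names are hypothetical, and the endpoint behaviour should be confirmed against the current Docker documentation before relying on it.

```python
import json
import urllib.request

# Endpoints from Docker's documented rate-limit check (assumed still current).
TOKEN_URL = (
    "https://auth.docker.io/token"
    "?service=registry.docker.io&scope=repository:ratelimitpreview/test:pull"
)
MANIFEST_URL = "https://registry-1.docker.io/v2/ratelimitpreview/test/manifests/latest"


def parse_ratelimit(value):
    """Parse a header value such as '100;w=21600' into (count, window_seconds)."""
    count, _, window = value.partition(";w=")
    return int(count), (int(window) if window else None)


def anonymous_pull_limits():
    """Query Docker Hub anonymously and return the limits seen from this IP."""
    with urllib.request.urlopen(TOKEN_URL) as resp:
        token = json.load(resp)["token"]
    request = urllib.request.Request(
        MANIFEST_URL,
        method="HEAD",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as resp:
        limits = {}
        for key in ("ratelimit-limit", "ratelimit-remaining"):
            raw = resp.headers.get(key)
            # The headers may be absent (for example on unlimited accounts).
            limits[key] = parse_ratelimit(raw) if raw else None
        return limits
```

Calling `anonymous_pull_limits()` from an agent behind each of the 3 outbound IPs would show whether one of them is close to exhaustion; adding an authenticated `docker login` to the pipeline remains the more durable fix.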