Infrastructure Team Meeting - Mar. 01, 2022

Participants

Damien Duportal (@dduportal), Hervé Le Meur (@hlemeur), Stephane Merle (@smerle), Mark Waite (@MarkEWaite)

Official minutes on GitHub.

Announcement :loudspeaker:

  1. Weekly 2.337 release
    • Release is available, Docker images not yet all visible
    • Release checklist still to be run

Notes :book:

  • Issues on ci.jenkins.io:

  • Issue on VPN: Cannot connect to VPN: server certificate for vpn.jenkins.io expired · Issue #2798 · jenkins-infra/helpdesk · GitHub

    • Expiring certificate: thanks to @olblak we have been able to add documentation on how to easily regenerate the certificate
    • Server side certificate expired Feb 26, 2022
      • Needed more documentation on how to generate the server-side certificate
      • Certificates generated with @smerle were missing some specific attributes
      • The location of the procedure was pointed out and is now documented
    • @lemeurherve and @smerle both have access
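To catch the next expiration before it blocks VPN access, the check could be automated. The following is a minimal sketch, not the team's actual tooling: the function names, the 14-day warning window, and the midnight time of day are assumptions made for illustration. It takes the certificate's `notAfter` value in the timestamp format OpenSSL prints.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days left before `not_after`.

    `not_after` uses the OpenSSL text format, e.g. "Feb 26 00:00:00 2022 GMT".
    `now` must be a timezone-aware datetime. A negative result means the
    certificate has already expired.
    """
    expiry_seconds = ssl.cert_time_to_seconds(not_after)
    expiry = datetime.fromtimestamp(expiry_seconds, tz=timezone.utc)
    return (expiry - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, warn_days: int = 14) -> bool:
    """True when the certificate expires within `warn_days` (hypothetical threshold)."""
    return days_until_expiry(not_after, now) < warn_days


if __name__ == "__main__":
    # Expiry date from these notes; the time of day is an assumption.
    meeting_day = datetime(2022, 3, 1, tzinfo=timezone.utc)
    print(days_until_expiry("Feb 26 00:00:00 2022 GMT", meeting_day))
```

Run on a schedule against the real `notAfter` value, a check like this would complement the calendar reminders rather than replace them.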
  • Issues on trusted.ci.jenkins:

  • Post incidents: calendar updated for credentials expiration routines (Azure SP secrets + VPN certificates)

    • No more expired credentials in Azure SPs as of today
    • infra.ci’s Azure packer credential to be rotated - 2 weeks left
    • All of the codevalet-*, rtyler* and olblak* apps have been removed
    • All apps with credentials expired for more than 2 years have been removed
      • Removed expired credentials for service principal applications
    • Still a few that need more detailed review
      • @smerle and @lemeurherve should have enough permissions to rotate them
      • Credential to be rotated expires in next two weeks
  • Azure AD permissions

    • Stephane + Herve have the same rights as Damien
    • We could not assign “privileged roles” to groups (nor create “custom groups”) with our current Azure plan (requires a premium account)
      • Not needed, so let’s manage “manually” (fewer than 15 people) + it seems that Terraform might be able to manage this part
    • Enforced MFA for everyone on the Azure Portal / API
      • Had to enforce per-user for now
  • Digital Ocean :party:

    • Added to ci.jenkins.io 12 days ago
      • Added and they are operating as expected
    • TODO:
      • Measure costs consumed (no visible access to the billing page)
      • Update ci.jenkins.io documentation for agents
      • DigitalOcean sponsorship
        • Add DigitalOcean on the sponsor section of home page
        • We have a blog post to start
        • Our cluster will be updated in 6 days with their Kubernetes updates / patches
          • Similar policy to Azure, changes are applied on their version, visible to us
  • Request from security team to add Windows agent on cert-ci

    • Done, thanks Stephane! VPN routes are :white_check_mark:
    • Weird issue: upgrading LTS from 2.319.1 to 2.319.3 deleted the cloud config!
      • No explanation for the removal
      • Manually managed configuration was deleted during the upgrade
      • Should regularly store copies of the configuration
      • Recreated templates
    • Tried new label pattern:
      • Label agents only with “kernel”-related dimensions (OS, CPU, Docker, Size). For cert.ci: azure vm linux / azure vm windows for instance
      • Using tools, with the method shown by Mark, with “fallbacks” (e.g. using a shell script to define the local path, otherwise falling back to the default installer), for “easy to get” tools: Git, JDK, Maven, etc.
        • Manage them with the global tools system inside Jenkins
        • Use locally installed JDK from the agent
      • Puppet templating makes it possible to provide “improved” naming: jdk8 and jdk-8 for tools, to handle user typos for instance
      • See Jessie’s comment at the end: fix(buildPlugin) handle container nodes with JDK > 11 by dduportal · Pull Request #302 · jenkins-infra/pipeline-library · GitHub
        • Maybe we should retire maven and make it maven-8?
        • Container OK, but not VM: todo
        • pipeline-library change then
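The tool lookup with fallbacks and the tolerant tool naming described above can be sketched as follows. This only illustrates the logic in Python and is not the actual Jenkins, Puppet, or pipeline-library implementation: `canonical_tool_name`, `resolve_tool`, and the installer callback are hypothetical names, and the canonical form `jdk-8` is an arbitrary choice.

```python
import re
import shutil
from typing import Callable, Optional


def canonical_tool_name(name: str) -> str:
    """Map spelling variants such as 'jdk8' and 'jdk-8' to one form ('jdk-8').

    Names without a trailing version number are only lower-cased.
    """
    cleaned = name.strip().lower()
    match = re.fullmatch(r"([a-z]+)-?(\d+)", cleaned)
    if match is None:
        return cleaned
    return f"{match.group(1)}-{match.group(2)}"


def resolve_tool(
    executable: str,
    install: Callable[[], str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Prefer the tool already installed on the agent, else run the installer.

    `install` stands in for the default installer and must return the path
    of the installed tool. `which` is injectable so the lookup can be tested.
    """
    local_path = which(executable)
    if local_path is not None:
        return local_path
    return install()


if __name__ == "__main__":
    print(canonical_tool_name("jdk8"))
    print(resolve_tool("git", install=lambda: "/opt/tools/git/bin/git"))
```

Checking the agent first keeps builds fast when the image already ships the tool, while the installer fallback keeps the same pipeline working on agents that lack it.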
  • infra-report to be migrated out from trusted.ci into infra.ci

  • iptables on ci.jenkins.io after spam: closed (iptables rules cleared by reboot + no more spam)

  • IRC notifs: the new IRC channel is :white_check_mark:

    • We had to run the puppet agent on the puppet master itself + reboot to apply the changes