Infrastructure Team Meeting - Dec 14, 2021

Participants

Damien Duportal (@dduportal), Hervé Le Meur (@hlemeur), Mark Waite (@MarkEWaite), Stephane Merle, Etienne Studer of Gradle Enterprise (Guest)

Official minutes on GitHub.

Notes

  • Accelerating ci.jenkins.io build and test with Gradle Enterprise - @MarkEWaite

    • Etienne Studer of Gradle Enterprise to discuss infrastructure needs
      • Concepts
        • Reducing time to evaluate a change
        • Reducing time to complete a fix
        • Reducing failures related to duplicate work
      • Collects information from builds and tests while they are running
        • Can be done for local builds and CI builds
        • More than log capture and log analysis
        • Root cause analysis uses deeper insights
        • Improve collaboration between people assessing issues
        • CI issue investigation possible without requiring infra team
          • A new dependency pulled into the build
        • Smaller issues can be addressed efficiently
      • Build scan that captures details about the build
        • xwiki example to see dependencies and shareable analysis
      • Faster feedback cycles, CI and locally
        • Faster build improves many different areas
        • Reduce overhead for new contributors (faster startup)
        • Spring Boot example - don’t redo work that has already been done
          • Reduced Spring Boot build from 40 minutes to 2 minutes
        • Reduce CI resource use
          • Skip goals because results are already cached
          • Skip tests that are unaffected by the change
          • Faster local builds create less demand on CI builds
            • Fast local build instead of “let CI check it”
        • Reduce the rerun pattern with fewer flaky builds
        • Spring Boot build reports flaky tests, showing which are most flaky
          • Shows expensive tests and intermittent failing tests
        • CI pushes to the cache, while developers only read from cache
        • A pull and build can use the cache
          • Caching at the goal level for Maven builds
        • Predictive test selection (pre-release)
          • Preview released this week, not yet production-ready
          • Uses historic data rather than code traversal
      • Deployment
        • Deploys into a Kubernetes cluster
        • Supports external database
        • Multiple cache nodes and pre-emptive replication allowed
      • Configuring a project to use it
        • Add extensions.xml with Gradle Enterprise Maven extension
          • extensions.xml
          • gradle-enterprise.xml
          • Build cache credentials added to CI to push to cache
    • Infrastructure team to assess experiment plan and impact
      • 30 day GE trial process
        • Install and configure
        • Connect a build
        • Capture a baseline
        • Optimize performance
        • Quantify improvement
        • Present return on investment
      • Visualizations that show build cache use and build performance
        • Detect flaky builds
      • How to do it?
        • ge.jenkins.io prototype
          • 142 goals in 6 projects - 72 minutes
          • 142 goals in 6 projects - 2 minutes cached
      • Credentials on ci.jenkins.io considered compromised
        • Would we need to generate the cache elsewhere (staged)?
        • Will we get the same benefit from staged cache?
          • Benefit from local cache in all cases
          • Remote cache with less frequent publishing gives less benefit
          • May need two caches, one on AWS, one on Azure
        • How does the cost of maintaining the service compare with paying for builds?
          • Don’t have cost estimates; it is used on AWS in their accounts
          • Cost difference between them is not so large
          • Cache overhead much less than build/test cost
      • GE hosts 13 open source instances themselves
        • $130 per instance per month
      • Kubernetes cluster that hosts our services on one provider
        • Local cache possible? - yes
      • Kubernetes cluster that hosts ephemeral agents
        • Local cache possible? - yes
      • Can specific artifacts be removed from cache? - yes
        • Sometimes we have 0 byte artifacts written to cache
        • Cache key of the entry is available
        • Ask to delete the key from all nodes
    • Mark to begin conversations with governance board and developers
    • Mark to schedule a developer online meetup to share the experience
      • Etienne to present at the meetup
      • Mark to schedule with Etienne
  • Log4Shell - CVE-2021-44228
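
For readers who did not attend, the goal-level caching discussed above can be sketched roughly as follows. This is a conceptual illustration only, not how Gradle Enterprise is implemented: `cache_key`, `RemoteCache`, `run_goal` and the `can_push` flag are hypothetical names made up for this sketch. It shows a key derived from a goal and its inputs, a cache that only CI may write to while developers read from it, and deleting a bad entry by key.

```python
import hashlib
from typing import Callable, Dict, Optional, Tuple


def cache_key(goal: str, inputs: Dict[str, bytes]) -> str:
    """Derive a key from the goal name and the content of its inputs."""
    digest = hashlib.sha256(goal.encode("utf-8"))
    for name in sorted(inputs):  # sorted so the key does not depend on dict order
        digest.update(name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(inputs[name])
        digest.update(b"\0")
    return digest.hexdigest()


class RemoteCache:
    """In-memory stand-in for a remote cache node (hypothetical)."""

    def __init__(self) -> None:
        self._entries: Dict[str, bytes] = {}

    def load(self, key: str) -> Optional[bytes]:
        return self._entries.get(key)

    def store(self, key: str, output: bytes, can_push: bool) -> bool:
        # Assumption for this sketch: only CI holds push credentials.
        if not can_push:
            return False
        self._entries[key] = output
        return True

    def delete(self, key: str) -> bool:
        # Remove a single entry, e.g. a 0 byte artifact, by its key.
        return self._entries.pop(key, None) is not None


def run_goal(
    cache: RemoteCache,
    goal: str,
    inputs: Dict[str, bytes],
    execute: Callable[[], bytes],
    can_push: bool,
) -> Tuple[bytes, bool]:
    """Return (output, cache_hit); run the goal only on a cache miss."""
    key = cache_key(goal, inputs)
    cached = cache.load(key)
    if cached is not None:
        return cached, True
    output = execute()
    cache.store(key, output, can_push)
    return output, False


if __name__ == "__main__":
    cache = RemoteCache()
    sources = {"src/Main.java": b"class Main {}"}

    # CI build: cache miss, the goal runs and the result is pushed.
    print(run_goal(cache, "compile", sources, lambda: b"compiled", can_push=True))
    # Developer build with the same inputs: read-only cache hit.
    print(run_goal(cache, "compile", sources, lambda: b"compiled", can_push=False))
```

In this sketch an empty output is still treated as a valid hit, which is why a bad 0 byte entry has to be removed explicitly with `delete` on its key.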
