Message-ID: <7e5a4fe5-b52c-44e7-9f4d-a7aaf0a3d399@linux.microsoft.com>
Date: Fri, 26 Jul 2024 16:03:23 -0400
To: "devel@edk2.groups.io"
From: "Michael Kubacki"
Reply-To: devel@edk2.groups.io, mikuback@linux.microsoft.com
Subject: [edk2-devel] Quick Change to Improve edk2 CI Time

TLDR: Pipelines should be relatively faster soon

---

As you know, PR gates can take a long time in edk2. While there are
various improvements that could be made, I want to highlight a change in
the queue now, its impact, and how it will help.

Pipelines are composed of jobs, which are composed of steps. Pipelines
are scheduled onto build machines (agents) at job granularity. Today, a
matrix of build configurations kicks off many jobs for a single pipeline
run.
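
To make the job granularity concrete, here is a minimal Python sketch of
how a build matrix fans out into independently scheduled jobs. The
pipeline name, package names, targets, and the `expand_matrix` helper are
hypothetical; they are not taken from the edk2 pipeline definitions.

```python
from itertools import product


def expand_matrix(pipeline, packages, targets):
    """Return one job name per (package, target) combination."""
    return [
        f"{pipeline}: {pkg} [{tgt}]"
        for pkg, tgt in product(packages, targets)
    ]


# Hypothetical matrix values, chosen only for illustration.
jobs = expand_matrix("Example PR", ["PkgA", "PkgB", "PkgC"], ["DEBUG", "RELEASE"])

for job in jobs:
    # Each entry is scheduled separately and needs its own agent.
    print(job)

print(len(jobs), "jobs for one pipeline run")
```

Every entry in `jobs` competes for an agent on its own, which is why a
single pipeline run can occupy many agents at once.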

Examples of pipelines that contain jobs:

- Windows VS2019 PR
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=34
  - Number of jobs spawned: 14
- Ubuntu GCC5 PR
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=33
  - Number of jobs spawned: 15
- PlatformCI_OvmfPkg_Windows_VS2019_PR
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=51
  - Number of jobs spawned: 11
- PlatformCI_OvmfPkg_Ubuntu_GCC5_PR
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=49
  - Number of jobs spawned: 19
- PlatformCI_EmulatorPkg_Windows_VS2019_PR
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=45
  - Number of jobs spawned: 12
- PlatformCI_EmulatorPkg_Ubuntu_GCC5_PR
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=42
  - Number of jobs spawned: 6
- PlatformCI_ArmVirtPkg_Ubuntu_GCC5_PR
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=47
  - Number of jobs spawned: 18
- tianocore.PatchCheck
  - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=35
  - Number of jobs spawned: 1

Total Jobs Per PR: 96

Each job must acquire an "agent", of which 30 are available to the
project. So you can see how several PRs with recent pushes can lead to
outcomes like this:

  This agent request is not running because you have reached the
  maximum number of requests that can run for parallelism type
  'Microsoft-Hosted Public'. Current position in queue: 617. Current
  parallelism: 30. Max Parallelism for parallelism type
  'Microsoft-Hosted Public': 30

There is a lot that could be discussed about reducing jobs,
conditionalizing builds on others, etc. But that's out of the scope of
this email.

---

Another issue exacerbates the problem.
It comes from a single job in two
of those pipelines ("Ubuntu GCC5 PR" and "Windows VS2019 PR") whose
task is to wait for every other job in the pipeline to finish, so it can
download the code coverage results from each and merge them into a
single code coverage report to publish for the pipeline run.

That job is queued only after the last job from the original matrix of
all other jobs in the pipeline is done. This means that those two
pipelines (and therefore the PR) are essentially thrown back to the
beginning of the build queue when they should have completed.

For example, if 2 more PRs had been pushed while the matrix jobs for a
pipeline were running, that single code coverage job, which itself takes
about 3 to 5 minutes to run, is now going to cause the entire PR
pipeline results to be queued behind 2 x ~96 = ~192 other jobs, some of
which take up to 30 minutes. This doesn't include CI pipelines that also
add to the build queue when a change is merged.

---

PR 5978 (https://github.com/tianocore/edk2/pull/5978) disables the code
coverage job as a near-term solution until someone has enough time to
test a new approach that is more efficient.

This means the code coverage report will no longer be available in
pipeline runs. However, the individual coverage.xml files will still be
published in the artifacts for each job, and you can create a converged
report locally from those pipeline artifacts if needed.

The pipeline itself has an example of how to do that:

https://github.com/tianocore/edk2/blob/master/.azurepipelines/templates/pr-gate-build-job.yml#L100

Thanks,
Michael
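
The queue arithmetic described in this message can be sketched in Python
as follows. The per-pipeline job counts and the 30-agent limit come from
the message; the `jobs_ahead` helper and the "waves" figure are
assumptions made for illustration only. The sketch ignores job durations
and the CI pipelines triggered by merges, so it is a rough picture, not a
wait-time prediction.

```python
import math

# Jobs spawned per pipeline, as listed in the message.
JOBS_PER_PIPELINE = {
    "Windows VS2019 PR": 14,
    "Ubuntu GCC5 PR": 15,
    "PlatformCI_OvmfPkg_Windows_VS2019_PR": 11,
    "PlatformCI_OvmfPkg_Ubuntu_GCC5_PR": 19,
    "PlatformCI_EmulatorPkg_Windows_VS2019_PR": 12,
    "PlatformCI_EmulatorPkg_Ubuntu_GCC5_PR": 6,
    "PlatformCI_ArmVirtPkg_Ubuntu_GCC5_PR": 18,
    "tianocore.PatchCheck": 1,
}
AGENTS = 30  # parallelism limit quoted in the message


def jobs_ahead(new_prs, jobs_per_pr):
    """Jobs queued in front of a late-scheduled job (hypothetical helper)."""
    return new_prs * jobs_per_pr


total_per_pr = sum(JOBS_PER_PIPELINE.values())
ahead = jobs_ahead(2, total_per_pr)

# Assumption: agents drain the queue in full batches ("waves") of 30 jobs.
waves = math.ceil(ahead / AGENTS)

print("Jobs per PR:", total_per_pr)
print("Jobs ahead of the coverage job after 2 more PRs:", ahead)
print("Waves of", AGENTS, "agents needed to drain them:", waves)
```

With two additional PRs in the queue, the coverage job sits behind 192
other jobs, which is the situation the message describes.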