public inbox for devel@edk2.groups.io
* [edk2-devel] Quick Change to Improve edk2 CI Time
@ 2024-07-26 20:03 Michael Kubacki
  2024-07-27 11:14 ` Ard Biesheuvel
  0 siblings, 1 reply; 3+ messages in thread
From: Michael Kubacki @ 2024-07-26 20:03 UTC (permalink / raw)
  To: devel@edk2.groups.io

TLDR: Pipelines should be relatively faster soon

---

As you know, PR gates can take a long time in edk2. While there are
various improvements that could be made, I want to highlight a change in
the queue now, its impact, and how it will help.

Pipelines are composed of jobs which are composed of steps. Pipelines 
are scheduled onto build machines (agents) at job granularity. Today, a 
matrix of build configurations kicks off many jobs for a single pipeline 
run.

Examples of pipelines that contain jobs:

- Windows VS2019 PR
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=34
   - Number of jobs spawned: 14
- Ubuntu GCC5 PR
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=33
   - Number of jobs spawned: 15
- PlatformCI_OvmfPkg_Windows_VS2019_PR
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=51
   - Number of jobs spawned: 11
- PlatformCI_OvmfPkg_Ubuntu_GCC5_PR
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=49
   - Number of jobs spawned: 19
- PlatformCI_EmulatorPkg_Windows_VS2019_PR
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=45
   - Number of jobs spawned: 12
- PlatformCI_EmulatorPkg_Ubuntu_GCC5_PR
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=42
   - Number of jobs spawned: 6
- PlatformCI_ArmVirtPkg_Ubuntu_GCC5_PR
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=47
   - Number of jobs spawned: 18
- tianocore.PatchCheck
   - https://dev.azure.com/tianocore/edk2-ci/_build?definitionId=35
   - Number of jobs spawned: 1

Total Jobs Per PR: 96

Each job must acquire an "agent", of which 30 are available to the
project. So you can see how several PRs with recent pushes can lead to
outcomes like this:

   This agent request is not running because you have reached the maximum
   number of requests that can run for parallelism type 'Microsoft-Hosted
   Public'. Current position in queue: 617. Current parallelism: 30. Max
   Parallelism for parallelism type 'Microsoft-Hosted Public': 30
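
The arithmetic behind these numbers can be checked with a short sketch.
The job counts and the limit of 30 agents are copied from this email; the
"rounds" figure is only an illustration that assumes a single PR has every
agent to itself, which is not how the shared queue behaves in practice.

```python
import math

# Jobs spawned per pipeline, copied from the list in this email.
JOBS_PER_PIPELINE = {
    "Windows VS2019 PR": 14,
    "Ubuntu GCC5 PR": 15,
    "PlatformCI_OvmfPkg_Windows_VS2019_PR": 11,
    "PlatformCI_OvmfPkg_Ubuntu_GCC5_PR": 19,
    "PlatformCI_EmulatorPkg_Windows_VS2019_PR": 12,
    "PlatformCI_EmulatorPkg_Ubuntu_GCC5_PR": 6,
    "PlatformCI_ArmVirtPkg_Ubuntu_GCC5_PR": 18,
    "tianocore.PatchCheck": 1,
}
AGENTS = 30  # parallelism limit quoted in the queue message

jobs_per_pr = sum(JOBS_PER_PIPELINE.values())

# Illustrative assumption: one PR alone on an otherwise idle agent pool.
rounds_per_pr = math.ceil(jobs_per_pr / AGENTS)

print(jobs_per_pr, rounds_per_pr)  # prints "96 4"
```

Even in that idealized case, one PR needs four rounds of 30 parallel jobs,
so a handful of concurrent PRs is enough to build a long queue.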

There is a lot that could be discussed about reducing jobs,
conditionalizing builds on others, etc., but that is outside the scope of
this email.

---

Another issue exacerbates the problem. It comes from a single job in two
of those pipelines ("Ubuntu GCC5 PR" and "Windows VS2019 PR") whose
task is to wait for every other job in the pipeline to finish, so it can
download the code coverage results from each and merge them into a
single code coverage report to publish for the pipeline run.

That job is queued only after the last of the matrix jobs in the pipeline
has finished. This means that those two pipelines (and therefore the PR)
are essentially sent to the back of the build queue just when they should
have completed.

For example, if 2 more PRs had been pushed while the matrix jobs for a
pipeline were running, that single code coverage job, which itself takes
only about 3 to 5 minutes to run, now causes the entire PR pipeline
result to be queued behind 2 x ~96 = ~192 other jobs, some of which
take up to 30 minutes. This doesn't include CI pipelines that also add
to the build queue when a change is merged.
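
To give a rough sense of scale, the delay can be estimated with a
deliberately simplified model. The assumptions are not from the pipeline
itself: a strict first-in-first-out queue, all 30 agents idle when the
queue starts draining, and every job taking the same time. The function
name is hypothetical, and real scheduling will differ.

```python
def wait_minutes(jobs_ahead, agents, minutes_per_job):
    """Minutes until a queued job starts under the simplified model.

    Assumes FIFO order, `agents` idle agents at time zero, and a uniform
    job duration, so jobs start in batches of `agents`.
    """
    full_batches_ahead = jobs_ahead // agents
    return full_batches_ahead * minutes_per_job


JOBS_PER_PR = 96  # total from this email
AGENTS = 30       # parallelism limit from this email

jobs_ahead = 2 * JOBS_PER_PR  # two PRs pushed in the meantime

# Pessimistic case: every job ahead takes the 30-minute maximum.
worst_case = wait_minutes(jobs_ahead, AGENTS, 30)

# Hypothetical 10-minute average, chosen only for comparison.
assumed_average = wait_minutes(jobs_ahead, AGENTS, 10)

print(worst_case, assumed_average)  # prints "180 60"
```

Under either assumption, a job that runs for 3 to 5 minutes spends far
longer waiting for an agent than it does running.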

---

PR 5978 (https://github.com/tianocore/edk2/pull/5978) disables the code
coverage job as a near-term solution until someone has enough time to
test a new, more efficient approach.

This means the code coverage report will no longer be available in 
pipeline runs. However, the individual coverage.xml files will still be 
published in the artifacts for each job, and you can create a converged 
report locally from those pipeline artifacts if needed.

The pipeline itself has an example of how to do that:

https://github.com/tianocore/edk2/blob/master/.azurepipelines/templates/pr-gate-build-job.yml#L100
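
As a rough alternative for a quick local check, the sketch below merges
line coverage from several reports. It is not the pipeline's own method.
It assumes the coverage.xml files use the Cobertura layout (class
elements with a filename attribute containing line elements with number
and hits attributes); the function names and the sample data are
hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def merge_line_hits(sources):
    """Merge per-line hit counts from Cobertura-style reports.

    `sources` is an iterable of file paths or file objects. Each line is
    keyed by (filename, line number) and keeps the highest hit count
    seen, so a line covered by any job counts as covered.
    """
    hits = {}
    for source in sources:
        root = ET.parse(source).getroot()
        for cls in root.iter("class"):
            filename = cls.get("filename")
            for line in cls.iter("line"):
                key = (filename, int(line.get("number")))
                hits[key] = max(hits.get(key, 0), int(line.get("hits", "0")))
    return hits


def line_rate(hits):
    """Fraction of known lines with at least one hit."""
    if not hits:
        return 0.0
    return sum(1 for count in hits.values() if count > 0) / len(hits)


# Made-up stand-ins for two jobs' coverage.xml artifacts.
JOB_A = """<coverage><packages><package><classes>
<class filename="Foo.c"><lines>
<line number="1" hits="1"/><line number="2" hits="0"/>
</lines></class></classes></package></packages></coverage>"""
JOB_B = """<coverage><packages><package><classes>
<class filename="Foo.c"><lines>
<line number="2" hits="3"/><line number="3" hits="0"/>
</lines></class></classes></package></packages></coverage>"""

merged = merge_line_hits([io.StringIO(JOB_A), io.StringIO(JOB_B)])
print(f"merged line rate: {line_rate(merged):.0%}")
```

With real artifacts, the downloaded coverage.xml paths could be passed
in place of the in-memory samples, for example the result of
pathlib.Path(...).rglob("coverage.xml") on the download directory.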

Thanks,
Michael


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#120053): https://edk2.groups.io/g/devel/message/120053
Mute This Topic: https://groups.io/mt/107567922/7686176
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [rebecca@openfw.io]
-=-=-=-=-=-=-=-=-=-=-=-




* Re: [edk2-devel] Quick Change to Improve edk2 CI Time
  2024-07-26 20:03 [edk2-devel] Quick Change to Improve edk2 CI Time Michael Kubacki
@ 2024-07-27 11:14 ` Ard Biesheuvel
  2024-07-27 22:29   ` Rebecca Cran
  0 siblings, 1 reply; 3+ messages in thread
From: Ard Biesheuvel @ 2024-07-27 11:14 UTC (permalink / raw)
  To: devel, mikuback, Michael Kinney, Leif Lindholm

On Fri, 26 Jul 2024 at 22:03, Michael Kubacki
<mikuback@linux.microsoft.com> wrote:
>
> TLDR: Pipelines should be relatively faster soon
>

That is a *very* welcome improvement, thanks.

After a day of coding, I've spent another 3 days fighting the CI to
get my changes in.

This is related to removing obsolete ResetSystemLib implementations as
well as the reset runtime in EmbeddedPkg. Due to the
interdependencies with edk2-platforms, which I am trying to keep in a
building state at each point in time, I split the work into 3 or 4
PRs on the EDK2 side, and just managing those along with the hours and
hours of CI delays and lockups has really sucked all the joy out of
contributing to Tianocore.

Getting the CI delay down would help me a lot. But we also need a way
for maintainers -who are ultimately in charge of their packages- to
simply overrule the CI and merge a PR regardless of the CI outcome.

Without this, I am seriously considering whether being a maintainer in
Tianocore is worth the effort for me.

-- 
Ard.


View/Reply Online (#120054): https://edk2.groups.io/g/devel/message/120054




* Re: [edk2-devel] Quick Change to Improve edk2 CI Time
  2024-07-27 11:14 ` Ard Biesheuvel
@ 2024-07-27 22:29   ` Rebecca Cran
  0 siblings, 0 replies; 3+ messages in thread
From: Rebecca Cran @ 2024-07-27 22:29 UTC (permalink / raw)
  To: devel, ardb, mikuback, Michael Kinney, Leif Lindholm

On 7/27/24 5:14 AM, Ard Biesheuvel wrote:

> Getting the CI delay down would help me a lot. But we also need a way
> for maintainers -who are ultimately in charge of their packages- to
> simply overrule the CI and merge a PR regardless of the CI outcome.
>
> Without this, I am seriously considering whether being a maintainer in
> Tianocore is worth the effort for me.

If it comes to it, I'd be willing to donate money to the project to
allow us to purchase CI resources, or donate time on the machines I own,
moving them into a datacenter if needed. Though I know in the past
there's been a strong reluctance to use anything that isn't
cloud-based.


-- 
Rebecca Cran



View/Reply Online (#120057): https://edk2.groups.io/g/devel/message/120057



