* PR fails due to OVMF time out @ 2023-03-31 15:00 Ni, Ray 2023-03-31 19:46 ` Gerd Hoffmann 0 siblings, 1 reply; 6+ messages in thread From: Ni, Ray @ 2023-03-31 15:00 UTC (permalink / raw) To: devel@edk2.groups.io Cc: Kinney, Michael D, Ard Biesheuvel, 'Gerd Hoffmann' [-- Attachment #1: Type: text/plain, Size: 565 bytes --] Hi, I found several of my PRs cannot pass the CI due to OVMF boot timeout. e.g.: Mktme fix by niruiyu * Pull Request #4072 * tianocore/edk2 (github.com)<https://github.com/tianocore/edk2/pull/4072> I also one PR from Ard (X64 text reloc fixes by ardbiesheuvel * Pull Request #4216 * tianocore/edk2 (github.com)<https://github.com/tianocore/edk2/pull/4216>) failed as well. But I remember Mike increased the boot timeout from 1min to 2 mins. Why does OVMF boot require more time than before? Is that because Gerd enabled the SMP 4 boot? Thanks, Ray [-- Attachment #2: Type: text/html, Size: 2764 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PR fails due to OVMF time out 2023-03-31 15:00 PR fails due to OVMF time out Ni, Ray @ 2023-03-31 19:46 ` Gerd Hoffmann 2023-04-02 18:23 ` Michael D Kinney 0 siblings, 1 reply; 6+ messages in thread From: Gerd Hoffmann @ 2023-03-31 19:46 UTC (permalink / raw) To: Ni, Ray; +Cc: devel@edk2.groups.io, Kinney, Michael D, Ard Biesheuvel On Fri, Mar 31, 2023 at 03:00:37PM +0000, Ni, Ray wrote: > Hi, > I found several of my PRs cannot pass the CI due to OVMF boot timeout. > e.g.: Mktme fix by niruiyu * Pull Request #4072 * tianocore/edk2 (github.com)<https://github.com/tianocore/edk2/pull/4072> > > I also one PR from Ard (X64 text reloc fixes by ardbiesheuvel * Pull Request #4216 * tianocore/edk2 (github.com)<https://github.com/tianocore/edk2/pull/4216>) failed as well. > > But I remember Mike increased the boot timeout from 1min to 2 mins. > Why does OVMF boot require more time than before? > > Is that because Gerd enabled the SMP 4 boot? That could be the reason. It happened to work in my CI test run, but maybe I was just lucky. take care, Gerd ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PR fails due to OVMF time out 2023-03-31 19:46 ` Gerd Hoffmann @ 2023-04-02 18:23 ` Michael D Kinney 2023-04-03 8:21 ` Ard Biesheuvel 0 siblings, 1 reply; 6+ messages in thread From: Michael D Kinney @ 2023-04-02 18:23 UTC (permalink / raw) To: 'Gerd Hoffmann', Ni, Ray Cc: devel@edk2.groups.io, Ard Biesheuvel, Kinney, Michael D Hi Gerd, I have investigated this failure with enabling -smp 4. I think this is an important feature that should be on by default. I did find that the results where inconsistent at 2 or 4 cpus on my laptop, but going 8 or higher would fail consistently. I used -smp 32 for all the testing below. First, this failure mode is only seen with SMM_REQUIRE=1 builds. If I set SMM_REQUIRE=0, then QEMU can boot with different smp settings. This appears to be a QEMU MP SMM related issue. I first tried going back in edk2 history looking for a point where smp boot would work using the version of QEMU used by EDK II CI agents which is 2021.5.5 (QEMU 6.0.0) * https://github.com/tianocore/edk2/blob/fc00ff286a541c047b7d343e66ec10890b80d3ea/OvmfPkg/PlatformCI/.azurepipelines/Windows-VS2019.yml#L142 I went back in edk2 history to the 2017/2018 timeframe and the issue was still present, and I recall testing this feature back then myself, so I do not think it is related to edk2 changes. I then did a binary search through the QEMU releases: PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20170113.exe PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20170808.exe 2.10.0-rc2 PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20171122.exe 2.11.0-rc2 PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20171211.exe 2.11.0-rc5 PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20171217.exe 2.11.0 FAIL: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180321.exe 2.12.0-rc0 - AioContextPolling - Unrelated failure PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180404.exe 2.12.0-rc1 PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180711.exe 3.0.0-rc0 PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180807.exe 3.0.0-rc4 PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180815.exe 3.0.0 FAIL: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20181108.exe 3.1.0-rc0 - Long Delay at BSD Entry FAIL: https://qemu.weilnetz.de/w64/2019/qemu-w64-setup-20181211.exe 3.1.0 - Long Delay at BDS Entry FAIL: https://qemu.weilnetz.de/w64/2019/qemu-w64-setup-20190815.exe 4.1.0 - Long Delay at BDS Entry It appears the failure was introduced in the transition from 3.0.0 -> 3.1.0. QEMU 3.1 introduced Multi-Threaded TCG for x86 CPUs * https://www.qemu.org/2018/12/12/qemu-3-1-0/ * https://qemu.readthedocs.io/en/latest/devel/multi-thread-tcg.html#multi-threaded-tcg The feature can be disabled with the following QEMU command line option: * -accel tcg,thread=single * https://doc.ycharbi.fr/fichiers/virtualisation/qemu/documentation/qemu-doc-3.1.0.html When I apply -accel tcg,thread-single to the QEMU command line, QEMU 3.1.0 passes. When I apply -accel tcg,thread-single to the QEMU command line, QEMU 6.0.0 passes. I think we can create a new PR with: -smp 4 -accel tcg,thead-single I consider this a workaround, and we should investigate why Multi-Threaded TCG breaks QEMU MP SMM. Best regards, Mike > -----Original Message----- > From: 'Gerd Hoffmann' <kraxel@redhat.com> > Sent: Friday, March 31, 2023 12:47 PM > To: Ni, Ray <ray.ni@intel.com> > Cc: devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>; Ard Biesheuvel <ardb@kernel.org> > Subject: Re: PR fails due to OVMF time out > > On Fri, Mar 31, 2023 at 03:00:37PM +0000, Ni, Ray wrote: > > Hi, > > I found several of my PRs cannot pass the CI due to OVMF boot timeout. > > e.g.: Mktme fix by niruiyu * Pull Request #4072 * tianocore/edk2 > (github.com)<https://github.com/tianocore/edk2/pull/4072> > > > > I also one PR from Ard (X64 text reloc fixes by ardbiesheuvel * Pull Request #4216 * tianocore/edk2 > (github.com)<https://github.com/tianocore/edk2/pull/4216>) failed as well. > > > > But I remember Mike increased the boot timeout from 1min to 2 mins. > > Why does OVMF boot require more time than before? > > > > Is that because Gerd enabled the SMP 4 boot? > > That could be the reason. It happened to work in my CI test run, > but maybe I was just lucky. > > take care, > Gerd ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: PR fails due to OVMF time out 2023-04-02 18:23 ` Michael D Kinney @ 2023-04-03 8:21 ` Ard Biesheuvel 2023-04-03 10:37 ` [edk2-devel] " Gerd Hoffmann 0 siblings, 1 reply; 6+ messages in thread From: Ard Biesheuvel @ 2023-04-03 8:21 UTC (permalink / raw) To: Kinney, Michael D, Laszlo Ersek Cc: Gerd Hoffmann, Ni, Ray, devel@edk2.groups.io On Sun, 2 Apr 2023 at 20:23, Kinney, Michael D <michael.d.kinney@intel.com> wrote: > > Hi Gerd, > > I have investigated this failure with enabling -smp 4. I think this is an > important feature that should be on by default. > Agreed. And given your investigation below (thanks!), we should be able to revert the revert as long as we add '-accel tcg,thread-single', right? It would be nice if we could also get some coverage for KVM, but for the time being, this seems to be an improvement already (AFAIU, we want -smp 4 primarily for the logic used in the code, while there is very little code that would actually run concurrently, if any) -- Ard. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [edk2-devel] PR fails due to OVMF time out 2023-04-03 8:21 ` Ard Biesheuvel @ 2023-04-03 10:37 ` Gerd Hoffmann 2023-04-04 20:04 ` Michael D Kinney 0 siblings, 1 reply; 6+ messages in thread From: Gerd Hoffmann @ 2023-04-03 10:37 UTC (permalink / raw) To: devel, ardb; +Cc: Kinney, Michael D, Laszlo Ersek, Ni, Ray On Mon, Apr 03, 2023 at 10:21:27AM +0200, Ard Biesheuvel wrote: > On Sun, 2 Apr 2023 at 20:23, Kinney, Michael D > <michael.d.kinney@intel.com> wrote: > > > > Hi Gerd, > > > > I have investigated this failure with enabling -smp 4. I think this is an > > important feature that should be on by default. > > > > Agreed. And given your investigation below (thanks!), we should be > able to revert the revert as long as we add '-accel > tcg,thread-single', right? That is how I read it, yes. > It would be nice if we could also get some coverage for KVM, but for > the time being, this seems to be an improvement already (AFAIU, we > want -smp 4 primarily for the logic used in the code, while there is > very little code that would actually run concurrently, if any) Yes, it's to improve MpInit test coverage. Most code would continue to run single-threaded. take care, Gerd ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [edk2-devel] PR fails due to OVMF time out 2023-04-03 10:37 ` [edk2-devel] " Gerd Hoffmann @ 2023-04-04 20:04 ` Michael D Kinney 0 siblings, 0 replies; 6+ messages in thread From: Michael D Kinney @ 2023-04-04 20:04 UTC (permalink / raw) To: devel@edk2.groups.io, kraxel@redhat.com, ardb@kernel.org Cc: Laszlo Ersek, Ni, Ray, Kinney, Michael D Hi Gerd, Do you want to submit a new email review/PR with this suggested change? Thanks, Mike > -----Original Message----- > From: devel@edk2.groups.io <devel@edk2.groups.io> On Behalf Of Gerd Hoffmann > Sent: Monday, April 3, 2023 3:38 AM > To: devel@edk2.groups.io; ardb@kernel.org > Cc: Kinney, Michael D <michael.d.kinney@intel.com>; Laszlo Ersek <lersek@redhat.com>; Ni, Ray <ray.ni@intel.com> > Subject: Re: [edk2-devel] PR fails due to OVMF time out > > On Mon, Apr 03, 2023 at 10:21:27AM +0200, Ard Biesheuvel wrote: > > On Sun, 2 Apr 2023 at 20:23, Kinney, Michael D > > <michael.d.kinney@intel.com> wrote: > > > > > > Hi Gerd, > > > > > > I have investigated this failure with enabling -smp 4. I think this is an > > > important feature that should be on by default. > > > > > > > Agreed. And given your investigation below (thanks!), we should be > > able to revert the revert as long as we add '-accel > > tcg,thread-single', right? > > That is how I read it, yes. > > > It would be nice if we could also get some coverage for KVM, but for > > the time being, this seems to be an improvement already (AFAIU, we > > want -smp 4 primarily for the logic used in the code, while there is > > very little code that would actually run concurrently, if any) > > Yes, it's to improve MpInit test coverage. Most code would continue to > run single-threaded. > > take care, > Gerd > > > > > ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-04-04 20:05 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-03-31 15:00 PR fails due to OVMF time out Ni, Ray 2023-03-31 19:46 ` Gerd Hoffmann 2023-04-02 18:23 ` Michael D Kinney 2023-04-03 8:21 ` Ard Biesheuvel 2023-04-03 10:37 ` [edk2-devel] " Gerd Hoffmann 2023-04-04 20:04 ` Michael D Kinney
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox