public inbox for devel@edk2.groups.io
From: "Michael D Kinney" <michael.d.kinney@intel.com>
To: 'Gerd Hoffmann' <kraxel@redhat.com>, "Ni, Ray" <ray.ni@intel.com>
Cc: "devel@edk2.groups.io" <devel@edk2.groups.io>,
	Ard Biesheuvel <ardb@kernel.org>,
	"Kinney, Michael D" <michael.d.kinney@intel.com>
Subject: Re: PR fails due to OVMF time out
Date: Sun, 2 Apr 2023 18:23:12 +0000
Message-ID: <CO1PR11MB49293D2949B04786EBEFA93CD28D9@CO1PR11MB4929.namprd11.prod.outlook.com>
In-Reply-To: <hhigti55nalqi2gzp6la23jdk5lfrji75iftprjbcy7zwfg775@yr6n5edvst5f>

Hi Gerd,

I have investigated this failure, which shows up when -smp 4 is enabled.
I think this is an important feature that should be on by default.

I did find that the results were inconsistent at 2 or 4 CPUs on my laptop,
but going to 8 or higher would fail consistently. I used -smp 32 for all the
testing below.
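
One way to put numbers on this kind of inconsistency is to repeat the
boot several times per CPU count and count how many runs finish before
the timeout. The following is a minimal sketch, assuming the boot is
launched as a child process that exits on its own when it succeeds. The
helper names and the stand-in command are hypothetical and are not taken
from the EDK II CI scripts.

```python
import subprocess
import sys
from typing import Sequence


def finishes_in_time(cmd: Sequence[str], timeout_s: float) -> bool:
    """Return True if cmd exits with status 0 before timeout_s elapses."""
    try:
        done = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising the timeout.
        return False
    return done.returncode == 0


def pass_count(cmd: Sequence[str], runs: int, timeout_s: float) -> int:
    """Run cmd `runs` times and count how many runs finished in time."""
    return sum(finishes_in_time(cmd, timeout_s) for _ in range(runs))


# Hypothetical stand-in for the real QEMU command line, so the sketch
# can be tried without firmware images.
STAND_IN_CMD = [sys.executable, "-c", "pass"]
```

Replacing STAND_IN_CMD with the real QEMU invocation and looping over
several -smp values would give a pass count per CPU count.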

First, this failure mode is only seen with SMM_REQUIRE=1 builds.  If I
set SMM_REQUIRE=0, then QEMU can boot with different smp settings.
This appears to be a QEMU MP SMM related issue.

I first tried going back in edk2 history, looking for a point where SMP
boot would work with the version of QEMU used by the EDK II CI agents,
which is 2021.5.5 (QEMU 6.0.0):
* https://github.com/tianocore/edk2/blob/fc00ff286a541c047b7d343e66ec10890b80d3ea/OvmfPkg/PlatformCI/.azurepipelines/Windows-VS2019.yml#L142

I went back in edk2 history to the 2017/2018 timeframe and the issue was
still present, and I recall testing this feature back then myself, so I
do not think it is related to edk2 changes.

I then did a binary search through the QEMU releases:
PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20170113.exe
PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20170808.exe    2.10.0-rc2
PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20171122.exe    2.11.0-rc2
PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20171211.exe    2.11.0-rc5
PASS: https://qemu.weilnetz.de/w64/2017/qemu-w64-setup-20171217.exe    2.11.0
FAIL: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180321.exe    2.12.0-rc0 - AioContextPolling - Unrelated failure
PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180404.exe    2.12.0-rc1
PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180711.exe    3.0.0-rc0
PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180807.exe    3.0.0-rc4
PASS: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20180815.exe    3.0.0 
FAIL: https://qemu.weilnetz.de/w64/2018/qemu-w64-setup-20181108.exe    3.1.0-rc0  - Long Delay at BDS Entry
FAIL: https://qemu.weilnetz.de/w64/2019/qemu-w64-setup-20181211.exe    3.1.0      - Long Delay at BDS Entry
FAIL: https://qemu.weilnetz.de/w64/2019/qemu-w64-setup-20190815.exe    4.1.0      - Long Delay at BDS Entry

It appears the failure was introduced in the transition from 3.0.0 -> 3.1.0.
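
The search over releases can be sketched as a plain bisection. This is
a minimal illustration, assuming the oldest release passes, the newest
fails, and the failure persists once it appears. The PASS/FAIL values
are the ones recorded in the list above. The 2.12.0-rc0 entry is left
out because its failure was unrelated, and the first entry is left out
because no version label was recorded for it. The function name is
hypothetical.

```python
from typing import Callable, Sequence


def first_failing(versions: Sequence[str],
                  passes: Callable[[str], bool]) -> str:
    """Bisect an ordered release list for the earliest failing release.

    Assumes versions[0] passes, versions[-1] fails, and every release
    after the first failing one fails as well.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if passes(versions[mid]):
            lo = mid
        else:
            hi = mid
    return versions[hi]


# Results recorded in the list above, oldest to newest (True = PASS).
RECORDED = {
    "2.10.0-rc2": True,
    "2.11.0-rc2": True,
    "2.11.0-rc5": True,
    "2.11.0": True,
    "2.12.0-rc1": True,
    "3.0.0-rc0": True,
    "3.0.0-rc4": True,
    "3.0.0": True,
    "3.1.0-rc0": False,
    "3.1.0": False,
    "4.1.0": False,
}

culprit = first_failing(list(RECORDED), RECORDED.__getitem__)
print(culprit)
```

In a real run, the lookup in RECORDED would be replaced by an actual
boot test of the given release.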

QEMU 3.1 introduced Multi-Threaded TCG for x86 CPUs
* https://www.qemu.org/2018/12/12/qemu-3-1-0/
* https://qemu.readthedocs.io/en/latest/devel/multi-thread-tcg.html#multi-threaded-tcg

The feature can be disabled with the following QEMU command line option:
* -accel tcg,thread=single
* https://doc.ycharbi.fr/fichiers/virtualisation/qemu/documentation/qemu-doc-3.1.0.html

When I apply -accel tcg,thread=single to the QEMU command line, QEMU 3.1.0 passes.
When I apply -accel tcg,thread=single to the QEMU command line, QEMU 6.0.0 passes.

I think we can create a new PR with: -smp 4 -accel tcg,thread=single
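
As an illustration of the proposed options, here is a minimal sketch of
a helper that produces the extra QEMU arguments. The function name is
hypothetical, and how the CI scripts actually assemble the QEMU command
line is not shown in this message. Only the -smp and
-accel tcg,thread=single options come from the text above.

```python
from typing import List


def workaround_args(cpus: int = 4,
                    single_threaded_tcg: bool = True) -> List[str]:
    """Build the extra QEMU arguments for an SMP boot.

    With single_threaded_tcg=True, Multi-Threaded TCG is disabled via
    -accel tcg,thread=single.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    args = ["-smp", str(cpus)]
    if single_threaded_tcg:
        args += ["-accel", "tcg,thread=single"]
    return args


print(" ".join(workaround_args()))
```

Keeping the workaround behind a flag makes it easy to drop once the
underlying Multi-Threaded TCG issue is understood.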

I consider this a workaround, and we should investigate why Multi-Threaded
TCG breaks QEMU MP SMM.

Best regards,

Mike

> -----Original Message-----
> From: 'Gerd Hoffmann' <kraxel@redhat.com>
> Sent: Friday, March 31, 2023 12:47 PM
> To: Ni, Ray <ray.ni@intel.com>
> Cc: devel@edk2.groups.io; Kinney, Michael D <michael.d.kinney@intel.com>; Ard Biesheuvel <ardb@kernel.org>
> Subject: Re: PR fails due to OVMF time out
> 
> On Fri, Mar 31, 2023 at 03:00:37PM +0000, Ni, Ray wrote:
> > Hi,
> > I found several of my PRs cannot pass the CI due to OVMF boot timeout.
> > e.g.: Mktme fix by niruiyu · Pull Request #4072 · tianocore/edk2
> (github.com)<https://github.com/tianocore/edk2/pull/4072>
> >
> > I also saw that one PR from Ard (X64 text reloc fixes by ardbiesheuvel · Pull Request #4216 · tianocore/edk2
> (github.com)<https://github.com/tianocore/edk2/pull/4216>) failed as well.
> >
> > But I remember Mike increased the boot timeout from 1min to 2 mins.
> > Why does OVMF boot require more time than before?
> >
> > Is that because Gerd enabled the SMP 4 boot?
> 
> That could be the reason.  It happened to work in my CI test run,
> but maybe I was just lucky.
> 
> take care,
>   Gerd



Thread overview: 6+ messages
2023-03-31 15:00 PR fails due to OVMF time out Ni, Ray
2023-03-31 19:46 ` Gerd Hoffmann
2023-04-02 18:23   ` Michael D Kinney [this message]
2023-04-03  8:21     ` Ard Biesheuvel
2023-04-03 10:37       ` [edk2-devel] " Gerd Hoffmann
2023-04-04 20:04         ` Michael D Kinney
