From: "Gerd Hoffmann" <kraxel@redhat.com>
To: Michael Brown <mcb30@ipxe.org>
Cc: devel@edk2.groups.io, Oliver Steffen <osteffen@redhat.com>,
Jiewen Yao <jiewen.yao@intel.com>,
Ard Biesheuvel <ardb+tianocore@kernel.org>,
Pawel Polawski <ppolawsk@redhat.com>,
Jordan Justen <jordan.l.justen@intel.com>,
Laszlo Ersek <lersek@redhat.com>
Subject: Re: [PATCH 1/1] OvmfPkg/NestedInterruptTplLib: replace ASSERT() with a warning logged.
Date: Fri, 28 Apr 2023 15:38:03 +0200 [thread overview]
Message-ID: <em2rxaumz2dymkq6m44g273gywt4jtx2obuaiogqbkidcfotbv@xs2gwgv67djp> (raw)
In-Reply-To: <01020187c79d5eef-190d7d28-3c30-442a-913b-4dae66b71839-000000@eu-west-1.amazonses.com>
On Fri, Apr 28, 2023 at 11:26:32AM +0000, Michael Brown wrote:
> On 28/04/2023 10:10, Gerd Hoffmann wrote:
> > OVMF can't guarantee that the ASSERT() doesn't happen. Misbehaving
> > EFI applications can trigger this. So log a warning instead and try
> > to continue.
> >
> > Reproducer: Fetch windows 11 22H2 iso image, boot it in qemu with OVMF.
> >
> > Traced to BootServices->Stall() being called with IPL=TPL_HIGH_LEVEL
> > and Interrupts /enabled/ while windows is booting.
>
> Do we know how the system ended up in a state of being at TPL_HIGH_LEVEL
> with interrupts enabled? This ought not to be possible: the code in EDK2
> will (as far as I can tell) always maintain the invariant that interrupts
> are disabled while at TPL_HIGH_LEVEL.
Not clear. Have not found anything in edk2 either. With a breakpoint
at BootServices->Stall (CoreStall function) I can see it being called
with IPL=TPL_HIGH_LEVEL, most likely from windows boot code, stack
traces point to locations in low memory where no edk2 code is located.
After running for a while (and probably loading files from cdrom to
memory) these calls suddenly start happen with interrupts enabled.
I suspect the windows boot loader does something fishy here, but I can't
proof it, I have not yet pinned down the exact location where interrupts
get enabled while running at IPL=TPL_HIGH_LEVEL (which is what I suspect
is happening, but of course this is not the only possible theory how
that ASSERT got triggered).
Not fully sure how to best continue debugging this, I don't think gdb
can set an watchpoint on eflags.if ...
> > --- a/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
> > +++ b/OvmfPkg/Library/NestedInterruptTplLib/Tpl.c
> > @@ -39,7 +39,15 @@ NestedInterruptRaiseTPL (
> > //
> > ASSERT (GetInterruptState () == FALSE);
> > InterruptedTPL = gBS->RaiseTPL (TPL_HIGH_LEVEL);
> > - ASSERT (InterruptedTPL < TPL_HIGH_LEVEL);
> > + if (InterruptedTPL >= TPL_HIGH_LEVEL) {
> > + DEBUG ((
> > + DEBUG_WARN,
> > + "%a: Called at IPL %d, trying to fixup and continue...\n",
> > + __func__,
> > + InterruptedTPL
> > + ));
> > + InterruptedTPL = TPL_HIGH_LEVEL - 1;
> > + }
>
> This workaround looks wrong to me: it will cause the subsequent call to
> NestedInterruptRestoreTPL() to drop the TPL back down to TPL_HIGH_LEVEL-1,
> which is lower than it would have been when the interrupt occurred. The
> completed hardware interrupt would then result in an overall change of TPL,
> which cannot be correct.
The idea was to set InterruptedTPL to the highest level which is allowed
to run with interrupts enabled and continue running with interrupts
enabled, hoping that things get back into normal once the next timer
interrupt arrives.
Which -- now that I'm thinking about it again -- is clearly bogus, we
are in the timer interrupt so IRQs have already been disabled at that
point, so the fixup idea doesn't make much sense.
So just leave InterruptedTPL alone and hope for the best?
take care,
Gerd
next prev parent reply other threads:[~2023-04-28 13:38 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-28 9:10 [PATCH 1/1] OvmfPkg/NestedInterruptTplLib: replace ASSERT() with a warning logged Gerd Hoffmann
2023-04-28 11:26 ` Michael Brown
2023-04-28 13:38 ` Gerd Hoffmann [this message]
2023-04-28 14:50 ` [edk2-devel] " Michael Brown
2023-05-03 6:27 ` Gerd Hoffmann
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-list from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=em2rxaumz2dymkq6m44g273gywt4jtx2obuaiogqbkidcfotbv@xs2gwgv67djp \
--to=devel@edk2.groups.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox